AUTOMATIC DETECTION OF CONTRADICTIONS IN REQUIREMENTS
SPECIFICATIONS
submitted by
M.Sc.
Alexander Elenga Gärtner
to Faculty V, Mechanical Engineering and Transport Systems,
of the Technische Universität Berlin
for the award of the academic degree
Doktor der Ingenieurwissenschaften
- Dr.-Ing. -
approved dissertation
Doctoral committee:
Chair: Prof. Dr.-Ing. Rainer Stark
First reviewer: Prof. Dr.-Ing. Dietmar Göhlich
Second reviewer: Prof. Dr.-Ing. Beate Bender
Date of the scientific defense: 10 June 2024
Berlin 2024
Acknowledgments
This cumulative dissertation was written during my employment at IAV GmbH and
my doctoral studies at the Chair of Methods for Product Development and Mechatronics (MPM) at
TU Berlin.
First, I would like to express my sincere thanks to Professor Göhlich, my doctoral supervisor,
for supervising this work. His trust in me and his technical expertise shaped my doctorate
decisively, as did his supportive mentoring.
My heartfelt thanks also go to Professor Bender for her willingness to act as second reviewer
of this dissertation, and to Professor Stark for chairing the examination committee.
Special thanks are due to my mother and Cansu, who always stood by my side during my
doctoral studies and supported me under the double workload. Without their understanding,
I could never have sustained this workload.
I would also like to expressly thank Grigorii Gerdzhikov for our lively and regular
exchange. Our conversations spurred me on week after week to deliver results
continuously. Dr. Felix Matthies deserves my thanks for his personal support and the
valuable tips he gave me.
I thank Dr. Tu-Anh Fay for her comprehensive introduction to scientific ways of working
and thinking.
Furthermore, I would like to thank all my friends and my family. You always kept me
motivated.
Berlin, June 2024 Alexander Elenga Gärtner
Abstract
Alexander Elenga Gärtner
Automatic Detection of Contradictions in Requirements Specifications
Technische Universität Berlin Faculty of Mechanical Engineering and Transport Systems
Chair of Methods for Product Development and Mechatronics
June 2024
Defining a complete, unambiguous, and contradiction-free target system is essential in product
development. When developing a product or system, comprehensive requirements detail the
desired functionalities and characteristics. However, these documents often contain errors in
the form of contradictions. This occurrence can be attributed to the inherent complexity of the
requirements specifications, encompassing several thousand individual requirements derived
through interdisciplinary collaboration. Furthermore, these requirements are articulated in
natural language, which introduces an additional layer of complexity due to potential
ambiguities and inconsistencies arising from the diverse perspectives and interpretations of
the multiple stakeholders involved.
Identifying and rectifying these contradictions manually presents considerable challenges and
drawbacks. The manual correction process is time-consuming and incurs high costs due to the
extensive effort required to inspect and analyze each requirement to detect potential
contradictions. Moreover, failing to address these contradictions in time can
give rise to complications in subsequent stages of product development. Failure to resolve
contradictory requirements can lead to misunderstandings, conflicts, and inefficiencies during
the design, implementation, and testing phases, ultimately hindering the successful realization
of the intended product or system.
Thus, the detection and resolution of contradictions in complex requirements documents
emerge as a critical area of research and development within the field of product development.
By creating automated methods and tools to identify and address these contradictions, it
becomes possible to mitigate the costs, complexities, and risks associated with manual error
correction. Additionally, automating this process enables the timely identification of
contradictions, allowing for their resolution during the early stages of product development,
thereby preventing potential complications and improving the overall efficiency and
effectiveness of the development process.
This work addresses this issue by discussing scientific theories regarding contradictions and
providing a comprehensive definition of what they are and how they can occur in requirements
documents. Drawing from the Aristotelian law of non-contradiction, specific subtypes of
contradictions within the requirements engineering context are defined, each exhibiting a
different level of criticality. The proposed methodology formalizes requirement sentences
by extracting conditions and effects, as well as variables and actions, resulting in the
development of a taxonomy. Analyzing these constituent elements with specific yet simple
questions makes it possible to detect the subtypes of contradictions.
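As a rough illustration of the formalization idea described above, the following minimal sketch splits a requirement sentence into a condition and an effect, and the effect into a variable and an action. It then asks three simple questions about a pair of requirements. This is not the method developed in this dissertation: the sentence pattern, the names `FormalRequirement`, `formalize`, and `may_contradict`, and the example sentences are all assumptions made purely for illustration, and the check only flags candidates for closer inspection.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormalRequirement:
    """Hypothetical container for the constituents named in the text."""
    condition: Optional[str]  # None if the requirement is unconditional
    variable: str             # what the effect acts on, e.g. "the light"
    action: str               # what happens to it, e.g. "be switched on"


# Assumed toy pattern: "[If|When <condition>, [then]] <variable> shall|must <action>."
# Real requirements are far more varied than this pattern allows.
_PATTERN = re.compile(
    r"^(?:(?:if|when)\s+(?P<condition>.+?),\s*(?:then\s+)?)?"
    r"(?P<variable>.+?)\s+(?:shall|must)\s+(?P<action>.+?)\.?$",
    re.IGNORECASE,
)


def formalize(sentence: str) -> FormalRequirement:
    """Extract condition, variable, and action from one requirement sentence."""
    match = _PATTERN.match(sentence.strip())
    if match is None:
        raise ValueError(f"Sentence does not fit the toy pattern: {sentence!r}")
    condition = match.group("condition")
    return FormalRequirement(
        condition=condition.lower() if condition else None,
        variable=match.group("variable").lower(),
        action=match.group("action").lower(),
    )


def may_contradict(a: FormalRequirement, b: FormalRequirement) -> bool:
    """Ask three simple questions about a pair of formalized requirements.

    Returns True only for a *candidate* contradiction: differing action
    strings do not prove that the actions are logically incompatible.
    """
    same_condition = a.condition == b.condition
    same_variable = a.variable == b.variable
    different_action = a.action != b.action
    return same_condition and same_variable and different_action


if __name__ == "__main__":
    r1 = formalize("If the door is open, the light shall be switched on.")
    r2 = formalize("If the door is open, the light shall be switched off.")
    print(r1)
    print(may_contradict(r1, r2))
```

The exact string comparison is deliberately naive. Deciding whether two conditions can hold at the same time, or whether two actions really exclude each other, requires the kind of linguistic and logical analysis that the remainder of this work is concerned with.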
To accomplish this objective, natural language processing methods based on grammatical
rules, machine learning, and deep learning models have been applied and thoroughly
analyzed. Consequently, a new method, ALICE (Automated Logic for the Identification of
Contradictions in Engineering), has been proposed. ALICE combines the logical analysis
capabilities of symbolic AI with the data-driven approach of large language models (LLMs), leading to the accurate
identification and classification of contradictions. The findings also demonstrate that this
approach outperforms the sole utilization of either symbolic AI or LLMs. As a result, ALICE
contributes to improved product development outcomes.
In conclusion, this work paves the way for future research endeavors in the field of automated
requirements engineering by offering a solution and proof-of-concept for detecting
contradictions within complex requirements documents. By enabling the automated
identification of contradictions, this research aims to enhance the overall quality of
requirements, thereby fostering more precise communication, increased consistency, and
reduced conflicts in product development.
Zusammenfassung (Summary)
Alexander Elenga Gärtner
Automatic Detection of Contradictions in Requirements Specifications
Technische Universität Berlin Faculty of Mechanical Engineering and Transport Systems
Chair of Methods for Product Development and Mechatronics
June 2024
In product development, defining a complete, unambiguous, and contradiction-free target
system is of crucial importance. During product development, extensive requirements
documents are created to describe the desired functionalities and characteristics in detail.
However, these documents often contain errors in the form of contradictions. This is due to
the inherent complexity of the documents, as they comprise thousands of individual
requirements and arise from interdisciplinary collaboration. Formulating them in natural
language introduces additional complexity through potential ambiguities and inconsistencies
stemming from the different perspectives and interpretations of the stakeholders involved.
Identifying and resolving these contradictions manually is time-consuming and costly. The
correction process requires a thorough review and analysis of every requirement to detect
potential contradictions. Unresolved contradictions can lead to complications in later phases
of product development, which can cause misunderstandings, conflicts, and inefficiencies and
ultimately hinder the successful realization of the product or system.
The detection and resolution of contradictions in complex requirements documents is
therefore a critical field of research and development. By introducing automated methods
and tools for identifying and resolving these contradictions, the costs, complexities, and
risks associated with manual error correction can be minimized. In addition, automating this
task enables the timely identification of contradictions and their resolution in the early
phases of product development. This avoids potential complications and improves the overall
efficiency and effectiveness of the development process.
This work discusses scientific theories on contradictions and offers a comprehensive
definition of the types of contradictions in requirements documents. With reference to the
Aristotelian law of non-contradiction, specific subtypes of contradictions with differing
criticality are identified for requirements engineering. The proposed methodology comprises
the formalization of requirements by extracting conditions, effects, variables, and actions,
which leads to the development of a taxonomy. Analyzing these constituents by means of
simple questions makes the detection of contradictions possible.
To achieve this goal, natural language processing methods based on grammatical rules,
machine learning, and deep learning models were analyzed and applied. This led to the
development of a new approach called ALICE (Automated Logic for the Identification of
Contradictions in Engineering) for detecting contradictions in requirements. ALICE combines
the analytical capabilities of symbolic AI with the data-driven approach of LLMs, which
leads to an accurate identification and classification of contradictions. The results show
that this approach outperforms the sole use of symbolic AI or LLMs for detecting
contradictions. ALICE consequently contributes to an improvement of the product development
process.
In summary, this work paves the way for future research activities in requirements
engineering by offering an automated solution and a proof of concept for detecting
contradictions in complex requirements documents. Through the automated identification of
contradictions, this research aims to improve the overall quality of requirements, leading
to more effective communication, increased consistency, and reduced conflicts in product
development.
Table of Contents
ABSTRACT
ZUSAMMENFASSUNG
TABLE OF CONTENTS
LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
1 INTRODUCTION AND MOTIVATION
1.1 MOTIVATION
1.2 CONCEPT AND HOW-TO-READ
2 THEORETICAL EMBEDDING
2.1 KNOWLEDGE: FROM KANT TO DIKW AND NLP
2.2 WISDOM: ONTOLOGIES OR LLMS?
2.2.1 Ontologies
2.2.2 Large Language Models in Knowledge Representation
2.2.3 Conclusion
2.3 PROPOSITIONAL LOGIC
3 FUNDAMENTAL RESEARCH ON DETECTING CONTRADICTIONS IN REQUIREMENTS: TAXONOMY AND SEMI-AUTOMATED APPROACH
3.1 ABSTRACT
3.2 INTRODUCTION
3.2.1 Problem
3.2.2 Contribution
3.3 FUNDAMENTALS
3.3.1 Formulation and building blocks
3.3.2 Contradictions
3.4 RELATED WORK
3.4.1 Classification of conflicts
3.4.2 Natural Language Processing for detecting conflicts
3.4.3 Ontologies for detecting conflicts
3.5 METHOD FOR DETECTING CONTRADICTIONS
3.5.1 Nomenclature
3.5.2 Contradictions Subcategories
3.5.3 Process
3.6 MATERIALS AND RESULTS
3.6.1 Materials
3.6.2 Results
3.7 DISCUSSION
3.8 CONCLUSION
3.9 OTHER
4 AUTOMATED CONDITION DETECTION IN REQUIREMENTS ENGINEERING
4.1 ABSTRACT
4.2 INTRODUCTION
4.3 TERMINOLOGY
4.4 RELATED WORK
4.5 STUDY
4.5.1 Method
4.5.2 Data
4.6 RESULTS
4.6.1 Dataset 1
4.6.2 Dataset 2
4.6.3 Detection of constituents
4.6.4 Discussion of Validity
4.7 SUMMARY, CONCLUSION AND OUTLOOK
4.8 AUTHOR CONTRIBUTIONS
4.9 ACKNOWLEDGMENTS
5 AUTOMATED REQUIREMENT CONTRADICTION DETECTION THROUGH FORMAL LOGIC AND LLMS
5.1 ABSTRACT
5.2 INTRODUCTION
5.3 FUNDAMENTALS
5.3.1 Contradictions
5.3.2 Theoretical Background for NLP
5.4 RELATED WORK
5.4.1 Classifying Conflicts
5.4.2 Natural Language Processing for Detecting Contradictions
5.5 AUTOMATION METHOD
5.5.1 Method Fundamentals
5.5.2 Preprocessing
5.5.3 Questions
5.6 VALIDATION AND RESULTS
5.6.1 Data Analysis and Methodology
5.6.2 Results for Dataset 1
5.6.3 Results for Dataset 2
5.6.4 Application to large datasets
5.6.5 LLM Comparison
5.6.6 Criticality Assessment
5.6.7 Conclusion
5.7 LIMITS
5.7.1 Limits of LLMs for Contradiction Detection
5.7.2 Limits of Formal Logic
5.7.3 Threats to Validity
5.8 CONCLUSION AND OUTLOOK
5.9 APPENDIX
5.10 AUTHOR CONTRIBUTIONS
5.11 DATA AVAILABILITY
5.12 ACKNOWLEDGMENTS
5.13 COMPETING INTERESTS
6 DISCUSSION
6.1 COMPARISON OF THE RESULTS WITH THE LITERATURE
6.1.1 Taxonomy
6.1.2 Preprocessing and Part-of-Speech tagging:
6.1.3 Interpretation Part 1
6.1.4 Interpretation Part 2
6.1.5 Conclusion
6.2 FROM METHOD TO MECHANISM: ENVISIONING ALICE AS A TOOL
6.2.1 Operational Workflow
6.2.2 Usability and Integration
6.3 DEVELOPMENT PROCESSES INTEGRATION
6.3.1 Product Requirements Specification Process
6.3.2 Process Models and Guidelines for Product Development
6.4 LIMITATIONS AND THREATS TO VALIDITY
6.4.1 Limitations of the Methodology
6.4.2 Threats to Validity
7 SUMMARY AND OUTLOOK
7.1 SUMMARY
7.2 OUTLOOK
8 PUBLICATION BIBLIOGRAPHY
List of Abbreviations
ACC     Accuracy
ADVB    Adverbial
AI      Artificial Intelligence
ALICE   Automated Logic for the Identification of Contradictions in Engineering
API     Application Programming Interface
DIKW    Data, Information, Knowledge, Wisdom
kNN     k-Nearest Neighbor
LLM     Large Language Model
LNC     Law of non-Contradiction
ML      Machine Learning
NB      Naïve Bayes
NLI     Natural Language Inferencing
NLP     Natural Language Processing
PoC     Proof of Concept
POS     Part-of-Speech
RE      Requirements Engineering
REC     Recall
SC      Subordinate-Clause / Sub-Clause
SN      Sensitivity
TPR     True Positive Rate
List of Figures
Figure 1-1: Dilemma of product development (Ehrlenspiel and Meerkamm 2017) .................. 2
Figure 1-2: Contents and structure of this dissertation ............................................................ 4
Figure 2-1: DIKW pyramid (Ackoff 1989) ................................................................................ 6
Figure 2-2: DIKW and Contradiction Detection ....................................................................... 7
Figure 2-3: How to represent Ontologies (Sack and Alam 2020) ............................................ 8
Figure 2-4: Working with LLMs ............................................................................................... 9
Figure 2-5: Data, Information, and Knowledge (Awad 2003)................................................. 11
Figure 3-1: Formulation and building blocks ......................................................................... 14
List of Figures VIII
Figure 3-2: Square of opposition .......................................................................................... 15
Figure 3-3: Contradictions .................................................................................................... 17
Figure 3-4: Classification of contradictions ........................................................................... 18
Figure 3-5: Process overview ............................................................................................... 20
Figure 3-6: Building blocks for a Simplex Subaltern contradiction, consisting of two
requirements ........................................................................................................................ 22
Figure 3-7: Process for Simplex Subaltern ........................................................................... 22
Figure 3-8: Building blocks for an Alius Subaltern contradiction, consisting of two requirements
............................................................................................................................................. 23
Figure 3-9: Process for Alius Subaltern ................................................................................ 23
Figure 3-10: Building blocks for an Alius Contradictory contradiction, consisting of two
requirements ........................................................................................................................ 24
Figure 3-11: Process for Alius Contradictory ........................................................................ 25
Figure 3-12: Building blocks for an Alius Contrary contradiction, consisting of two requirements
............................................................................................................................................. 26
Figure 3-13: Process for Alius Contrary ................................................................................ 26
Figure 4-1: variable and action ............................................................................................. 31
Figure 4-2: code structure .................................................................................................... 32
Figure 4-3: confusion matrices for dataset 1 based on 313 requirements ............................. 34
Figure 4-4: confusion matrices for dataset 2 based on 300 requirements ............................. 35
Figure 4-5: condition/effect verbal expression detection ....................................................... 36
Figure 4-6: action/variable verbal expression detection ........................................................ 36
Figure 5-1: Contradictions relevant to RE ............................................................................. 40
Figure 5-2: Nine types of contradictions for RE (Gärtner et al. 2022) .................................... 43
Figure 5-3: Modular decision tree for contradiction detection in RE ...................................... 45
Figure 5-4: Showcase of Condition and Effect, as well as Variable and Action in a formal
requirement .......................................................................................................................... 46
Figure 5-5: LLM prompt - pseudo code ................................................................................. 50
Figure 5-6: GPT3 has enough knowledge to detect and explain the present contradiction ... 54
Figure 5-7: GPT3 is not able to properly detect the present contradictions. Firstly, the
sentences are not conditions but mere sentences, and secondly, although the conditions can
indeed be true at the same time, this would lead to contradicting effects. ............................. 55
Figure 5-8: Prompt for the first question ............................................................................... 58
Figure 5-9: Prompt for the third question .............................................................................. 59
Figure 5-10: Prompt for the fourth question .......................................................................... 59
Figure 5-11: Condition Detection with GPT-3 ....................................................................... 59
Figure 5-12: Prompt for the seventh question ....................................................................... 60
Figure 6-1: Procedure, based on Figure 2-2: DIKW and Contradiction Detection ................. 61
Figure 6-2: Different semantic decompositions: (1) - Preum et al. (2017), (2) - dissertation at
hand, (3) - Sarafraz (2011) ................................................................................................... 65
Figure 6-3: ALICE Workflow: Identifying conditions and contradictions in requirements
engineering .......................................................................................................................... 67
Figure 6-4: ALICE’s structure (Gärtner and Göhlich 2023) ................................................... 68
Figure 6-5: Product Requirements Specification Process based on (Bender 2020) .............. 71
Figure 6-6: Process model according to VDI 2221 (Göhlich et al. 2021) ............................... 72
Figure 6-7: Tasks when developing requirements (own illustration based on Bender and
Gericke (2021)) .................................................................................................................... 73
List of Tables
Table 1-1: Impairment and Success factors of project execution (The Standish Group 2014) 1
Table 1-2: Refinements from the first paper to the second and third paper ............................. 5
Table 2-1: Most basic connectives (Sack and Alam 2020) .................................................... 11
Table 2-2: How to model facts? (Sack and Alam 2020) ........................................................ 11
Table 3-1: Formalized contradictions .................................................................................... 19
Table 3-2: Distribution .......................................................................................................... 21
Table 5-1: extract from Dataset 1 ......................................................................................... 49
Table 5-2: Results for Reference Dataset in the form of confusion matrices ......................... 49
Table 5-3: confusion matrices for the first dataset ................................................................ 50
Table 5-4: Results for Dataset 2 in the form of confusion matrices ....................................... 52
Table 5-5: Different answers generated by GPT3, GPT3.5, GPT4 (22.03.2023) and LLaMA 53
Table 6-1: Possible cases of conflict (Preum et al. 2017) ..................................................... 63
Table 6-2: Possible cases of contradictions. ......................................................................... 64
Introduction and Motivation 1
1 Introduction and Motivation
This chapter introduces the dissertation. It begins by exploring the research motivation and
provides an overview of the key concepts. It then explains how to navigate and understand
the dissertation's structure.
1.1 Motivation
Requirements are fundamental to the development process of most products. They serve as
a vital means of communication between stakeholders involved in developing solutions,
especially when they are being developed by different individuals or groups simultaneously or
successively. Errors in requirements can have grave consequences, as illustrated by a 1993
study that identified errors in functional and interface requirements as the primary cause of
safety-related software errors in NASA's Voyager and Galileo spacecraft. Such errors can
potentially result in serious accidents (Lutz 1993).
Effective requirements management ensures that all stakeholders clearly understand the
desired product functionality, performance, and constraints. It helps align the development
efforts, minimizes misunderstandings, and reduces the risk of rework and project delays. This
is shown in Table 1-1, which presents the most common project impairment and success
factors, as identified by The Standish Group (2014). In requirements management, the
analysis reveals that incomplete requirements account for 13.1% of the cases, lack of user
involvement constitutes 12.4%, unrealistic expectations contribute to 9.9%, and changing
requirements and specifications represent 8.7%. Moreover, when examining the success
factors associated with requirements management, clear statements of requirements emerge
as a significant factor, accounting for 13.0% of the cases. Realistic expectations constitute
8.2% of the cases, and clear vision and objectives represent 2.9% of the successful outcomes.
Table 1-1: Impairment and Success factors of project execution (The Standish Group 2014)
Project Impaired Factors | % of Responses | Project Success Factors | % of Responses
1. Incomplete Requirements | 13.1% | 1. User Involvement | 15.9%
2. Lack of User Involvement | 12.4% | 2. Executive Management Support | 13.9%
3. Lack of Resources | 10.6% | 3. Clear Statement of Requirements | 13.0%
4. Unrealistic Expectations | 9.9% | 4. Proper Planning | 9.6%
5. Lack of Executive Support | 9.3% | 5. Realistic Expectation | 8.2%
6. Changing Requirements & Specifications | 8.7% | 6. Smaller Project Milestones | 7.7%
7. Lack of Planning | 8.1% | 7. Competent Staff | 7.2%
8. Didn't Need it Any Longer | 7.5% | 8. Ownership | 5.3%
9. Lack of IT Management | 6.2% | 9. Clear Vision & Objectives | 2.9%
10. Technology Illiteracy | 4.3% | 10. Hard-Working, Focused Staff | 2.4%
Other | 9.9% | Other | 13.9%
Legend:
Highlight: Factor directly or indirectly related to requirements management
Typically, requirements are documented in a specification sheet to capture all the requirements
for a development project. The primary objective of this document is to define a
comprehensive, unambiguous, and coherent target system without any contradictions.
However, the complexity and diversity of modern systems, such as mechatronic products,
necessitate distributed development across various systems and levels of aggregation of
product development. Consequently, the requirements for each system and level are typically
managed in separate documents, making it challenging to establish correlations between
them. Given these factors, it is unsurprising that contradictions and other inconsistencies are
often found in these documents (Dick 2017; Bender and Gericke 2021). This issue can lead to
a significant impact on project costs and timelines. Discovering and rectifying these issues late
in the development process often requires extensive rework, modifications, and additional
resources, leading to delays and increased expenses, as seen in Figure 1-1.
Additionally, the quality of the final product may be compromised, as conflicting requirements
can result in a system that fails to meet user needs effectively or lacks the desired functionality.
This can lead to customer dissatisfaction, potential reputational damage, and even financial
losses (Ehrlenspiel and Meerkamm 2017; Standish Group 1995; Giffin et al. 2009). Therefore,
it becomes clear that an automated method to detect such errors would be highly beneficial.
Figure 1-1: Dilemma of product development (Ehrlenspiel and Meerkamm 2017)
To this day, in most cases, requirements are expressed in natural language (Luisa et al. 2004),
which is why Natural Language Processing (NLP) plays a crucial role in handling large sets of
requirements documents. NLP is a subfield of artificial intelligence (AI) concerned with the
interaction between computers and humans using natural language. The development of NLP
can be traced back to the 1950s when the idea of using computers to analyze and understand
natural language was first proposed by Chomsky (1957). However, it was not until the 1980s that significant
progress was made when researchers developed various new techniques and algorithms for
analyzing and generating natural language. A significant breakthrough during this time was the
development of statistical machine translation systems, which used large amounts of data to
improve the quality of translations (Foote 2019). In the mid-2000s, researchers began to focus
on developing more sophisticated natural language understanding systems. These systems
were designed to extract meaning and context from text rather than simply analyzing its surface
features. A notable example of this was the development of sentiment analysis systems, which
could analyze the emotional content of a text (Mäntylä et al. 2018).
One of the most significant breakthroughs in recent years has been the emergence of Large
Language Models (LLMs), notably ChatGPT. Made possible by large datasets and powerful
computing resources, they have revolutionized the field of NLP. LLMs are machine
learning models trained on massive amounts of text data using deep neural networks. They
can generate human-like responses to natural language queries and translate text from one
language to another. The emergence of LLMs has led to significant advances in several NLP
applications, such as speech recognition, sentiment analysis, and machine translation
(Bowman 2023).
In summary, NLP methods make it possible to extract and parse textual information from
requirements documents, which helps to gain insights, detect conflicts, and improve
the accuracy and efficiency of requirement analysis processes.
1.2 Concept and How-to-Read
Several key concepts play a crucial role in the study of language and information processing
and serve as a common thread for this thesis: syntax, semantics, and taxonomy. Syntax refers
to the rules governing the structure of sentences and phrases in a language. It involves
understanding how words and phrases are arranged to form meaningful expressions. In
English, "[…] the main device for showing the relationship among words is word order"
(Britannica, The Editors of Encyclopaedia 2023). Semantics, in contrast, refers to the meaning
of words, phrases, and sentences (Duden 2023a). It involves understanding how words and
phrases convey meaning and how they relate to each other in a sentence. Another essential
concept is taxonomy, which refers to the segmentation and classification of linguistic units
(Duden 2023b). It involves creating a categorization system to group similar things based on
their properties.
Together, these concepts form the foundation of language and information processing. They
enable us to communicate effectively and efficiently, understand and organize information, and
create systems that reason and learn independently. The emergence of large language models
and other advanced technologies has dramatically expanded the potential of language and
information processing.
Other terms important in this context are thesauri and ontologies. A thesaurus helps to find
synonyms or antonyms. Ontologies define concepts and relationships in a domain (Biagetti
2020). Initially, the application of thesauri and ontologies was considered for the third paper.
However, following the outcomes detailed in section 2.2, Wisdom: Ontologies or LLMs, the
decision was made to employ LLMs instead.
This cumulative dissertation presents a comprehensive approach to detecting contradicting
requirements. It comprises three papers, prefaced by an introduction and encapsulated by a
discussion, as depicted in Figure 1-2. Section 1 introduces the topic, detailing the motivation
and conceptual framework, and explains how to navigate this dissertation. Section 2 delves into
various methodologies for representing knowledge.
The first publication in Section 3 establishes the various contradictions that may arise in
requirements documents, serving as the basis for subsequent publications. The second
publication in Section 4 introduces a crucial step toward automated contradiction detection,
offering a rule-based method for identifying conditionals and pseudo-grammatical elements
within requirements. The third publication in Section 5 incorporates LLMs to detect
contradictions between requirements by combining the strengths of both symbolic AI and
LLMs. While symbolic AI is adept at detecting conditions and their effects within requirements,
LLMs excel at understanding the nuances of natural language and identifying subtle
contradictions that may not be immediately apparent. Together, these two AIs form a powerful
tool to detect contradictions in requirements automatically. Importantly, this approach is
grounded in the theoretical foundation established in the first publication, which enables
systematically identifying and addressing the different types of contradictions with varying
levels of criticality.
Figure 1-2: Contents and structure of this dissertation
Paper 1: Syntax and semantics of requirements and taxonomy of contradicting requirements.
It establishes the theoretical framework for the dissertation and is presented in a
modular format (Gärtner et al. 2022).
Title: Fundamental research on detecting contradictions in requirements: Taxonomy
and semi-automated approach
Paper 2: Adjustment of the first paper’s taxonomy and automated syntax, semantics, and
taxonomy detection called constituent analysis. It introduces symbolic AI, mainly
based on grammatical rules, and is the foundation of the automation presented in the
third paper. The results form the basis of the formal logic used in the third paper
(Gärtner et al. 2023).
Title: Automated Condition Detection in Requirements Engineering
Paper 3: Extension of the symbolic AI from the second paper and integration of LLMs, resulting
in a successful, automated detection of contradicting requirements (Gärtner and
Göhlich 2024a).
Title: Automated Requirement Contradiction Detection through Formal Logic and
LLMs
Section 6 discusses the development prospects for a software tool named ALICE, an acronym
for Automated Logic for the Identification of Contradictions in Engineering. This exploration is
twofold: first, the feasibility of developing such a tool, and second, its potential applications in
the development process. Though not a component of this dissertation, the research
encompasses a proof-of-concept (PoC) for ALICE and a fourth paper published within the
scope of this work. The PoC is an executable Python script following the methodology this
research delineated. The additional paper focuses on practically implementing contradiction
detection within the development process. Both are discussed in this section.
To rapidly gain a comprehensive understanding of the dissertation, it is acceptable to focus on
the third paper, which offers a condensed overview of the theoretical definitions and processes
of the first and second papers in the form of a decision tree. However, to fully comprehend the
development of the decision tree and for possible future adaptations, it is recommended to read
the papers in order.
It should be noted that the terms (taxonomy) used in the first paper were modified for the
second and third papers based on practical findings. In order to prevent any possible confusion
when reading this dissertation, these modifications will also be outlined here (see Table 1-2),
in addition to being discussed in the second paper, where the definitions and rationale for the
modifications can be found.
Table 1-2: Refinements from the first paper to the second and third paper
Paper 1 | Paper 2 and Paper 3
Cause | Condition
Effect | Effect
Variable | Variable
Condition | Action
Theoretical Embedding 6
2 Theoretical Embedding
Section 2 presents a theoretical embedding of the underlying methods used in the papers.
Section 2.1 explores how knowledge is created, as requirements documents are essentially a
specific form of knowledge management that can be analyzed using various methods. These
findings lead to section 2.2, where two prominent approaches for this task are compared.
Finally, section 2.3 examines the theoretical basis of the formal syntax used in the papers.
2.1 Knowledge: From Kant to DIKW and NLP
The ancient Greek philosopher Aristotle, whose work on the logic of non-contradiction (MP: 1005b)
is the basis for the first paper, contributed significantly to the development of epistemology
(the theory of knowledge, German: Erkenntnistheorie). He believed that knowledge is acquired through
experience and observation (Hauser 2019). Immanuel Kant built upon the works of Aristotle to
develop his own epistemological framework in 1787. He argued that knowledge is a product
of both sensory experience and innate concepts, which he called 'categories.' He believed the
mind actively structures sensory data to produce knowledge (Kant 2011).
The DIKW framework, developed by Ackoff (1989), is a modern approach to knowledge
management that emphasizes the relationships between Data, Information, Knowledge, and
Wisdom. Although the works of Aristotle or Kant did not officially influence the framework, their
ideas on epistemology can be seen as its precursor, particularly Kant's notion of knowledge
consisting of hierarchical layers of sensory experience and innate concepts.
The DIKW framework represents a pyramid, where data forms the base, information builds
upon the data, knowledge builds upon information, and wisdom is at the top of the pyramid
(Rowley 2007), as seen in Figure 2-1:
Figure 2-1: DIKW pyramid (Ackoff 1989)
In the context of Natural Language Processing, the DIKW has become an essential framework
for understanding the processing of textual data. NLP techniques extract meaning and insights
from unstructured data such as text, speech, and images. Unlike structured data, which can
be organized in a tabular format with clearly defined fields and relationships, unstructured data
does not have a predetermined organization or standardized format. Text, speech, and images
often contain free-form expressions and visual content that do not follow a strict structure.
DIKW provides a structure for understanding the different levels of information that can be
derived from such data.
At the pyramid's base, NLP algorithms extract information from raw data, identifying individual
sentences and primary constituents such as subjects, verbs, or objects. 'The difference
between data and information is functional, not structural.' (Ackoff 1989).
Knowledge, the next level in the DIKW hierarchy, is derived from information by making
connections and drawing inferences (conclusions) between different pieces of information. In
NLP, this involves using techniques such as semantic analysis or machine learning to identify
patterns and relationships between different pieces of information, such as a sentence
condition.
Finally, at the top of the pyramid is wisdom, which represents the highest level of understanding
and insight that can be derived from data. Wisdom is not just about having access to
information or knowledge but about having the ability to apply that information and knowledge
in a meaningful way, which, in our case, would correspond to the realization that two
statements contradict each other. In NLP, wisdom can be achieved by developing intelligent
systems that can reason.
The development process and milestones for finding contradictions in requirements can be
mapped to the DIKW pyramid, as depicted in Figure 2-2.
Figure 2-2: DIKW and Contradiction Detection
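The mapping in Figure 2-2 can be illustrated with a minimal Python sketch that walks one toy example up the pyramid. This is not the ALICE implementation or the rule set from the papers: the regular expressions, the fixed 'If <condition>, then <effect>.' sentence pattern, the naive 'not' check, and all function names are assumptions chosen purely for illustration.

```python
import re


def to_sentences(text):
    """Data -> Information: split a raw character string into sentences."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def to_condition_effect(sentence):
    """Information -> Knowledge: split 'If <condition>, then <effect>.'

    Returns a (condition, effect) tuple in lower case, or None if the
    sentence does not follow the assumed pattern.
    """
    match = re.match(r"(?i)^if\s+(.+?),\s*(?:then\s+)?(.+?)[.!?]?$", sentence)
    if match is None:
        return None
    return match.group(1).strip().lower(), match.group(2).strip().lower()


def strip_negation(effect):
    """Return the effect without the word 'not' and whether 'not' was present."""
    words = effect.split()
    negated = "not" in words
    return " ".join(w for w in words if w != "not"), negated


def contradicts(req_a, req_b):
    """Knowledge -> Wisdom (toy rule): same condition, effects differ only by 'not'."""
    ce_a, ce_b = to_condition_effect(req_a), to_condition_effect(req_b)
    if ce_a is None or ce_b is None or ce_a[0] != ce_b[0]:
        return False
    core_a, neg_a = strip_negation(ce_a[1])
    core_b, neg_b = strip_negation(ce_b[1])
    return core_a == core_b and neg_a != neg_b


text = ("If the door is open, then the light shall be on. "
        "If the door is open, then the light shall not be on.")
sentences = to_sentences(text)
print(sentences)
print(to_condition_effect(sentences[0]))
print(contradicts(sentences[0], sentences[1]))
```

The lower three functions are simple and rule-based, while the final check is the brittle part: it only catches contradictions that differ by a literal 'not', which hints at why the top layer of the pyramid is the hard one.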
2.2 Wisdom: Ontologies or LLMs?
As mentioned earlier, two methods will be compared for their ability to detect contradictions.
Ontologies, also referred to as knowledge graphs, are a proven method to handle knowledge
in natural language processing. However, in recent years, LLMs have gained significant
attention in this field. In the following section, both approaches will be discussed to understand
why, in this dissertation, LLMs were ultimately chosen to detect contradictions in requirements
documents.
2.2.1 Ontologies
Ontologies are vital to knowledge representation in computer science and artificial intelligence.
They are a formal, explicit specification of a shared conceptualization. In other words, an
ontology defines a set of concepts and their relationships to each other in a structured,
organized manner. It is used to represent knowledge in a machine-readable form so that computers
can understand and reason about the relationships between concepts (Gruber 1993; Uschold and
Gruninger 1996).
A benefit of using ontologies is that they enable automated reasoning, which can lead to the
acquisition of wisdom. For example, an ontology could represent the relationships between
different functions and their effects on different components. Automated reasoning tools could
then be used to identify potential conflicts based on the known effects of the functions. Figure
2-3 shows an example from KIT (Sack and Alam 2020) of a simple ontology on a scientific
article about the effects of air pollution.
[Figure 2-2, layers from bottom to top: Data (characters, strings); Information (grammar, sentences); Knowledge (action/variable, condition/effect); Wisdom (contradiction detection between sentences)]
Figure 2-3: How to represent Ontologies (Sack and Alam 2020)
Ontologies serve as a foundational concept in knowledge representation across diverse
domains. Their applications extend to various fields, from healthcare to finance, manufacturing,
and natural language processing. For instance, in the healthcare domain, ontologies aid in
standardizing medical terminology, enabling more accurate diagnoses and treatment
recommendations (Dimitrieski et al. 2016). In finance, ontologies are utilized for risk
assessment and fraud detection by representing complex financial relationships (Kingston et
al. 2004). In manufacturing, supply chain management optimization benefits from
the structured representation of product and process information (Scheuermann and Leukel
2014). In natural language processing, ontologies facilitate the extraction of semantic meaning
from text, enhancing information retrieval and understanding (Estival et al. 2004).
One of the fundamental advantages of ontologies is their ability to enable automated
reasoning. Reasoning engines utilize ontologies to infer new knowledge and detect
inconsistencies. This process enhances decision-making by providing logical and context-
aware insights. For example, in a healthcare ontology, automated reasoning can help identify
potential drug interactions based on known interactions between drug components and patient
medical histories (Naz et al. 2020). Similarly, reasoning engines can optimize production
processes in industrial applications by identifying conflicting constraints in heterogeneous
databases (Ram and Park 2004; Nguyen 2007), i.e., different specification documents.
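As a rough illustration of how a reasoner can infer new knowledge and flag conflicts, the following sketch stores a tiny ontology as subject-predicate-object triples. It is a simplified stand-in, not a real reasoning engine or ontology language: the triples, the predicate names `is_a`, `increases`, and `decreases`, and both functions are hypothetical.

```python
from itertools import combinations

# Hypothetical triples (subject, predicate, object); all names are illustrative.
TRIPLES = {
    ("seat_heating", "is_a", "heating_function"),
    ("heating_function", "increases", "cabin_temperature"),
    ("air_conditioning", "decreases", "cabin_temperature"),
    ("wiper", "increases", "windshield_visibility"),
}

# Effect predicates and their opposites.
OPPOSITE = {"increases": "decreases", "decreases": "increases"}


def infer_inherited_effects(triples):
    """Infer new triples: a function inherits the effects of its 'is_a' parent."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for child, pred, parent in list(inferred):
            if pred != "is_a":
                continue
            for subj, effect, component in list(inferred):
                new = (child, effect, component)
                if subj == parent and effect in OPPOSITE and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


def find_conflicts(triples):
    """Return (function_a, function_b, component) for opposite effects on one component."""
    effects = sorted(t for t in triples if t[1] in OPPOSITE)
    conflicts = []
    for (f1, e1, c1), (f2, e2, c2) in combinations(effects, 2):
        if c1 == c2 and OPPOSITE[e1] == e2:
            conflicts.append((f1, f2, c1))
    return conflicts


knowledge = infer_inherited_effects(TRIPLES)
for conflict in find_conflicts(knowledge):
    print(conflict)
```

Note that the conflict between `seat_heating` and `air_conditioning` is only found because the inherited effect was inferred first. Every such relation still has to be modeled by hand, which is the maintenance burden discussed in this section.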
Ontology construction typically follows one of three approaches: manual construction,
cooperative construction (involving human intervention), and (semi-)automatic construction
(Al-Aswadi et al. 2020). The latter is also known as ontology learning and refers to the process
of creating an ontology in an automatic or semi-automatic way with limited human effort (Gillani
Andleeb 2015). For instance, text mining algorithms can extract domain-specific concepts and
their relationships from large textual corpora.
As ontologies grow in size and complexity, scalability and performance become critical
considerations. This becomes especially important when combining or interoperating different
ontologies. Strategies like mapping distributed ontologies and optimization techniques address
these challenges (Maedche et al. 2002).
However, using ontologies is challenging (Simperl and Tempich 2006; Ontology learning and
population 2008), notably because of the complexity of building and maintaining them for large
amounts of data. According to Wong et al. (2012) and Al-Aswadi et al. (2020), achieving fully
automatic ontology construction remains a formidable challenge and is unlikely to be attainable at all.
2.2.2 Large Language Models in Knowledge Representation
Large Language Models (LLMs) have emerged as powerful tools for knowledge representation
and understanding. They are capable of generating human-like language and performing
various tasks such as translation and text summarization while displaying creativity in their
output (Jun 2023). However, as pointed out by several authors, including Zodhya (2023) and
Saenko (2023), the training process consumes substantial energy, in the range of 1000 to
1300 MWh. To put this in perspective, Germany had a per capita electricity consumption of
approximately 7 MWh, with a total electricity consumption of 547 TWh in 2022 (Pawlik 2022).
Such a training run therefore corresponds to the annual electricity consumption of roughly
140 to 190 people.
Sequential in nature, textual data relies heavily on the order of words and their contextual
relationships within a sentence to convey complete meaning. Traditional unsupervised
machine learning models overlook this inherent word order and context and grapple with the
constraints of fixed input sizes that are often relatively small. This prompts the rationale for
embracing computationally deep approaches when dealing with textual data (Acheampong et
al. 2021). LLMs attain their capabilities by using vast amounts of data, enabling them to apply
billions of parameters during their training phase. By repeating prediction tasks during the initial
training, such as successively identifying each next word in the text given the previous words,
and learning from its mistakes, the model learns to recognize similar textual contexts. In this way,
a general knowledge of a language and the world is accumulated. This knowledge can then be
deployed for tasks of interest, such as question answering or text classification (Manning
2022). Looking at this, one could argue that LLMs may effectively exhibit inherent ontologies
of world knowledge.
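The idea of next-word prediction can be illustrated with a deliberately tiny, count-based bigram model. This is only an analogy: real LLMs learn billions of neural network parameters by gradient descent and condition on long contexts, whereas this sketch merely counts which word follows which. The corpus and function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for every word, how often each other word directly follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical miniature training corpus.
corpus = [
    "the system shall open the door",
    "the system shall close the door",
    "the system shall open the window",
]
model = train_bigram(corpus)
print(predict_next(model, "system"))
print(predict_next(model, "shall"))
```

Even this toy model picks up a regularity of requirement language ('system' is followed by 'shall'), which gives an intuition for how statistical regularities in text can come to resemble implicit world knowledge at a much larger scale.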
This makes them valuable for chatbots, virtual assistants, and sentiment analysis tasks. They
enable computers to comprehend and respond to natural language queries, bridging the gap
between humans and machines (Kasneci et al. 2023; Acheampong et al. 2021). Furthermore,
LLMs can extract structured information from unstructured text, facilitating converting textual
data into structured knowledge. This is particularly useful for data mining, content
summarization, and knowledge base population.
Beyond their technical applications, LLMs excel in non-technical use cases as well. For
example, they are powerful translation systems, enabling seamless communication across
multiple languages. They can automatically translate text from one language to another while
preserving context and meaning. Another interesting aspect is their capability to enhance
search engines by enabling semantic search, which goes beyond keyword matching to
understand the user's intent and context. This results in more relevant search results and
improved user experiences. Finally, LLMs can generate human-like text, making them valuable
for content-generation tasks such as automated article writing, creative writing assistance, and
code generation (Acheampong et al. 2021).
LLMs come in various architectures and sizes, each designed for specific tasks and scalability.
Some prominent types include Transformer Models, Recurrent Models, and Convolutional
Models. Since 2018, transformers like Google's BERT, introduced in 2018 (Devlin et al.),
OpenAI's GPT3, introduced in 2020 (openai), and Meta's LLaMA, introduced in 2023 (Touvron
et al.), have in many cases replaced convolutional and recurrent neural networks, previously the
most popular types of deep learning models for NLP applications (Merritt 2022; Manning 2022).
Transformer Models have gained widespread popularity for their ability to handle various tasks
related to understanding natural language. A transformer model is a neural network that learns
context and thus meaning by tracking relationships in sequential data like the words in this
sentence. They use attention mechanisms to capture context and relationships between words
in a text, are pre-trained on vast amounts of text data, and can generate coherent and natural
language (Merritt 2022; Manning 2022).
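The attention mechanism mentioned above is commonly formulated as scaled dot-product attention, softmax(QK^T / sqrt(d)) V. The text does not specify a formula, so the following pure-Python sketch uses this standard formulation from the transformer literature as an assumption; the toy vectors are invented and no learned weights are involved.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


# Toy example: one query that resembles the first key more than the second.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
print(attention(queries, keys, values))
```

Each output is a weighted mix of all value vectors, so every word representation can draw on every other word in the sequence. This is how context is captured without processing the words strictly one after another.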
Such models are designed to process natural language text, which is both the input and output
of the model, see Figure 2-4. The input text is tokenized, dividing it into individual words or
sub-words, and then fed into the model as a sequence. The model processes the input
sequence and generates an output sequence of words or sub-words that form a coherent text.
The output can be further processed or used for downstream NLP tasks such as language
translation or text summarization.
Figure 2-4: Working with LLMs
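The tokenization step described above can be sketched as follows. Production LLMs use learned sub-word vocabularies (for example byte-pair encoding or WordPiece); the greedy longest-match rule and the tiny vocabulary below are simplifying assumptions for illustration only.

```python
import re

# Hypothetical sub-word vocabulary; real vocabularies are learned from data.
VOCAB = {"require", "ment", "s", "contradict", "ion", "the", "detect"}


def split_words(text):
    """Split lower-cased text into alphanumeric words and single punctuation marks."""
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())


def split_subwords(word, vocab):
    """Greedy longest-match split; unknown parts fall back to single characters."""
    pieces, start = [], 0
    while start < len(word):
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:  # no vocabulary entry matched at this position
            pieces.append(word[start])
            start += 1
    return pieces


def tokenize(text, vocab=VOCAB):
    """Turn a text into the flat token sequence that would be fed to a model."""
    return [piece for word in split_words(text) for piece in split_subwords(word, vocab)]


print(tokenize("Detect the contradictions."))
print(tokenize("Requirements"))
```

Splitting rare words into known pieces keeps the vocabulary small while still covering unseen words, which is why sub-word units are used instead of whole words.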
Despite their remarkable capabilities, LLMs have their fair share of challenges and limitations.
Firstly, they can inherit biases from their training data, potentially resulting in biased or unfair
outputs. For example, if a model is trained on data biased towards certain groups of people, it
may produce unfair or discriminatory results towards those groups. Secondly, the training and
operation of LLMs demand significant computational resources, rendering them inaccessible
to many researchers and organizations. To address this challenge, using pre-trained models
and cloud technology in combination with cooperative schemes for usage in partnership with
institutions and companies can serve as a starting point. Lastly, the models raise ethical
concerns, particularly about privacy, misinformation, and potentially generating harmful
content (Kasneci et al. 2023; Merritt 2022).
The field of LLMs continues to evolve rapidly, with several noteworthy trends on the horizon.
Firstly, multimodal models capable of handling both text and other modalities like images and
audio are gaining traction, opening up new possibilities for knowledge representation and
understanding. Furthermore, researchers are actively exploring fine-tuning and transfer
learning techniques to adapt pre-trained models for specific tasks, reducing the need for
extensive training data. Lastly, there is a growing emphasis on ethical AI within the LLM
domain, with concerted efforts to address bias, ensure fairness, and enhance transparency in
their systems.
Large Language Models have revolutionized knowledge representation and natural language
understanding. They show potential in detecting contradictions in text thanks to their ability to
capture complex linguistic patterns and relationships between words and phrases, having an
inherent ontology of trivial knowledge. However, they have yet to be trained on requirements
specifications, which are generally not publicly available. Consequently, to this day, they cannot
detect contradicting requirements on their own, as shown in the third paper, and cannot be solely
relied upon when working in this domain. In the second and third papers, a self-written symbolic
AI is presented that heavily preprocesses the data so that an LLM can then analyze it.
2.2.3 Conclusion
While solutions already exist or can be programmed via symbolic AI for the first three layers of
the DIKW pyramid, as shown in the second paper, the main challenge lies in the top layer.
As stated by Awad (2003): 'Wisdom is the highest level of abstraction, with vision, foresight,
and the ability to see beyond the horizon.' In the preceding sections, two methods were
proposed to tackle this challenge, which could theoretically solve the issue. However, providing
a relation for every term in an ontology to any other term in that ontology is virtually impossible
since this would require an immense amount of manual work for each project. Large Language
Models possess this wisdom intrinsically and do not require individual training per project, as
demonstrated in the third paper. Therefore, the decision was made to use LLMs. The DIKW
pyramid adapted by Awad (see Figure 2-5) emphasizes that non-algorithmic solutions are
required for the upper levels of the pyramid, which excludes ontologies and suggests the use
of large statistical models.
Theoretical Embedding 11
Figure 2-5: Data, Information, and Knowledge (Awad 2003)
2.3 Propositional Logic
Propositional logic, or sentential logic, is a branch of mathematical logic that deals with
propositions or statements that are either true or false. This notation is used but not
scientifically introduced as underlying syntax in the first paper. The syntax uses symbols to
represent propositions and connectives to combine them into more complex statements. The
most basic connectives are shown in Table 2-1:
Table 2-1: Most basic connectives (Sack and Alam 2020)
Connective | Name        | Intentional meaning
¬          | negation    | not
∧          | conjunction | and
∨          | disjunction | or
→          | implication | if then
↔          | equivalence | if, and only if, then
'In propositional logic, the world consists simply of facts and nothing else (statements of
assertions).' (Sack and Alam 2020). For example, let 𝑝 be the statement 'It is raining' and 𝑞 be
the statement 'I am indoors'. Then, the negation of 𝑝 is ¬𝑝, which means 'It is not raining'. The
conjunction of 𝑝 and 𝑞 is 𝑝 ∧ 𝑞, which means 'It is raining and I am indoors'. The disjunction
of 𝑝 and 𝑞 is 𝑝 ∨ 𝑞, which means 'It is either raining or I am indoors, or both'.
Table 2-2 shows simple and composed assertions, which do not need to be true in the real
world.
Table 2-2: How to model facts? (Sack and Alam 2020)
Simple Assertion | Modeling
The moon is made of green cheese. | 𝑔
It rains. | 𝑟
The street is getting wet. | 𝑛
Composed Assertion | Modeling
If it rains, then the street will get wet. | 𝑟 → 𝑛
If it rains and the street does not get wet, then the moon is made of green cheese. | (𝑟 ∧ ¬𝑛) → 𝑔
Propositional logic can be used to analyze and reason about complex systems or situations by
breaking them down into smaller, formal, and more manageable objects and relations. It forms
the basis for more advanced forms of logic and reasoning, such as predicate logic and modal
logic.
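The connectives of Table 2-1 can be checked mechanically. The following is a minimal sketch in Python; the function names are illustrative assumptions and do not stem from Sack and Alam (2020). It enumerates the truth table for two propositions and evaluates the composed assertion (𝑟 ∧ ¬𝑛) → 𝑔 from Table 2-2.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only if a is true and b is false."""
    return (not a) or b


def equivalent(a: bool, b: bool) -> bool:
    """Equivalence: true if both sides have the same truth value."""
    return a == b


def composed_assertion(r: bool, n: bool, g: bool) -> bool:
    """(r and not n) -> g, the second composed assertion of Table 2-2."""
    return implies(r and not n, g)


def truth_table():
    """One row per assignment: (p, q, not p, p and q, p or q, p -> q, p <-> q)."""
    return [
        (p, q, not p, p and q, p or q, implies(p, q), equivalent(p, q))
        for p, q in product([True, False], repeat=2)
    ]


if __name__ == "__main__":
    for row in truth_table():
        print(row)
```

The composed assertion is only violated when it rains, the street stays dry, and the moon is not made of green cheese; in every other assignment it holds, which illustrates that a formula can be satisfied without its parts being true in the real world.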
3 Fundamental research on detecting contradictions in
requirements: Taxonomy and semi-automated approach
This article is an open access article distributed under the terms and conditions of the Creative
Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). The
content of this chapter was published:
Authors: Alexander Elenga Gärtner, Tu-Anh Fay, Dietmar Göhlich. Fundamental Research
on Detecting Contradictions in Requirements: Taxonomy and Semi-Automated Approach.
Publisher: Appl. Sci. 2022, 12(15), 7628 (This article belongs to the Special Issue
Requirements Engineering: Practice and Research)
DOI: https://doi.org/10.3390/app12157628
Published: 28 July 2022
Author Contributions: Conceptualization, A.E.G., D.G. and T.-A.F.; methodology, A.E.G.;
validation, A.E.G.; formal analysis, A.E.G.; investigation, A.E.G.; resources, A.E.G. and
D.G.; data curation, A.E.G.; writing (original draft preparation), A.E.G.; writing (review and
editing), D.G. and T.-A.F.; supervision, D.G.
3.1 Abstract
Requirements documents can contain several thousand individual requirements. They must
be error-free to avoid unnecessary complications and costs in the later product development
stages. An important part of this is to identify contradictions between two requirements. The
first step is therefore to define what contradictions are and in what form they can occur in
requirement documents. In this paper, the scientific theories regarding contradictions are
discussed with regard to their usefulness for the topic. In doing so, Aristotelian logic
proved to provide the best basis for an application in the Requirements Engineering context.
Based on this theory, we have created specific subtypes of contradictions to match them to
the requirements engineering field. The identification of these subtypes is done by a
formalization of the requirement sentences and a subsequent analysis by means of simple
questions. To validate the method, industrial requirement documents were searched for
contradictions. For each detected type of contradiction, we present an example of the detection
process. Thereby, we show that the method is easy to apply and may also be used by
non-specialists. Thus, our method provides a taxonomy as a basis for further research on
automated contradiction detection as well as on automated quality analysis of requirements
documents.
3.2 Introduction
Complete and error-free requirements specifications are crucial for effective product
development. One aspect of this is to ensure that the documents are contradiction-free. To
achieve this, contradictions must first be defined in the Requirements Engineering (RE)
context to recognize and classify them. Subsequently, the quality of the requirements
specification can be determined and, depending on the class of the contradiction, a solution
can be pursued.
3.2.1 Problem
Requirements form the basis for project planning, risk management, acceptance testing, and
many other fields (Dick 2017). Requirements specifications that describe an entire system are
often written in an interdisciplinary manner. The partial results must be merged according to
their logical and temporal dependencies, to form the overall solution. This process involves a
risk of error, especially for complex systems, regarding the consistency of the partial solutions
within the overall solution (Bender and Gericke 2021). Therefore, it is not surprising that errors,
e.g., in the form of contradictions, are often found in these documents.
Empirical research on requirements quality focuses on improvement techniques, with very few
primary studies addressing evidence-based definitions and evaluations of quality attributes
(Montgomery et al. 2022b).
3.2.2 Contribution
In this paper, a proposal is made on what contradictions are in the RE context, how they can
be classified, and how they can be determined. The classification of distinctive categories
allows for a more consistent assessment of the quality of requirements documents, as different
types of contradictions have different criticality levels. Also, depending on the type of
contradiction, different approaches are needed to solve them. Finally, the proposed
standardized solution provides a partially automated method, which is the basis for a potential
fully automated contradiction detection.
Within our validation, examples from real requirements documents are used.
3.3 Fundamentals
The market study from Luisa et al. concluded that in most cases (95%) requirements
documents were expressed in Natural Language (Luisa et al. 2004), which is inherently
ambiguous (Babcock 2007) and must therefore be interpreted to a certain degree. This
interpretation can often be an undetected source of errors.
In this paper, we focus on contradictions between pairs of text-based requirements in tabular
form. Below, we will specify how these requirements should be phrased and how contradictions
are generally defined.
3.3.1 Formulation and building blocks
Ideally, requirements should be based on a specific scheme (Dick 2017): Requirement
Expression = Boilerplate + Placeholder values. A boilerplate for a typical non-causal
requirement has the following form: “The <stakeholder type> must be able to <capability>.”
Another example of a boilerplate is: “If <operational condition (cause)>, the
<system> shall <function> not less than <quantity> <object>”, e.g., “If the fuel tank is empty, the
Flexray shall sustain communication not less than 1 h.” Simplified, sentences are built as
<Cause> + <Effect>, which in turn consist of variables and conditions, as shown in Figure 3-1:
Figure 3-1: Formulation and building blocks
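The scheme "Boilerplate + Placeholder values" and the split into <Cause> + <Effect> can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `CAUSAL_BOILERPLATE`, `fill_boilerplate` and `split_cause_effect` are hypothetical, the assignment of "sustain communication" to <function> and "h" to <object> is our reading of the example sentence, and the regular expression only covers sentences that start with "If ..., ".

```python
import re
from typing import NamedTuple, Optional

# Hypothetical template mirroring the causal boilerplate quoted above.
CAUSAL_BOILERPLATE = (
    "If {cause}, the {system} shall {function} "
    "not less than {quantity} {obj}."
)


class Requirement(NamedTuple):
    cause: Optional[str]  # None for non-causal requirements
    effect: str


def fill_boilerplate(cause: str, system: str, function: str,
                     quantity: str, obj: str) -> str:
    """Insert placeholder values into the causal boilerplate."""
    return CAUSAL_BOILERPLATE.format(
        cause=cause, system=system, function=function,
        quantity=quantity, obj=obj,
    )


def split_cause_effect(sentence: str) -> Requirement:
    """Split 'If <cause>, <effect>' sentences; anything else is non-causal."""
    match = re.match(r"^If (?P<cause>.+?), (?P<effect>.+)$", sentence)
    if match:
        return Requirement(match["cause"], match["effect"])
    return Requirement(None, sentence)


if __name__ == "__main__":
    text = fill_boilerplate("the fuel tank is empty", "Flexray",
                            "sustain communication", "1", "h")
    print(text)
    print(split_cause_effect(text))
```

Requirements that do not follow a boilerplate (for instance with the condition at the end of the sentence) would need additional patterns, which is one reason why the formulation guidelines matter for any later automation.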
3.3.2 Contradictions
In this paper, we differentiate between the term contradiction, which occurs when a statement
is in opposition either with itself or an established fact, and the term contradictory, which we
will explain below. In the literature on Requirements Engineering, many definitions of
contradictions can be found, see Section 3.4 Related Work. To get the most generically valid and
scientifically accepted definition, we base our theory on the logical philosophy of Aristotle. The
foundation of his logic (also known as term logic, traditional logic, or formal logic), developed
in his work Metaphysics, is the law of non-contradiction (LNC) (Horn 2018). There, he argues
that it is impossible that the same thing belongs and does not belong simultaneously in an
identical way to the same object (Aristoteles 1986). “The doctrine of the square of opposition
[as seen in Figure 3-2; note by the author] originated with Aristotle in the fourth century BC
and has occurred in logic texts ever since. Although severely criticized in recent decades, it is
still regularly referred to” (Parsons 2021) and will hence serve as a basis for our purposes.
Figure 3-2: Square of opposition
The relations can be described as follows:
“Every S is P” and “Some S is not P” are contradictories.
“No S is P” and “Some S is P” are contradictories.
“Every S is P” and “No S is P” are contraries.
“Some S is P” and “Some S is not P” are subcontraries.
“Some S is P” is a subaltern of “Every S is P”.
“Some S is not P” is a subaltern of “No S is P”.
Therefore, we have four main oppositions: contradictories, contraries, subcontraries, and
subalterns.
1. Contradictory opposites, e.g., “he is sick” / “he is not sick”, are mutually exhaustive and
mutually inconsistent. This means that one statement must be true and the other false
or vice versa. They cannot both be true or both be false at the same time.
2. Contrary opposites, e.g., “it is black” / “it is white”, are also mutually inconsistent, but
not exhaustive. While they cannot both be true, they can both be false.
3. Subcontraries, e.g., “if you want to, you can call in sick” / “if you want to, you can
not call in sick”, are mutually consistent. While they can both be true at the
same time, they cannot both be false at the same time.
4. The statement “some people are sick” is the subaltern of “everybody is sick”, while the
latter is the superaltern of the former. If the superaltern is true, the subaltern must also
be true and if the subaltern is false, the superaltern must also be false.
By these definitions, the four central kinds of opposition (contradictory, contrariety,
subcontrariety, and subaltern) are mutually inconsistent.
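The relations of the square of opposition can be verified on small finite models. The sketch below is an illustration only; it assumes that the subject term S is non-empty (the existential import of the traditional square) and uses hypothetical function names. It represents S and P as sets over a three-element universe and checks all six relations listed above.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)


def subsets(items):
    """All subsets of items as frozensets."""
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


def forms(s, p):
    """Truth values of the four categorical statements for sets s and p."""
    every = s <= p            # "Every S is P"
    no = not (s & p)          # "No S is P"
    some = bool(s & p)        # "Some S is P"
    some_not = bool(s - p)    # "Some S is not P"
    return every, no, some, some_not


def check_square() -> bool:
    """Check the relations of the square for every non-empty S and every P."""
    for s in subsets(UNIVERSE):
        if not s:
            continue  # assumption: S is not empty (existential import)
        for p in subsets(UNIVERSE):
            every, no, some, some_not = forms(s, p)
            if every == some_not or no == some:
                return False  # contradictories must differ in truth value
            if every and no:
                return False  # contraries cannot both be true
            if not some and not some_not:
                return False  # subcontraries cannot both be false
            if (every and not some) or (no and not some_not):
                return False  # a true superaltern implies a true subaltern
    return True


if __name__ == "__main__":
    print(check_square())
```

If the non-emptiness assumption is dropped, "Every S is P" and "No S is P" are both true for an empty S, which is one of the modern criticisms of the traditional square.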
In addition to the contradictions considered so far, there are other types. Kesselring
differentiates between the Aristotelian LNC-contradictions, dialectic contradictions, and
antinomies (Kesselring 2013).
Dialectic contradictions are comparable to antagonisms or so-called “conflicts of goals”. For
example:
The vehicle must have high performance
The vehicle must have low consumption
They don’t stand in a mathematical/logical conflict but are incompatible in practice.
Antinomies denote conceptual or propositional structures in which the truth value oscillates. A
famous example is: Plato says, "Socrates speaks the truth," and Socrates says, "Plato lies."
They are often confused with self-contradictions (Kesselring 2013).
In this paper, we will tackle the LNC conflicts, except for the subcontraries. As they can be
valid simultaneously, subcontraries are not contradictions that need to be resolved for the RE
work.
3.4 Related Work
In this section, we assemble related works on the classification of conflicts, the detection of
antagonisms, natural language processing for detecting conflicts, and finally ontology-based
approaches for detecting conflicts. In general, we saw a lack of real validation in these topics,
as it is difficult to find non-academic institutions that are willing to share their requirements
documents for scientific analyses (Landhäußer and Körner 2017).
3.4.1 Classification of conflicts
Marneffe et al. suggest a classification of conflicts into antonymy, negation, or numeric
mismatches (Marneffe, Rafferty, Manning 2008). Negations and numeric mismatches do not
fit the LNC classification of Aristotle and can only be partially combined with it.
Lamsweerde et al. classify conflicts into nine categories (Lamsweerde, Darimont, Letier
1998):
1. Process-Level Deviation: Conflict between a process-level rule and a specific process
state.
2. Instance-Level Deviation: Inconsistency between a product-level requirement and a
specific state of the running system.
3. Terminology Clash: Usage of different terms for the same event
4. Designation Clash: Usage of the same term for different events
5. Structure Clash: Different explanations for a single real-world concept
6. Conflict: Two assertions are directly logically inconsistent
7. Divergence: Two assertions are indirectly (through a boundary condition) logically
inconsistent
8. Competition: Particular case of the divergence
9. Obstruction: Another particular case of the divergence
This does not represent a classification with a consistent structure, since the system level is
used as the classification criterion for some categories and the context for others. Also,
categories 7, 8, and 9 cannot be clearly differentiated.
Marneffe et al. propose a looser classification than ours. "Pairs such as 'Sally sold a boat to
John' and 'John sold a boat to Sally' are tagged as contradictory" (Marneffe, Rafferty, Manning
2008). In the context of requirements engineering though, this should not be interpreted as a
contradiction. This becomes clear in the following example: "Control unit 1 sends a signal to
control unit 2. Control unit 2 sends a signal to control unit 1." It becomes more complicated
when it says, "Control unit 1 sends the signal X to control unit 2. The control unit 2 sends the
signal X to the control unit 1.” This would indeed be a contradiction, but it would be classified
as a dialectic contradiction: theoretically, it is possible, but it wouldn't make any practical sense.
Guo et al. propose a classification into three basic conflict types - inconsistencies, inclusions,
and interlocks - which in turn can be divided into seven subcategories (Guo et al. 2021).
Inconsistencies are defined as contradictions between requirements that cannot both be
fulfilled at the same time. Compared to LNC, this could correspond to contradictories or
contraries as well as dialectical contradictions. Inclusions can correspond to both
contradictories and contraries. Interlocks can be compared with subalterns. This represents a
promising approach and to a large extent can be combined with LNC. In Section 3.5.2
Contradictions Subcategories, parts of this classification are taken up, placed in the logical
context, and further refined.
3.4.2 Natural Language Processing for detecting conflicts
According to Zhao et al., most of the studies (67.08%) in the Natural Language Processing
domain combined with RE are “solution proposals, assessed by a laboratory experiment or an
example application, while only a small percentage (7%) are assessed in industrial settings”
(Zhao et al. 2021), so they rarely have a practical validation.
However, to the best of our knowledge, no Machine-Learning oriented studies tried to classify
or find contradictions. Many papers deal with classifying requirements, for example in
functional and non-functional requirements (Kurtanovic and Maalej 2017) or in security-related
requirements (Jindal et al. 2016).
3.4.3 Ontologies for detecting conflicts
Guarino et al. describe computational ontologies as “means to formally model the structure
of a system, i.e., the relevant entities and relations that emerge from its observation.” (Guarino
et al. 2009). This requires mapping statements to concepts and relationships.
Inconsistencies and opposing elements can be recognized this way (Sandhu and Sikka 2015).
This shows that ontology-based methods are not easy to apply. A certain amount of
preparatory work is needed, including system knowledge. The resulting advantage is that not
only LNC contradictions, but also dialectic contradictions can be detected.
In this paper, however, we want to focus on LNC contradictions and lay the foundation for
automatically finding contradictions in the future, without requiring system knowledge or
preparatory work. Neither is expedient with ontologies.
3.5 Method for detecting Contradictions
The findings from Section 3.3 Fundamentals can be summarized as shown in Figure 3-3,
while dialectical contradictions, antinomies, and subcontraries, as already explained, will not
be considered:
Figure 3-3: Contradictions
In the following, we present a formal method by which contradictions can be identified and
classified. To have a meaningful application in the RE context, we first must subdivide the LNC
contradictions. Since many requirements are not just simple statements, we have added the
principle of cause and effect to LNC, as this is not specifically represented in this theory. With
this, our categories are still based on Aristotle but adapted to the requirements context. In
Section 3.6, they will be validated with examples from the automotive sector.
3.5.1 Nomenclature
First, we must define a nomenclature, to be able to refer to it in the following sections.
Capital letters such as $A$, $B$, $C$, and $D$:
- Are events that represent, for example, conditions
- Are always unequal
- Can occur simultaneously
- Do not depend on each other
Lowercase letters $x$ and $y$:
- Are variables
Lowercase letters $c$ and $k$:
- Are parameters
- Are unequal to each other
- Can occur in parallel: $c$ can be equal to 1 and at the same time $k$ equal to 2.
Operators:
- $\overset{!}{=}$; $\overset{!}{<}$; $\overset{!}{>}$: must be equal, must be smaller, must be bigger
- $\rightarrow$: implies; if... then. E.g., $A \rightarrow x \overset{!}{=} 1$ translates to “If $A$ is true, then $x$ must be 1”.
- $\neg$: not. E.g., the statement $\neg A$ is true if and only if $A$ is false.
- $\wedge$; $\vee$: and; or. E.g., the statement $A \wedge B$ is true if $A$ and $B$ are both true; otherwise, it is
false. Another example: the statement $A \vee B$ is true if $A$ or $B$ (or both) are true; if
both are false, the statement is false.
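For later automation, the nomenclature can be mapped to a small data structure. The following Python sketch is one possible representation and not part of the original method; the class name `Formalized`, its fields, and the textual rendering with `->` and `=!` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The three "must be" operators of the nomenclature.
ALLOWED_OPERATORS = ("=", "<", ">")


@dataclass(frozen=True)
class Formalized:
    """One formalized requirement: an optional cause and an effect."""
    variable: str                # e.g. "x"
    operator: str                # one of "=", "<", ">"
    value: str                   # parameter such as "k", "c" or "1"
    cause: Optional[str] = None  # event such as "A"; None if non-causal

    def __post_init__(self):
        if self.operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unknown operator: {self.operator!r}")

    def render(self) -> str:
        """Plain-text form, e.g. 'A -> x =! 1' for 'if A, x must be 1'."""
        effect = f"{self.variable} {self.operator}! {self.value}"
        if self.cause is None:
            return effect
        return f"{self.cause} -> {effect}"


if __name__ == "__main__":
    print(Formalized("x", "=", "1", cause="A").render())
    print(Formalized("x", "<", "c").render())
```

Keeping the cause optional reflects the distinction between non-causal and causal requirements that the subcategories in the next section build on.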
3.5.2 Contradictions Subcategories
The suggested categories are shown in Figure 3-4:
Figure 3-4: Classification of contradictions
The terms Simplex, Idem, and Alius are described below. It should be noted that requirements
can be formulated as “condition + conclusion” as well as inverted as “conclusion + condition”.
Simplex (lat. = simple) refers to contradicting requirements without conditions (non-causal):
The car must be red: $x \overset{!}{=} k$.
The car must be blue: $x \overset{!}{=} c$.
Idem (lat. = same) refers to contradicting causal requirements with the same conditions or
pairs where only one requirement has a condition:
If the customer wishes, the car must be red: $A \rightarrow x \overset{!}{=} k$.
If the customer wishes, the car must be blue: $A \rightarrow x \overset{!}{=} c$.
Alius (lat. = different) refers to contradicting causal requirements with different conditions
(causal):
The car must be red if the customer wishes it to be: $A \rightarrow x \overset{!}{=} k$.
The car must be blue if the car has four doors: $B \rightarrow x \overset{!}{=} c$.
The last example is a contradiction because the conditions of both requirements can be fulfilled
at the same time (they are independent of each other) and the conclusions would then
contradict each other. Also, this is an example where the order of condition and conclusion
has been inverted.
If one requirement is non-causal and the other is causal, the contradiction as a whole is said
to be causal.
"Contradictory" refers only to the effect, not to the requirement as a whole. If the effects are
contradictory but the requirements as a whole would be contrary, we still refer to them as
contradictory here.
The following Table 3-1 lists all types of contradiction in their formalized form. The column
“Multiple condition” shows examples of formalized requirements with multiple conditions. It is
not an exhaustive list of all possible multiple conditions:
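Table 3-1: Formalized contradictions
Contradictions | Type | Simple | Examples for multiple conditions
Contradictory | Simplex | $x \overset{!}{=} k$; $x \overset{!}{=} \neg k$ | -
Contradictory | Idem | $A \rightarrow x \overset{!}{=} k$; $A \rightarrow x \overset{!}{=} \neg k$ | $A\ \_\ B \rightarrow x \overset{!}{=} k$; $A\ \_\ B \rightarrow x \overset{!}{=} \neg k$
Contradictory | Alius | $A \rightarrow x \overset{!}{=} k$; $B \rightarrow x \overset{!}{=} \neg k$ | $A\ \_\ B \rightarrow x \overset{!}{=} k$; $A\ \_\ C \rightarrow x \overset{!}{=} \neg k$
Contrary | Simplex | $x \overset{!}{=} c$; $x \overset{!}{=} k$ | -
Contrary | Idem | $A \rightarrow x \overset{!}{=} c$; $A \rightarrow x \overset{!}{=} k$ | $A\ \_\ B \rightarrow x \overset{!}{=} c$; $A\ \_\ B \rightarrow x \overset{!}{=} k$
Contrary | Alius | $A \rightarrow x \overset{!}{=} c$; $B \rightarrow x \overset{!}{=} k$ | $A\ \_\ B \rightarrow x \overset{!}{=} c$; $C\ \_\ D \rightarrow x \overset{!}{=} k$
Subaltern | Simplex | $x \overset{!}{<} c + k$; $x \overset{!}{<} c$ | -
Subaltern | Idem | $A \rightarrow x \overset{!}{<} c + k$; $A \rightarrow x \overset{!}{<} c$ | $A\ \_\ B \rightarrow x \overset{!}{<} c + k$; $A\ \_\ B \rightarrow x \overset{!}{<} c$
Subaltern | Alius | $A \rightarrow x \overset{!}{<} c + k$; $B \rightarrow x \overset{!}{<} c$ | $A\ \_\ B \rightarrow x \overset{!}{<} c + k$; $C\ \_\ D \rightarrow x \overset{!}{<} c$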
If a condition is composed of two or-statements, it can be split into two sentences as an
intermediate step. This will be shown in Section 3.6.2.4 Alius Contrary. Each partial cause can
then be considered separately with the effect. From $A \vee B \rightarrow x \overset{!}{=} c$ follows
$A \rightarrow x \overset{!}{=} c$ and $B \rightarrow x \overset{!}{=} c$.
This facilitates the comparison of requirements that consist of compound conditions.
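That splitting an or-condition does not change the meaning can be confirmed with a truth table: (A ∨ B) → E holds exactly when both A → E and B → E hold. The sketch below checks this for all eight assignments and shows a splitting helper; the function names and the list-based representation of causes are illustrative assumptions.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only if a is true and b is false."""
    return (not a) or b


def split_or_condition(causes, effect):
    """Turn 'cause_1 or cause_2 or ... -> effect' into one pair per cause."""
    return [(cause, effect) for cause in causes]


def splitting_is_equivalent() -> bool:
    """Check (A or B) -> E  ==  (A -> E) and (B -> E) for all assignments."""
    return all(
        implies(a or b, e) == (implies(a, e) and implies(b, e))
        for a, b, e in product([True, False], repeat=3)
    )


if __name__ == "__main__":
    print(split_or_condition(["A", "B"], "x =! c"))
    print(splitting_is_equivalent())
```

The same equivalence does not hold for and-conditions: from (A ∧ B) → E it does not follow that A → E, so only or-conditions can be split in this way.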
3.5.3 Process
The following Figure 3-5 shows how the types of contradiction can be recognized based on
simple, but specific questions.
Figure 3-5: Process overview
The first three questions refer to the effects of the requirements to be compared. The following
three questions refer to the causes, if any. The questions are elaborated on below. For a
contradiction to be identified, all questions must be answered as specified in the corresponding
column. The check mark stands for "yes" and the cross for "no". The circle stands for questions
that do not apply in that case. Condition 1 and condition 2 are the respective conditions of the
effects of requirement 1 and requirement 2. The same applies for cause 1 and cause 2.
Effect-related questions:
1. Are the variables from effect 1 and effect 2 the same or a subset of each other?
Two statements can contradict each other in the sense of LNC only if the variables, i.e. the
object in question, are the same or one is a part of the other, for example, table and table leg.
2. Does one effect include the other one?
If one condition includes the other, it could be a subaltern contradiction, for example, "...
between 15m and 30m" and "... between 20m and 22m". The range of the second statement
is included in the range of the first statement and therefore the former is the superaltern of the
latter.
3. Are effect 1 and effect 2 mutually exhaustive and mutually inconsistent?
This question aims at finding contradictory opposites, for example, “the car is ready” / “the car
is not ready”. If one is true, the other must be false and vice versa.
Cause-related questions:
4. Is there a condition?
If there is a condition, any form of Simplex-contradiction can be excluded.
5. Can cause 1 occur at the same time as cause 2?
Two statements can only contradict each other if they can theoretically occur at the same time.
The statements “If it rains, …” and “If it does not rain, …” cannot occur at the same time and
are therefore not contradicting each other. If “it rains” in the first statement were to be replaced
with “it’s hot outside”, the two statements could theoretically contradict each other.
6. Are cause 1 and cause 2 the same?
This question simply aims at detecting Idem-contradictions, which must have the same cause,
for example, “If I am here, you are there” and “If I am here, you are here”.
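The six questions can be read as a small decision procedure. The following Python sketch is one possible reading and not the authoritative answer table of Figure 3-5, which is not reproduced here. It assumes that the answers are given by a human reviewer, that question 2 is evaluated before question 3, and that a pair whose effects concern the same variable but are neither inclusive nor mutually exhaustive has already been judged to be mutually inconsistent (contrary). All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Answers:
    same_variable: bool                        # question 1
    one_effect_includes_other: bool            # question 2
    effects_exhaustive_and_inconsistent: bool  # question 3
    has_condition: bool                        # question 4
    causes_can_coincide: Optional[bool] = None  # question 5, None = not applicable
    causes_identical: Optional[bool] = None     # question 6, None = not applicable


def classify(a: Answers) -> Optional[str]:
    """Return the contradiction type, or None if the pair is no contradiction."""
    if not a.same_variable:
        return None  # different objects cannot contradict in the sense of LNC
    if a.one_effect_includes_other:
        kind = "Subaltern"
    elif a.effects_exhaustive_and_inconsistent:
        kind = "Contradictory"
    else:
        kind = "Contrary"  # assumption: the effects are known to be inconsistent
    if not a.has_condition:
        return f"Simplex {kind}"
    if not a.causes_can_coincide:
        return None  # causes exclude each other, e.g. "rains" / "does not rain"
    prefix = "Idem" if a.causes_identical else "Alius"
    return f"{prefix} {kind}"


if __name__ == "__main__":
    # "within 1000 ms" versus "within 800 ms", no conditions
    print(classify(Answers(True, True, False, False)))
```

A usage example for a causal pair: `classify(Answers(True, False, False, True, True, False))` yields "Alius Contrary", while the same answers with `causes_can_coincide=False` yield no contradiction.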
3.6 Materials and Results
In this section, first, the underlying data set for the validation is explained. Afterward, the
contradiction types defined above are validated using an example from the dataset, if so found.
The dataset was analyzed by hand. The document was read through to find all existing
contradictions. Not only contradictions, but also duplicates, repetitions, ambiguities and other
conflicts were found. For a complete and automated application over a large data set, see
Section 3.8 Conclusion.
3.6.1 Materials
The data set consists of several interrelated requirements documents. The originator is the
company IAV GmbH (Berlin, Germany), which was kind enough to make the documents
available. The goal was to create a complete requirements package for the development of E-
buses, which are in use today. The document consists of about 3500 functional and non-
functional requirements, from system to software level. The original language of the documents
is German and was translated to English. For confidentiality issues, signal names are
anonymized by using square brackets.
3.6.2 Results
Contradictory connections are counted as one contradiction. In other words: every
contradictory pair is counted as a single contradiction.
From a total of 6500 objects, 3500 were requirements. Besides the above-mentioned other
conflicts, 49 (1.35%) LNC-contradictions were found. However, it should be noted that not all
contradictions were evenly distributed across all levels. 46 of the 49 contradictions were found
at the software level, where they account for 2.53% of all requirements. The distribution of the
different contradiction types is displayed in Table 3-2.
Table 3-2: Distribution
Simplex Subaltern | Alius Subaltern | Alius Contradictory | Alius Contrary
4 | 3 | 2 | 40
These figures must be viewed with caution, as the analysis was done manually, and it is likely
that further inconsistencies were overlooked.
We did not find any contradictions of the following types: Simplex and Idem Contradictories,
Idem Contraries, and Idem Subalterns. This will be reflected on in Section 3.7 Discussion. In the
following, we explain the method using one example each from the requirements documents.
3.6.2.1 Simplex Subaltern
The two selected requirements are:
1. The safe state must be reached within 1000 ms.
2. The safe state must be reached within 800 ms.
The building blocks are shown in Figure 3-6:
Figure 3-6: Building blocks for a Simplex Subaltern contradiction, consisting of two
requirements
Its formalized form is:
$$x \overset{!}{<} k \tag{1}$$
$$x \overset{!}{<} c \tag{2}$$
$$\text{while } c < k \tag{3}$$
where “safe state” is $x$, “1000 ms” is $k$ and “800 ms” is $c$.
The questions presented in our methodology can then be answered as shown in Figure 3-7:
Figure 3-7: Process for Simplex Subaltern
The questions were answered as given in the column for Simplex Subaltern contradictions.
The last two cause-questions did not need to be answered, because the Simplex-
contradictions do not have causes.
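The inclusion test behind subaltern contradictions can be sketched with intervals. The code below is an illustration under assumptions: "within 1000 ms" is modeled as the closed interval from 0 to 1000, and the names `Interval`, `includes` and `superaltern` are hypothetical. Following the explanation of question 2, the wider range is treated as the superaltern of the narrower one.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    low: float
    high: float


def includes(outer: Interval, inner: Interval) -> bool:
    """True if the inner range lies completely within the outer range."""
    return outer.low <= inner.low and inner.high <= outer.high


def superaltern(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the including (wider) range, or None if neither includes the other."""
    if a == b:
        return None  # identical ranges are duplicates, not subalterns
    if includes(a, b):
        return a
    if includes(b, a):
        return b
    return None


if __name__ == "__main__":
    within_1000_ms = Interval(0, 1000)  # assumption: "within" starts at 0
    within_800_ms = Interval(0, 800)
    print(superaltern(within_1000_ms, within_800_ms))
```

For overlapping ranges such as 0 to 10 and 5 to 20, the function returns None, since neither effect includes the other and question 2 would be answered with "no".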
3.6.2.2 Alius Subaltern
The two selected requirements are:
1. If the actual heater stage CbnHeatg_[…] > 0, the requested pump power
CbnHeatg_SpOfCooltPmp must be limited by the parameter
CbnHeatg_TrigForDutyCycOf[…].
2. If BattChrgnMngt_MsgVld[…] = false, the requested pump power
CbnHeatg_SpOfCooltPmp must be limited to 20%.
The parameter CbnHeatg_TrigForDutyCycOf[...] is initialized elsewhere with 80%. Therefore,
we have a similar case as above, only this time there are conditions. The building blocks are
shown in Figure 3-8:
Figure 3-8: Building blocks for an Alius Subaltern contradiction, consisting of two
requirements
And results in:
$$A \rightarrow x \overset{!}{<} k \tag{4}$$
$$B \rightarrow x \overset{!}{<} c \tag{5}$$
$$\text{while } c < k \tag{6}$$
where “If the actual heater stage CbnHeatg_[…] > 0” is $A$, “If BattChrgnMngt_MsgVld[…] =
false” is $B$, “CbnHeatg_SpOfCooltPmp” is $x$, “CbnHeatg_TrigForDutyCycOf[…]” is $k$ and
“20%” is $c$.
The questions presented in our methodology can then be answered as shown in Figure 3-9:
Figure 3-9: Process for Alius Subaltern
3.6.2.3 Alius Contradictory
It gets more complicated when getting to the following contradictories:
1. Suitable potential equalization is required for all conductive covers or housings of all
HV components.
2. If additional external conductive sheaths or covers are fitted over covers or enclosures
consisting of solid insulating materials, equipotential bonding is not required for these.
By considering the context, it becomes clear that the demonstrative “these” in the second
sentence is a variable $y$. It refers to “covers or housings consisting of insulating materials” and
not to “covers or housings” or “solid insulating materials”. However, the variable $x$ of the first
sentence is “covers or housings”, which means that $y \subset x$.
Therefore, the building blocks are as shown in Figure 3-10:
Figure 3-10: Building blocks for an Alius Contradictory contradiction, consisting of two
requirements
The formalized form is:
$$x \overset{!}{=} k \tag{7}$$
$$A \vee B \rightarrow y \overset{!}{=} \neg k \tag{8}$$
$$\text{while } y \subset x \tag{9}$$
where “conductive covers or housings of all HV components” is x, “potential equalization” is k,
“additional external conductive covers or housings are fitted over covers” is A and “housings
consisting of solid insulating materials” is B. “these” is y and is actually a subset of x. It denotes
“conductive covers or housings of all HV components with additional external conductive
covers or housings fitted over covers or housings consisting of solid insulating materials”.
After the formalized form has been determined, filling in the table works as usual, as shown in
Figure 3-11:
Figure 3-11: Process for Alius Contradictory
3.6.2.4 Alius Contrary
The two selected requirements are:
1. If the value of the signal ComVehFrnt_ChrgnCur[…] exceeds the value of 0 (A), the
signal Chrgn[…] must be set to TRUE.
2. If the parameter ChrgnCurChk_SubVal[…] is set to TRUE, the signal Chrgn[…]
corresponds to the parameterizable value ChrgnCurChk_SubValChrgn[…], otherwise,
the signal is forwarded unchanged.
The second condition of the second requirement should be transferred to a separate
requirement to apply this method. The second requirement thus splits into two requirements,
each of which can be checked separately against other requirements for contradictions.
Accordingly, our customized requirements look like this, while we will be using 2.1 in the
further analysis:
2.1. If the parameter ChrgnCurChk_SubVal[…] is set to TRUE, the signal Chrgn[…]
corresponds to the parameterizable value ChrgnCurChk_SubValChrgn[…].
2.2. If the parameter ChrgnCurChk_SubVal[…] is not set to TRUE, the signal Chrgn[…] is
forwarded unchanged.
Then, the building blocks are as shown in Figure 3-12:
Figure 3-12: Building blocks for an Alius Contrary contradiction, consisting of two
requirements
The formalized form results in:
$$A \rightarrow x \overset{!}{=} c \tag{10}$$
$$B \rightarrow x \overset{!}{=} k \tag{11}$$
where “the value of the signal ComVehFrnt_ChrgnCur[…] exceeds the value of 0 (A)” is $A$, “the
parameter ChrgnCurChk_SubValForChrgn[…] is set to TRUE” is $B$, “Chrgn[123]” is $x$,
“TRUE” is $c$ and “ChrgnCurChk_SubValChrgn[…]” is $k$.
The questions presented in our methodology can then be answered as shown in Figure 3-13:
Figure 3-13: Process for Alius Contrary
3.7 Discussion
It is important to note that we did not find any Idem-contradictions and only one
Simplex-contradiction. Idem-contradictions are so conspicuous that the requirements engineer would
probably notice them immediately, since they would have to formulate exactly the same cause
twice with the same variables but different effects. Simplex-contradictions are rare because
the examined system is so complex that simple statements without
conditions would not be sufficient to describe the system precisely.
Besides the reasons mentioned, mistakes affecting internal validity could play a role in not
finding certain contradiction types. The project documents contain about 3500 requirements, which
yields $\sum_{k=0}^{n-1} k = 6\,123\,250$ theoretical pairwise combinations for $n = 3500$.
We therefore cannot rule out the possibility that
we missed Idem- or Simplex-contradictions.
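The number of theoretical combinations follows from comparing each requirement with every other requirement exactly once. The following is a minimal sketch of that arithmetic, assuming the rounded count of n = 3500 requirements; the function name is illustrative and not part of the study.

```python
from math import comb


def pairwise_comparisons(n: int) -> int:
    """Number of unordered requirement pairs: the sum of k for k = 0 .. n-1."""
    return sum(range(n))


n = 3500  # approximate number of requirements stated in the text
total = pairwise_comparisons(n)

# The closed form n * (n - 1) / 2 and the binomial coefficient give the same value.
assert total == n * (n - 1) // 2 == comb(n, 2)
print(total)  # 6123250
```

The quadratic growth of this number is the reason a purely manual pairwise review cannot be expected to be exhaustive.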
If the requirements are not formulated according to the guidelines, borderline cases can
certainly occur in which contradictions cannot be clearly assigned or even identified. This is
because language is often ambiguous and human interpretation is often needed. When it
comes to complex formulations, even common sense can reach its limits.
3.8 Conclusion
Especially in the early development phase, ambiguities are very common in requirements
documents due to the use of natural language. In this paper, we examined contradictory
requirements, which we defined using formal logic. In contrast to other papers, we did not
classify contradictions according to our data set or our code, but according to a generally
accepted, well-tested systematic model. Then, we created a classification tailored to RE, in
which conditions and effects take a prominent role. Finally, we proposed a way to identify
these contradictions using clear questions.
We have analyzed about 6500 objects, approximately 3500 of which were requirements. In
total, we were able to identify many different conflicts, 49 of which were LNC-related
contradictions that could be identified using our method. The majority of the detected
contradictions were of the type Alius Contrary. Furthermore, most of the contradictions were found
at the deeper system levels, namely those of the software requirements. This corresponds to
our expectations, since requirements on the higher levels are written less concretely and
describe the general functionality of the product. As a result, there is often no risk of
contradictions in the first place.
With our method, contradictions can be found in a semi-automated way: the classification into
cause and effect, as well as into variable and condition, can be fully automated, for example by using
Fischbach's parser (Fischbach et al. 2022). This way, our method can be applied automatically
up to the step "Building Blocks". In the subsequent formalization, the building blocks must be
replaced by symbols and formulas. This step is not automated and, to the best of our
knowledge, there are currently no methods available that allow for this. Therefore, further
research is required, as mentioned in section 4.2 Future work. Once the formalization is done,
answering the questions in Figure 3-5 is a simple yet manual task. This was shown
with examples in section 5.2 Results.
When applying our method, a requirements reviewer no longer has to be familiar with
requirements in general or with the topic of the document to recognize contradictions,
as our method provides a simple recipe for detecting LNC-related contradictions.
Future work could entail automation, quality analysis and non-LNC-contradictions.
Requirements documents can become very extensive due to the necessary level of detail
(Göhlich and Fay 2021b). Therefore, an automated determination of contradictions would be
useful. The formalization of contradictions proposed in this paper provides strong implications
for automation, by serving as the basis for a fully automated contradiction-detection method.
The queries that would have to be made in such a code are already mathematically formulated
here.
We can also derive implications for an automated quality analysis. The classification into
different types of contradictions is an important step to quantify the quality of a requirements
document. The logical next step would be to assess the criticality of the contradiction. Based
on this, a meaningful key performance indicator could be determined. This would require
analyzing a large number of inconsistencies, to assess the impact on the product, as well as
any different resolution methods per type of contradiction. The greater the impact and the more
difficult the solution, the more critical the contradiction.
As we saw in section 1.4 Fundamentals, there are other types of contradictions besides
LNC-contradictions that have not been discussed in this paper: dialectic contradictions and
antinomies. In our opinion, dialectic contradictions cannot be detected by applying simple
rules; instead, they require context and language comprehension. It might be possible to
achieve results with a sufficiently large and clean data set and by using machine learning
algorithms. Regarding antinomies, it should first be checked whether they occur at all in
requirements documents. A solution for these contradictions would presumably be similar to the
solution for dialectic contradictions.
3.9 Other
Author Contributions: “Conceptualization, A.E.G., D.G. and T.-A.F.; methodology, A.E.G.;
validation, A.E.G.; formal analysis, A.E.G.; investigation, A.E.G.; resources, A.E.G. and D.G.;
data curation, A.E.G.; writing - original draft preparation, A.E.G.; writing - review and editing,
D.G. and T.-A.F.; supervision, D.G. All authors have read and agreed to the published version
of the manuscript.” Please turn to the CRediT taxonomy for the term explanation.
Funding: We acknowledge support by the German Research Foundation and the Open Access
Publication Fund of TU Berlin.
Data Availability Statement: “Not applicable”.
Conflicts of Interest: “The authors declare no conflict of interest.”
Automated Condition Detection in Requirements Engineering 29
4 Automated Condition Detection in Requirements Engineering
This article has been published in Proceedings of the Design Society
https://doi.org/10.1017/pds.2023.71. This version is free to view and download for private
research and study only. Not for re-distribution or re-use. © Alexander Elenga Gärtner. This is
an Open Access article, distributed under the terms of the Creative Commons Attribution-
NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Authors: Alexander Elenga Gärtner, Dietmar Göhlich, Tu-Anh Fay. Automated Condition
Detection in Requirements Engineering.
Publisher: Cambridge University Press
DOI: https://doi.org/10.1017/pds.2023.71
Published: 19 June 2023
Author Contributions: Conceptualization, A.E.G., D.G. and T.-A.F.; methodology, A.E.G.;
validation, A.E.G.; formal analysis, A.E.G.; investigation, A.E.G.; resources, A.E.G.; data
curation, A.E.G., D.G. and T.-A.F.; writing - original draft preparation, A.E.G.; writing -
review and editing, D.G. and T.-A.F.; supervision, D.G. and T.-A.F.
4.1 Abstract
In product development, it is of great importance that a complete, unambiguous, and, as far as
possible, contradiction-free target system is defined. Requirements documents of complex
systems can contain several thousand individual requirements, derived in an interdisciplinary
manner and written in natural language by many different stakeholders. Hence, errors, in the
form of contradictions, cannot be completely avoided in these documents and today they must
be corrected manually with high effort.
This paper presents an important building block for automated contradiction detection and
quality analysis of requirements documents. We discuss the necessary identification of
conditions in requirements and the extraction of the verbal expressions associated with
condition and effect, respectively. We applied and analyzed natural language processing
methods based on grammatical versus machine learning models. The models have been
applied to 1,861 real-world requirements. Both approaches generate promising results, with
an accuracy partly over 98%. However, in structured specification texts, a grammatical model
is preferable due to lower effort in preprocessing and better usability.
4.2 Introduction
Requirements play a central role in the development process of virtually all products
(Loucopoulos 2005; Gericke and Blessing 2012) because they are usually a central basis of
communication when solutions are developed by several involved persons, areas or even
companies in parallel or successively (VDI-Guideline VDI 2221 Blatt 1:2019-11). It is a
widespread practice that requirements documents are formulated in a specification sheet,
which should contain the totality of the requirements within a development project (DIN DIN
69901-5:2009-01). In this context, it is of great importance that a complete, unambiguous and,
as far as possible, contradiction-free target system is defined (Bender and Gericke 2021).
However, specification sheets can contain several thousand individual requirements written in
natural language and are typically derived in an interdisciplinary manner. Additionally, the
variety of mechatronic products and the complexity of modern systems make distributed and
concurrent development at different aggregation levels of the product development process
indispensable. Typically, the requirements for each level are managed in different documents,
for the overall product in a product requirements document and for the subsystems in
component requirements documents (Göhlich et al. 2021). Therefore, it is not surprising that
errors, e.g., in the form of contradictions, are often found in these documents.
To the best of our knowledge, currently, neither a comprehensive automated contradiction
detection nor an automated quality analysis of industrial specification documents is available.
Our work aims at providing the necessary building blocks for an automated quality assurance
of specification documents. In a previous study (Gärtner et al. 2022), we presented different
types of contradictions that can occur in requirements specifications and developed a method
to classify contradictions. In this context, the question of whether a requirement includes a
condition plays a central role. In fact, conditions give the decisive hint as to whether statements
contradict each other. For example, the two statements "x must be 4" and "x must be 5" are
contradictory unless their conditions state otherwise, e.g. "if y=3" and "if y=4". Therefore, the
detection of conditions and the extraction of their constituents is crucial for reaching the goal of
automated contradiction detection. In this paper, we show how conditions can be
detected automatically and which verbal expressions form the condition. In detail, this paper
aims to:
• Identify all requirements in a specification that contain conditions.
• Identify the verbal expressions that form the condition.
• Evaluate which approach is best suited in this context: self-written rules or machine
learning.
• Create a basis to identify contradictions automatically. According to our previous study,
the conditions and effects must be compared regarding their verbal expressions to find
contradictions (Gärtner et al. 2022).
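The role of conditions in the "x must be 4" / "x must be 5" example can be illustrated with a small sketch. It assumes a strongly simplified representation in which a condition is a single equality on one variable; the class and function names are hypothetical and do not reproduce the classification method of our previous study.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Assignment = Tuple[str, int]  # (variable name, value)


@dataclass(frozen=True)
class Requirement:
    effect: Assignment                      # e.g. ("x", 4) for "x must be 4"
    condition: Optional[Assignment] = None  # e.g. ("y", 3) for "if y=3"; None if unconditional


def conditions_can_coincide(a: Optional[Assignment], b: Optional[Assignment]) -> bool:
    """True if both conditions can hold at the same time (simplified model)."""
    if a is None or b is None:
        return True        # an unconditional requirement always applies
    if a[0] != b[0]:
        return True        # assumption: different variables are independent
    return a[1] == b[1]    # same variable: compatible only for the same value


def contradicts(r1: Requirement, r2: Requirement) -> bool:
    """Different values demanded for the same variable while both conditions may hold."""
    same_variable = r1.effect[0] == r2.effect[0]
    different_value = r1.effect[1] != r2.effect[1]
    return same_variable and different_value and conditions_can_coincide(r1.condition, r2.condition)


# Without conditions, the two statements contradict each other.
print(contradicts(Requirement(("x", 4)), Requirement(("x", 5))))  # True
# With the mutually exclusive conditions y=3 and y=4, they do not.
print(contradicts(Requirement(("x", 4), ("y", 3)), Requirement(("x", 5), ("y", 4))))  # False
```

The sketch shows why the effect alone is not enough: the same pair of effects is contradictory or harmless depending on whether the conditions can be true simultaneously.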
For this purpose, we compare two natural language processing (NLP) techniques: the first one
is a model that consists mainly of grammatical rules that are directly embedded into the source
code. The second one uses a bag-of-words model for machine learning (ML). We, therefore,
elaborate on the terminology, define rules on how to detect conditionals and the related verbal
expressions, and finally present the results of our algorithms on 1,861 "real world"
requirements.
Our main dataset is in German; therefore, the terminology as well as the method are tailored
to German grammar. As Mark Twain notes in The Awful German Language (1880), the
German language is much more complicated than English. In this respect, we found that rules
which work well for requirements written in German can be transferred easily into rules for
requirements written in English, but not vice versa. We demonstrate this by translating a public
English dataset gathered by Fischbach et al. (2021a) into German and analyzing it.
4.3 Terminology
Conditions are sub-statements of sentences that are "essential to the appearance or
occurrence of something else" (Merriam-Webster). In other words, the condition must be true
for the effect to happen. In German, the condition can be formulated either in a subordinate
clause (SC) or in an adverbial (ADVB) within the main clause (Duden 2006; Eisenberg 2016).
SCs are, in German, usually marked by conjunctions such as wenn, falls, sofern (Eng.:
if, when, in case) (1), or by a special word order, the so-called verb-first position (2). The
second case is non-existent in English and is always translated by using one of the mentioned
conjunctions. In the examples below, the conditions are noted in italic:
1. Wenn drei Sekunden vergangen sind, dann muss das Licht ausgehen. Eng.: When
three seconds have passed, the light must go out.
2. Sollten drei Sekunden vergangen sein, dann muss das Licht ausgehen. Eng.: When
three seconds have passed, the light must go out.
Adverbial conditions occur within the main clause without the need for a subordinate clause. It
is important to note that adverbials are defined differently in English. In German, they express
the closer circumstances of an action, a process, or a condition (Duden 2006). They are usually
marked by bei, nach, während, im Falle, and more (Eng.: at, after, while, in case of, and more).
For example: "In case of a message timeout, a message should be sent to the error manager."
Although there is a comma after "timeout", the first part of the sentence is not a subordinate clause.
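The three markers described above (conjunction, verb-first position, adverbial trigger) can be sketched as a purely token-based check. This is a deliberately naive illustration: it works on surface tokens instead of a parse tree, the verb-first list is a tiny assumed sample, the adverbial check only looks at the sentence start, and the third test sentence is our own German rendering of the English example.

```python
import re

SC_CONJUNCTIONS = {"wenn", "falls", "sofern"}           # conjunctions named in the text
ADVB_TRIGGERS = ("bei", "nach", "während", "im falle")  # adverbial triggers named in the text
VERB_FIRST_VERBS = {"sollte", "sollten"}                # assumption: tiny illustrative list


def classify_condition(sentence: str) -> str:
    """Return 'SC', 'ADVB' or 'none' for a German requirement sentence."""
    text = sentence.strip().lower()
    tokens = re.findall(r"\w+", text)
    if not tokens:
        return "none"
    if SC_CONJUNCTIONS.intersection(tokens):
        return "SC"    # subordinate clause introduced by a conjunction
    if tokens[0] in VERB_FIRST_VERBS:
        return "SC"    # subordinate clause marked by verb-first position
    if any(text.startswith(trigger + " ") for trigger in ADVB_TRIGGERS):
        return "ADVB"  # adverbial condition inside the main clause
    return "none"


print(classify_condition("Wenn drei Sekunden vergangen sind, dann muss das Licht ausgehen."))
print(classify_condition("Sollten drei Sekunden vergangen sein, dann muss das Licht ausgehen."))
print(classify_condition("Im Falle eines Timeouts muss eine Nachricht gesendet werden."))
```

A check of this kind cannot distinguish a conditional "bei" from a purely local or temporal one, which is why grammatical information from a parser is needed in practice.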
It is important to note that conditional statements are not statements of causality. The strike
of a match is a cause of the match lighting. The presence of oxygen is a condition for the match
lighting (Broadbent 2008). Confusion often arises, as both conditional and causal statements
can be introduced by "If …, then": "If I strike a match, then the match lights" (causal statement)
and "If oxygen is present, then the match can light" (conditional statement). In requirements
engineering (RE), causal statements would be classified as information rather than as
requirements, as they describe why something happens and not how.
Another research topic of this paper is to identify the verbal expressions that belong to the
condition as well as the verbal expressions that belong to the subject and the verb of the
clause. These terms are also referred to as constituents (words or groups of words that function
as a unit). Unlike Fischbach's terminology (2021a), which we adopted in our previous paper
(Gärtner et al. 2022), we now propose the following terms: variable and action. For example,
in the sentence "If the threshold is reached, the controller must limit the speed decay", we must
first differentiate between the condition ("If the threshold is reached") and the effect ("the
controller must limit the speed decay"), as shown in Figure 4-1. For the condition, "the threshold"
is the variable, and "if … is reached" is the action. For the effect, "the controller" is the variable, and
"must limit the speed decay" is the action. In other words: the variable is the protagonist and the
action is what shall happen.
Figure 4-1: variable and action
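The terminology can be captured in a small data structure. The sketch below is only an illustration of the terms; the class names are hypothetical and the example sentence is the one from the text, with the conjunction left out of the action string for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    variable: str  # the "protagonist" of the clause
    action: str    # what shall happen


@dataclass(frozen=True)
class ConditionalRequirement:
    condition: Clause
    effect: Clause


# "If the threshold is reached, the controller must limit the speed decay"
requirement = ConditionalRequirement(
    condition=Clause(variable="the threshold", action="is reached"),
    effect=Clause(variable="the controller", action="must limit the speed decay"),
)

print(requirement.condition.variable)  # the threshold
print(requirement.effect.action)       # must limit the speed decay
```

Storing condition and effect in the same shape makes it straightforward to compare the variables and actions of two requirements later on.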
4.4 Related Work
In the literature we found several approaches to automatically identify conditions and effects.
While rule-based methods were predominant in the past, machine-learning-based methods
have emerged with increasing computing power (Asghar 2016).
The rule-based approach from Khoo et al. (1998) extracts conditionals using linguistic clues
and pattern-matching, reaching an accuracy of 68 %. It was designed based on prose text from
the Wall Street Journal and not specifically for RE, which has more linguistic
restrictions than a newspaper article. This approach, as well as an approach from Liu et al.
(2021), searches for specific trigger words. Although this is a good approach, especially for
English, in the analysis of our dataset for this paper we found that 17 % of the conditions were
not based on trigger words but on special grammatical forms, as explained in section 4.3.
Águeda and Olivas (2008) present an approach to automatically extract conditionals for search
engine optimization. However, the constraints are too narrow, as "the cause must precede the
effect" (Águeda and Olivas 2008). In RE, but also in general, the effect can precede the cause,
e.g., the engine must shut down, if the engine speed is too high.
In multiple papers, Frattini and Fischbach present different machine-learning-based
approaches for the automatic detection of causalities (Frattini et al. 2022a; Fischbach et al.
2021a; Fischbach et al. 2020) and the automatic detection of their constituents (Fischbach et
al. 2021b). Although their theory is explicitly designed for RE, they do not focus on conditionals,
but on causalities. However, they seem to have looked at both conditionals and causalities,
which could be due to an incorrect definition of the terms, as explained in section 4.3
Terminology. In addition, they have classified sentences as causal that are neither causal nor
conditional, e.g.: The fire is burning down the house. Although the fire is indeed the cause of
the house burning down and there is an implicit causal relationship, the sentence is
grammatically not causal. Furthermore, they define causality as a causal relation that requires
the effect to occur if - and only if - the cause has occurred. However, this definition is not useful
in the RE context, since effects can be triggered by multiple conditions: If the speed exceeds
the threshold, the motor is to be switched off in an emergency and if the torque exceeds the
threshold, the motor is to be switched off in an emergency. They investigated 14,983 sentences
to track the extent and form in which causality occurs. Their approach achieved an accuracy
of 82%.
4.5 Study
In this paper, we compare the performance of two approaches: a rule-based grammatical
model versus two ML-based models. The former applies the rules mentioned in section 4.3
Terminology. The latter are implemented using a bag-of-words model for the document and two
popular classification methods for NLP tasks: Naïve Bayes (NB) and k-Nearest Neighbor (kNN)
(Bramer 2013). The bag-of-words algorithm is used to represent the document, "depicting [it]
as a bag and each vocabulary in the texture as the items in the bag" (Ersoy 2021). The NB
classifier uses probability to find the most likely of the possible classifications. kNN estimates
the classification using the classification of its neighbors, with k representing the number of the
closest instances to consider (Bramer 2013). For this study, it was deemed appropriate to limit
the approach to utilizing these two simple and lightweight methods due to their demonstrated
efficacy in section 4.6 - Results. Hence, the advanced features of more complex methods,
such as GloVe and BERT, including improved contextual comprehension and semantic
representation, were not considered for the current task and were not incorporated into the
analysis.
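How the bag-of-words representation feeds the two classifiers can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the implementation used in the study: the four training sentences are invented English toy data, NB uses add-one smoothing, and kNN uses Euclidean distance on raw word counts.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Represent a sentence by its word counts, ignoring word order."""
    return Counter(text.lower().split())


def train_naive_bayes(samples):
    """samples: list of (text, label). Returns label counts, word counts per label, vocabulary."""
    label_counts = Counter(label for _, label in samples)
    word_counts = {label: Counter() for label in label_counts}
    for text, label in samples:
        word_counts[label].update(bag_of_words(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return label_counts, word_counts, vocabulary


def predict_naive_bayes(model, text):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # class prior learned from the training labels
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word, n in bag_of_words(text).items():
            if word in vocabulary:       # unseen words are skipped
                score += n * math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def predict_knn(samples, text, k=1):
    """Majority label among the k training sentences closest to the query."""
    query = bag_of_words(text)

    def distance(other: Counter) -> float:
        words = set(query) | set(other)
        return math.sqrt(sum((query[w] - other[w]) ** 2 for w in words))

    nearest = sorted(samples, key=lambda s: distance(bag_of_words(s[0])))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Invented toy data: True = conditional, False = non-conditional.
train = [
    ("if the threshold is reached the controller must limit the speed", True),
    ("when three seconds have passed the light must go out", True),
    ("the controller must log every message", False),
    ("the housing must be waterproof", False),
]
model = train_naive_bayes(train)
query = "if the speed is too high the engine must shut down"
print(predict_naive_bayes(model, query), predict_knn(train, query, k=1))
```

Note that the NB score starts from the class prior, so the label distribution of the training data directly influences every prediction, whereas kNN only looks at the closest training examples.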
4.5.1 Method
All 1,861 requirements of our datasets (see section 4.5.2 Data) were manually divided into
conditionals and non-conditionals according to section 4.3. This classification was verified
independently by three persons, all having extensive experience in requirements engineering. The
classification is necessary to train the ML models and to determine the accuracies of both the
ML and the grammatical models.
Our method, as well as the software, which was developed in the form of Python code, is
structured in four parts as shown in Figure 4-2:
Figure 4-2: code structure
1. Data Preprocessing (1): "Data directly taken from the source will likely have
inconsistencies, errors or most importantly, it is not ready to be considered for a data
mining process." (García et al. 2014). Therefore, data reduction techniques must be
applied to remove irrelevant and noisy elements from the dataset, for example by
replacing "<" with "is smaller than".
2. Data Preprocessing (2): "Parsing is the task of analyzing grammatical structures of an
input sentence and deriving its parse tree." (Bojić and Bojović 2017). This task is
sometimes considered to be part of the preprocessing (Gudivada 2018). For the
grammatical model, we used parse trees, also called dependency trees, and Part-of-
Speech (POS) tagging. A parse tree is a graphical representation of the dependencies
between words. POS tagging categorizes words in correspondence with a particular
part of speech, e.g., nouns, verbs, adverbs, conjunctions, etc. The parsing was
outsourced to the dependency parser ParZu, by Sennrich et al. (2013). For the ML
models tokenization was done via countvectorizer, a method to convert text to
numerical data, making a separate tokenizer redundant. While it does not employ
lemmatization techniques, this is not critical for the specific task at hand, as semantic
considerations do not play a crucial role in this classification problem.
3. Identifying conditional requirements: The grammatical model uses grammatical rules,
trigger words, and POS tags to detect conditions, as explained in section 4.3. The ML
models are trained using the labeled German dataset. An 80/20 split was chosen, i.e.,
1,249 requirements for training and 313 requirements for testing. For kNN,
hyperparameter tuning was done via grid search, a technique to determine the
optimal hyperparameter values for a given model. The best results were achieved with
leaf_size = 1, metric = minkowski, n_neighbors = 1, p = 2 and weights = uniform. The results are
discussed in sections 4.6.1 and 4.6.2.
4. Identifying constituents: The final task is to identify the constituents, as explained in
section 4.3. From the POS tags and the parse tree, the verbal expressions associated
with the condition and the effect can be determined, as well as the variables and
actions. We did not use ML for this task, as the dataset is too large to label all variables
and actions manually. The results for the grammatical model are discussed in section
4.6.3.
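The first preprocessing step can be illustrated with a short sketch. Only the replacement of "<" by "is smaller than" is taken from the text; the other table entries, the function name and the example sentence are assumptions made for illustration.

```python
import re

# Only the "<" mapping is taken from the text; the other entries are assumptions.
SYMBOL_REPLACEMENTS = {
    "<=": "is smaller than or equal to",
    ">=": "is greater than or equal to",
    "<": "is smaller than",
    ">": "is greater than",
}


def normalize_symbols(requirement: str) -> str:
    """Replace comparison symbols by words so that a parser sees plain language."""
    # Longer symbols come first so that "<=" is not split into "<" and "=".
    symbols = sorted(SYMBOL_REPLACEMENTS, key=len, reverse=True)
    pattern = "|".join(re.escape(symbol) for symbol in symbols)
    replaced = re.sub(pattern, lambda m: f" {SYMBOL_REPLACEMENTS[m.group(0)]} ", requirement)
    return re.sub(r"\s+", " ", replaced).strip()  # collapse the added whitespace


print(normalize_symbols("If the voltage < 12 V, the charger must stop."))
# If the voltage is smaller than 12 V, the charger must stop.
```

Spelling out the symbols keeps the sentence grammatical, which matters for the subsequent parsing step because a dependency parser expects a verb phrase rather than a mathematical operator.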
4.5.2 Data
We used two different datasets. The first dataset originates from a recent project for electric
buses. In this context, a complete requirements package was derived that describes a modular
system, which shall replace conventional bus powertrains with new, electric powertrains. The
role of requirements in the development process of electric buses is described in Design of
urban electric bus systems (Göhlich et al. 2018). We based our study on 1,561 functional and
non-functional requirements. The specifications are written in German and the examples
discussed in this paper were translated into English.
The second dataset is public, originating from Fischbach (Fischbach et al. 2020). It is an
accumulation of approximately 15,000 requirements written in English from many different
sources available online. From this dataset we used 300 randomly selected requirements,
available online at Swarm-Engineer (2023). For our analysis, we translated them into German
via DeepL. We used this dataset to show that correct results can also be obtained with
non-automotive requirements and an originally English dataset.
4.6 Results
The models were validated with two different datasets. In sections 4.6.1 and 4.6.2 the results
of the questions on how to automatically detect conditional requirements are shown and
discussed. In section 4.6.3 the result for identifying the verbal expressions assigned to the
condition/effect and to the action and variable are shown.
4.6.1 Dataset 1
Out of the 1,561 analyzed requirements, 785 (50.3%) were labeled as conditional by us,
according to section 4.3 Terminology. The labeling was verified by three persons to increase
confidence in the correctness of the labels. For the ML algorithms, an 80/20 split was chosen,
i.e., 1,249 requirements for training and 313 requirements as a test set.
The overall accuracies are:
grammatical model: 99%
NB model: 91%
kNN model: 94%
To measure the effectiveness of the models, we used confusion matrices, as shown in Figure
4-3. They are used for performance measurements for machine learning classification
problems, where the output can be two or more classes (Narkhedem 2018). In these tables,
the different combinations of predicted and reference values can be examined. The
combinations are called true negative (reference: 0; predicted 0), false negative (reference: 1;
predicted: 0), false positive (reference: 0; predicted: 1), and true positive (reference: 1;
predicted: 1).
Figure 4-3: confusion matrices for dataset 1 based on 313 requirements
The grammatical model (Figure 4-3(1)) has only a few false predictions compared to the other
models. It misclassified only 3 requirements, consisting of 1 false positive and 2 false
negatives. The Naïve Bayes model (Figure 4-3(2)) had the biggest difficulties with false
positives, as 23 non-conditional requirements were misclassified as conditionals. The
k-Nearest Neighbor model (Figure 4-3(3)), in contrast, had the biggest difficulties with false
negatives, as 14 conditional requirements were misclassified as non-conditionals.
Although we have results for the grammatical model classifying all 1,561 requirements, for the
ML models we only have results on the test set consisting of 313 requirements. Comparing the
results between 1,561 requirements and 313 would not be fair, which is why we determined
the accuracy of the grammatical model also using the 313 requirements. The accuracies are
calculated as the number of all correct predictions divided by the total number of the dataset:
$\text{accuracy} = \frac{TP + TN}{P + N}$ (1)
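As a worked instance of this formula, the grammatical model misclassified 3 of the 313 test requirements (1 false positive and 2 false negatives), so that the number of correct predictions is the total minus these errors. The subscript is only a label introduced here for readability.

```latex
\text{accuracy}_{\text{grammatical}}
  = \frac{TP + TN}{P + N}
  = \frac{313 - (1 + 2)}{313}
  = \frac{310}{313}
  \approx 0.990
```

This agrees with the reported overall accuracy of 99% for the grammatical model on dataset 1.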
4.6.2 Dataset 2
To show that the algorithm is not overfitted to the main German dataset, it was tested against
a public English dataset. 300 out of approximately 15,000 sentences were randomly selected
and then labeled so that they could now be used as a second test set, as explained before.
The same models trained on the previous dataset were used and no new ML training was
applied since the behavior of the models was to be tested with unknown data from different
fields than automotive. The overall accuracies are:
grammatical model: 92%
NB model: 67%
kNN model: 91%
In contrast to the ML models, the reasons for the poorer accuracy of the grammatical model
(99% with dataset 1 compared to 92% with dataset 2) can be clearly identified: some
requirements were formulated in an inconsistent and complex way. Particularly adverbial
trigger words were used much more liberally compared to the first dataset, causing the model
to incorrectly identify ADVB-conditions.
The confusion matrices for dataset 2 are shown in Figure 4-4. As expected, the accuracies of
all models are lower than for dataset 1. On the one hand, this is because we built our theory
using dataset 1. On the other hand, the requirements in the second dataset mostly do not follow
standard RE documentation guidelines, as described for example in Pohl and Rupp (2021). This
complicates the analysis, as patterns do not exist or cannot be identified. Nevertheless, the
grammatical model and the kNN model were able to achieve solid results. Although the
accuracy of the grammatical model dropped from 99% to 92% and the accuracy of the kNN
model dropped from 94.1% to 91%, the results are still good. Thus, we were able to show that
the models can handle English datasets from other disciplines, although many requirements
were formulated differently than in our first, automotive dataset. The NB model, however,
misclassified many requirements, especially when analyzing non-conditional requirements.
Here, only 66% were correctly classified, while 34% were misclassified. This suggests that the
algorithm was overfitted to the original dataset.
Figure 4-4: confusion matrices for dataset 2 based on 300 requirements
This leads to the following insight: some ML approaches, e.g. Naïve Bayes, are not well suited
for our kind of problem, as the data that one wants to use in practice must correspond to the
shape of the data of the training set. Especially important in this respect is the distribution of
the labels True and False. In the German training dataset, there was a distribution of about 1
True to 1 False. Naturally, the German test set had the same distribution, which is why the
accuracy is high, see Figure 4-3(2). The English test set though had a distribution of
approximately 1 True to 10 False. The algorithm, however, expects a distribution of 1/1 and
includes this in its classification. This can be seen in the result of Figure 4-4(2). More
conditionals were recognized than there actually are: namely 114 (95 + 19) True predictions
instead of 22 (3 + 19) True references.
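The figures quoted for the NB model on dataset 2 can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text (19 true positives, 95 false positives, 3 false negatives, 300 requirements) and derives the remaining values; the function and key names are illustrative.

```python
def confusion_metrics(tp: int, fp: int, fn: int, total: int) -> dict:
    """Derive the true negatives and summary rates from a binary confusion matrix."""
    tn = total - tp - fp - fn
    return {
        "tn": tn,
        "accuracy": (tp + tn) / total,
        "true_negative_rate": tn / (tn + fp),  # share of non-conditionals classified correctly
        "predicted_true": tp + fp,
        "reference_true": tp + fn,
    }


# Counts for the NB model on dataset 2 as stated in the text.
nb = confusion_metrics(tp=19, fp=95, fn=3, total=300)
print(nb["tn"])                            # 183
print(round(nb["accuracy"], 2))            # 0.67
print(round(nb["true_negative_rate"], 2))  # 0.66
print(nb["predicted_true"], nb["reference_true"])  # 114 22
```

The derived values match the reported 67% accuracy and the 66% of non-conditional requirements that were classified correctly, and they make the over-prediction of the True class explicit.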
The kNN approach looks at the classification of its nearest neighbor, which is why this ML
algorithm is not as biased toward the original label distribution. The grammatical model also
ignores the original label distribution, as it considers a fixed rule set.
4.6.3 Detection of constituents
Another research point of this paper is to identify the verbal expressions that are linked to the
condition and the effect, as well as the verbal expressions that are linked to the variable and
action, as explained in section 4.3. In this case, we did not use ML as a comparison. First, it is
very time-consuming to label a training set accordingly. Second, and much more important, if
the basis (condition-recognition) is correct, see Figure 4-2 steps 1-3, we can accurately match
all verbal expressions to the condition, using a fixed rule set. This problem is not complex
enough to expect a better solution with ML. In the previous problem of condition recognition,
in contrast, we could not be sure that the grammatical model would be better than ML, because
many variables are involved and ML could have found a better way, which it ultimately did not.
The results for three exemplary sentences of the first dataset can be seen in Figure 4-5: correct
verbal expression assignments of an SC-condition and an ADVB-condition as well as an
unsuccessful assignment. Figure 4-5(1) shows the correctly determined SC and the resulting
assignment of the verbal expressions that correspond to the condition. Figure 4-5(2) shows
that, while no SC was detected, the ADVB was correctly determined and the verbal expressions
that correspond to the condition were correctly assigned. Figure 4-5(3) shows a requirement
that could not be processed by our algorithm because the parser had difficulty interpreting the
input: For unknown reasons, it detected a main clause with the verb must and a subordinate
clause with the alleged verb transfer. Therefore, the output resulted in a completely wrong
assignment. The correct interpretation would be that must transfer is a compound main clause
verb and thus no subordinate clause prevails as a condition.
Figure 4-5: condition/effect verbal expression detection
Figure 4-6 shows the results for variable and action detection. The first two examples show
correct verbal expression assignments, while the third one shows a failed assignment. The
correctness of the assignments is directly linked to the correctness of the assignments of the
verbal expressions for condition and effect, see Figure 4-5 - and also fails for the same
reasons.
Figure 4-6: action/variable verbal expression detection
4.6.4 Discussion of Validity
Descriptive validity (Maxwell 1992) deals with the risk of not having remained objective when
conducting a study. Although we defined conditions based on fixed criteria, there may have
been inconsistencies when labeling the dataset. We have tried to reduce this risk by having
three persons review the labeling of the dataset. In addition, the grammatical model has an
advantage here, because a different understanding of a condition can immediately be
implemented, whereas for ML everything would have to be re-labeled and re-trained.
Generalizability (Maxwell 1992) is given when the results can be applied to situations
outside the present research; a lack of it is a threat to the study's validity. We
addressed this by testing and validating our method with a non-automotive dataset. Based on
the results, we believe that other industries write requirements in a similar way to the automotive
industry. This is also supported by Göhlich et al. (2021) who found that processes to manage
requirements and specifications do not differ significantly with regard to the industrial context.
However, further testing should be conducted in the future to verify its applicability in different
industries.
4.7 Summary, Conclusion and Outlook
In this paper, we have provided a building block for how to make requirements engineering
(RE) and requirements management intelligent using automated methods. Specifically, we
made a proposal on how to automatically detect conditionals and how they occur in RE. In
section 4.3, we elaborated on the terminology and on how to detect conditional sentences.
Furthermore, we laid the foundation to identify the verbal expressions respectively associated
with the variables and actions. In section 4.5, we presented the results on 1,861 requirements
while comparing a grammatical model with two machine learning (ML) models.
We found that in structured texts, such as usually found in specifications, grammatical models
are well suited for identifying conditionals and their constituents. Grammatical models show
results that are better than, or at least similar to, those of ML approaches. Some ML algorithms, e.g., Naïve Bayes,
are not well suited for our kind of problem. For example, for this algorithm, the data that one
wants to use in practice must correspond to the shape of the data of the training set.
Furthermore, every dataset used for ML methods must first be labeled, which can be very time-
consuming and prone to human errors. Therefore, we conclude that the grammatical model is
preferable as the rules can be tracked and easily adjusted if needed, for example, by changing
trigger words. It is important to note that this is not the case for every natural language
processing problem. These findings do not apply to unstructured text, such as in newspapers
or books. In RE, however, there seem to be certain explicit or implicit rules - depending on the
industry - according to which sentences are formulated, which massively reduces the
complexity of the problem and makes machine learning redundant.
Complete and error-free requirements specifications are crucial for a good product design.
Some specifications contain several thousand individual requirements. Automation would
therefore result in an enormous leap in the manageability of such large
datasets. Moving in this direction, this paper creates the possibility to identify contradictions
between two requirements automatically. The approach is that if two requirements have the
same condition and, in the effect, the same variables but different actions, this indicates
a contradiction (Gärtner et al. 2022). This research contributes a building block for this
approach by identifying these corresponding verbal expressions. Such a method could, for
example, help developers to identify critical requirements already during the design process or
even in later stages like the review phase. This will be elaborated further in a future paper.
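The comparison rule described in this paragraph can be illustrated with a minimal sketch. It assumes that the verbal expressions for condition, variable, and action have already been extracted; the `Requirement` class, the `normalize` helper, and the exact string matching are hypothetical simplifications, not the implementation used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """Hypothetical container for the verbal expressions of one requirement."""
    condition: str  # e.g. "the customer wishes" ("" if unconditional)
    variable: str   # variable of the effect, e.g. "the car"
    action: str     # action of the effect, e.g. "must be red"


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def is_contradiction_candidate(a: Requirement, b: Requirement) -> bool:
    """Same condition, same effect variable, different action."""
    return (
        normalize(a.condition) == normalize(b.condition)
        and normalize(a.variable) == normalize(b.variable)
        and normalize(a.action) != normalize(b.action)
    )


r1 = Requirement("the customer wishes", "the car", "must be red")
r2 = Requirement("the customer wishes", "the car", "must be blue")
r3 = Requirement("the car has four doors", "the car", "must be blue")

print(is_contradiction_candidate(r1, r2))  # True: only the action differs
print(is_contradiction_candidate(r1, r3))  # False: the conditions differ
```

A pair flagged in this way is only an indication: two different actions on the same variable need not exclude each other, so a flagged pair would still have to be assessed.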
4.8 Author Contributions
Conceptualization, A.E.G., D.G. and T.-A.F.; methodology, A.E.G.; validation, A.E.G.; formal
analysis, A.E.G.; investigation, A.E.G.; resources, A.E.G.; data curation, A.E.G., D.G. and
T.-A.F.; writing - original draft preparation, A.E.G.; writing - review and editing, D.G. and T.-A.F.;
supervision, D.G. and T.-A.F. All authors have read and agreed to the published version of the
manuscript.
4.9 Acknowledgments
We thank IAV GmbH for providing us with the requirements specification on electric buses.
5 Automated Requirement Contradiction Detection through formal
logic and LLMs
This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as
long as you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons licence, and indicate if changes were made
(http://creativecommons.org/licenses/by/4.0/). The content of this chapter was published:
Authors: Alexander Elenga Gärtner and Dietmar Göhlich
Publisher: Springer Automated Software Engineering
DOI: https://doi.org/10.1007/s10515-024-00452-x
Published: 06 June 2024
Conceptualization, A.E.G. and D.G.; methodology, A.E.G.; implementation, A.E.G.;
validation, A.E.G.; resources, A.E.G.; data curation, A.E.G.; writing - original draft
preparation, A.E.G.; writing - review and editing, D.G.; supervision, D.G. All authors have
read and agreed to the published version of the manuscript.
5.1 Abstract
This paper introduces ALICE (Automated Logic for Identifying Contradictions in Engineering),
a novel automated contradiction detection system tailored for formal requirements expressed
in controlled natural language. By integrating formal logic with advanced large language
models (LLMs), ALICE represents a significant leap forward in identifying and classifying
contradictions within requirements documents. Our methodology, grounded on an expanded
taxonomy of contradictions, employs a decision tree model addressing seven critical questions
to ascertain the presence and type of contradictions. A pivotal achievement of our research is
demonstrated through a comparative study, where ALICE's performance markedly surpasses
that of an LLM-only approach by detecting 60% of all contradictions. ALICE achieves a higher
accuracy and recall rate, showcasing its efficacy in processing real-world, complex
requirement datasets. Furthermore, the successful application of ALICE to real-world datasets
validates its practical applicability and scalability.
This work not only advances the automated detection of contradictions in formal requirements
but also sets a precedent for the application of AI in enhancing reasoning systems within
product development. We advocate for ALICE's scalability and adaptability, presenting it as a
cornerstone for future endeavors in model customization and dataset labeling, thereby
contributing a substantial foundation to requirements engineering.
5.2 Introduction
Requirements are crucial in product development as they serve as a basis for communication
among stakeholders, teams, and companies (Loucopoulos 2005; Gericke and Blessing 2012).
Specification sheets are commonly used to capture project requirements, aiming for a
complete, unambiguous, and contradiction-free system definition (DIN DIN 69901-5:2009-01;
Bender and Gericke 2021). However, these sheets often contain thousands of requirements
in natural language, derived from interdisciplinary collaboration, typically organized and
managed in separate documents for different levels of the product (VDI-Guideline VDI 2221
Blatt 1:2019-11). The complexity of modern systems and the need for distributed and
concurrent development across various levels further complicate the process, which is why the
requirements are typically organized and managed in multiple documents (Göhlich and Fay
2021b). As a result, errors, including contradictions, are frequently found in these documents.
Currently, there is a lack of comprehensive automated tools for detecting contradictions and
performing quality analysis on industrial specification documents.
Product development is a complex and dynamic process that requires proper management
and understanding of the requirements. Multiple studies have demonstrated the critical
importance of high-quality requirements in the success of development projects. In the realm
of requirements quality research, the attention is centered on individual characteristics of
requirements, such as completeness, complexity, ambiguity, and consistency (Montgomery et
al. 2022a; IEEE/ISO/IEC 29148-2018). This paper focuses on addressing inconsistencies in
the form of contradictions, which can potentially cause misunderstandings and product defects.
Natural Language Inferencing (NLI) is a commonly encountered problem in the field of natural
language processing (NLP), in which the objective is to determine the nature of the relationship
between a premise and a hypothesis, both of which are represented as sentences (Jang et al.
2020). In recent years, NLP has become more sophisticated, with machine learning models
capable of handling complex tasks such as question answering, text extraction, and sentence
generation. It was shown that extending NLI to manage sentence relationships could have
significant implications for scientific text analysis during product development (Ritter et al.
2008).
We emphasize formal requirements written in natural language that adhere to specific
formulation guidelines, particularly using single-sentence requirements that follow the principle
of atomicity. For example, employing standard sentence templates for requirements can
ensure adherence to these guidelines. Such templates are widely used across different
industries and sectors involved in hardware, software, or system development, especially for
lower system levels (Dick 2017; Sophist GmbH 2016; Wiegers and Beatty 2013; Robertson
and Robertson 2013). This approach facilitates consistency and minimizes errors arising from
linguistic nuances.
Various solutions have been proposed to tackle this challenge, including knowledge graphs
and thesauri. Knowledge graphs are formal representations of concepts, ideas, or objects
about a specific domain and all the relationships between those concepts. Thesauri group
together words with similar (synonyms) and opposite (antonyms) meanings (Ahmad et al.
2020). While these solutions can be helpful in identifying contradictions in requirements, they
have limitations in their practicality, as we will discuss further in Section 5.4.
Building upon our foundational work (Gärtner et al. 2022), which introduced a taxonomy for
detecting contradictions and proposed a semi-automated approach, we made further strides
in automation (Gärtner et al. 2023). Our contribution involved developing a method to identify
conditions and other sentence components within requirements. This advancement builds
upon our previous framework, emphasizing our ongoing commitment to enhancing the
automated detection of contradictions in requirements engineering.
Our current research introduces ALICE (Automated Logic for Identifying Contradictions in
Engineering). This system synergizes the capabilities of formal logic with large language
models (LLMs) to automatically detect and resolve contradictions between two requirements.
Formal logic is a branch of mathematics that deals with the precise rules and structures for
reasoning and inference. On the other hand, LLMs are artificial intelligence (AI) models that
use machine learning algorithms to learn patterns and relationships between tokens in large
natural language datasets. We will elaborate further on this in Section 5.3.2 and explain why
LLMs are needed to detect contradictions. Combining these two solutions offers a unique
approach to identifying and resolving contradictions automatically, leveraging the capabilities
of formal logic and the data-driven approach of LLMs to identify inferences in natural language.
Through this research, we aim to provide an understanding of the strengths and limitations of
this approach and its potential applications in product development. We evaluate the
effectiveness using real requirements specifications, demonstrating its potential and paving
the way for future optimization.
This paper significantly contributes to automated contradiction detection in requirements
expressed in controlled natural language (Schwitter 2010). Such requirements are particularly relevant in the
automotive domain and related fields, as sentence templates are standard for formulating
requirements (Dick 2017; Sophist GmbH 2016). However, it is essential to note that formal
requirements have different definitions. The requirements we consider are more natural and
intuitive than those discussed in the related work (see section 5.4.2).
To establish the groundwork for this research, we built upon our previous work (Gärtner et al.
2022), which proposed six essential questions for identifying contradictions. In this paper, we
implemented these questions and improved their effectiveness by modifying and expanding
them, resulting in a comprehensive set of seven categorized inquiries. These refined inquiries
serve a dual purpose: firstly, to determine the presence of contradictions in the requirements,
and secondly, to identify the specific types of contradictions that may be present.
We provide a detailed, step-by-step explanation of our implementation and discuss the
evaluation using real-world requirements.
5.3 Fundamentals
There are several ways to identify contradictions between sentences without human
intervention. In the following section, we clarify the term contradiction, discuss two prominent
methods (formal logic and machine learning), and explain why we chose a combination of
both.
5.3.1 Contradictions
Contradictions are statements that conflict with one another. However, the definition is not
intuitively clear, so it must be explicitly defined.
Different definitions of contradictions can be found in the literature on requirements
engineering (RE). To get the most generically valid and scientifically accepted definition, we
build our theory on the logical philosophy of Aristotle (Aristoteles 1986), as shown in Gärtner
et al. (2022). The foundation of his logic (also known as term logic, traditional logic, or formal
logic), developed in his work Metaphysics, is the law of non-contradiction (LNC) (Horn 2018).
Derived from this, Figure 5-1 shows different types of RE-relevant contradictions.
Figure 5-1: Contradictions relevant to RE
In alignment with the interpretation of Karlova-Bourbonus (2019), contraries are considered
in addition to contradictories. Although both fall under the term contradiction, they are not
synonymous and must be clearly distinguished.
Contradictory opposites, such as 'he is sick' and 'he is not sick', are mutually exclusive;
one must be true if the other is false, without overlap.
Contrary opposites, like 'it is black' and 'it is white', cannot be true at the same time but
can both be false, indicating they are not exhaustive.
Subaltern relations follow a logical step-down: if 'everybody is sick' is true, then 'some people are
sick' must also be true.
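The three relations can be told apart by checking in which situations both statements hold or fail. The sketch below is an illustration under simplifying assumptions: each statement is modelled as a Python predicate over a small, hand-picked set of possible "worlds", and the function name `classify_opposition` is hypothetical.

```python
from itertools import product


def classify_opposition(p, q, worlds):
    """Classify two propositions by checking them in every possible world."""
    worlds = list(worlds)
    both_true = any(p(w) and q(w) for w in worlds)
    both_false = any(not p(w) and not q(w) for w in worlds)
    p_implies_q = all(q(w) for w in worlds if p(w))

    if not both_true and not both_false:
        return "contradictory"  # exactly one of the two holds in every world
    if not both_true:
        return "contrary"       # never both true, but both can be false
    if p_implies_q:
        return "subaltern"      # whenever p holds, q holds as well
    return "no opposition"


# 'he is sick' vs. 'he is not sick': a world is simply True (sick) or False.
print(classify_opposition(lambda sick: sick, lambda sick: not sick, [True, False]))

# 'it is black' vs. 'it is white': a world is the colour of the object.
colours = ["black", "white", "red"]
print(classify_opposition(lambda c: c == "black", lambda c: c == "white", colours))

# 'everybody is sick' vs. 'some people are sick': a world assigns a sick flag
# to each of three people.
populations = list(product([True, False], repeat=3))
print(classify_opposition(all, any, populations))
```

The three calls print `contradictory`, `contrary`, and `subaltern`, respectively. The set of worlds has to be chosen by hand here, which is exactly the kind of world knowledge that is hard to encode for real requirements.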
In practice, identifying contradictions in natural language can be challenging. Based on
empirical evidence, it is apparent that only a small number of contradictions observed in real-
world situations exhibit explicit characteristics. 'Rather, contradictions make use of a variety of
natural language devices […]. The most sophisticated kind of contradictions, the so-called
implicit contradictions, can be found only when applying world knowledge and after conducting
a sequence of logical operations' (Karlova-Bourbonus 2019) such as 'the car must be as fast
as possible' and 'the car must be as fuel efficient as possible.' Those familiar with physical
rules know that the faster a car drives, the more fuel it consumes. In this study, however, we
focus solely on explicit (prototypical) (Marneffe, Rafferty, Manning 2008) contradictions, such
as 'The car is fast' and 'The car is slow,' without conducting deep meaning processing.
5.3.2 Theoretical Background for NLP
In the following section, we will explore formal logic as a subset of symbolic AI and Large
Language Models (LLMs) as a subset of machine learning (ML).
5.3.2.1 Formal Logic
Symbolic artificial intelligence represents intelligence using abstract symbolic representations
and logical rules, which are the core elements of formal logic. Unlike earlier approaches aimed
at imitating human thought processes (Newell and Simon 1956), the symbolic AI paradigm
sought to replicate human knowledge and understanding at a conceptual level.
Formal logic plays a crucial role in symbolic AI, as it encompasses a wide range of applications,
including handling simple mathematical tasks, performing string comparisons, and extending
to more complex symbolic reasoning and logical operations such as rule-based reasoning,
knowledge representation, expert systems, and automated theorem proving. Tasks such as
parsing sentences, planning responses, reasoning about meaning, and building expert
systems have relied on symbolic problem-solving techniques (Jurafsky and Martin 2019).
It is essential to acknowledge that symbolic AI, which relies on formal logic, has achieved
significant milestones. However, it faces limitations when dealing with complex, ambiguous,
and uncertain problems (Russell and Norvig 2016). It struggles with tasks that require fluid and
intuitive thinking, perceiving subtle distinctions, and generalizing knowledge across domains:
it would have difficulties determining whether something could be fast and big simultaneously
but not fast and slow.
While rule-based systems alone cannot match human intelligence, they provide key
capabilities such as logical reasoning, explanation, and abstraction integral to human cognition
(Johnson-Laird 2006; Miller 2019).
5.3.2.2 Machine Learning
Machine learning (ML) is an approach to AI that uses data-driven techniques to train models
that can learn and make predictions. Unlike symbolic AI, ML algorithms learn patterns and
relationships from large amounts of data, enabling the systems to adapt to new information
and make accurate predictions in real-world scenarios.
Classical ML models excel in specific tasks like image classification, speech recognition, or
natural language processing. However, they may struggle to identify contradictions across
diverse contexts, as their knowledge is confined to the data used during training (Goodfellow
et al. 2016; Kim 2018). Therefore, even these models would have difficulty determining the
relationships between something being fast and big or fast and slow simultaneously.
On the other hand, LLMs like GPT (openai 2020) and LLaMA (Touvron et al. 2023) are pre-
trained on extensive datasets, allowing them to learn general language patterns and
relationships. This broad knowledge enables LLMs to better understand context and
semantics, which is crucial for identifying contradictions (Surana et al. 2022).
They can indeed identify that something could be fast and big simultaneously but not fast and
slow. This is why, in this work, we use LLMs to detect contradictions.
5.4 Related Work
Previous research in requirements management has proposed several methods for tackling
inconsistencies between requirements. The following sections discuss taxonomies, NLP-
based conflict detection, and ontology-based conflict detection approaches.
5.4.1 Classifying Conflicts
In our foundational work (Gärtner et al. 2022), we introduced a nuanced taxonomy for detecting
contradictions based on Aristotle's Law of Non-Contradiction, recognizing the need for a
systematic approach tailored to the intricacies of requirements engineering. This classification
emerged from a rigorous analysis of contradictions commonly observed in requirements
documents, ensuring a methodical and scientifically grounded approach rather than an
arbitrary categorization. By adopting this classification, we aimed to bridge the gap between
theoretical logic and practical application in software development, ensuring our methodology
is grounded in logical rigor and applicable to the demands of modern engineering projects.
The proposed method categorizes contradictions into distinct types: contradictories, contraries,
and subalterns, which fall into three subcategories: Simplex-, Idem-, and Alius-, as shown in
Figure 5-2. The decision to focus on these subtypes was driven by their prevalence in real-
world requirements and unique challenges in automated detection and resolution.
Simplex contradictions (from Latin ‘simple’) are characterized by direct opposition without
conditional statements. These contradictions are straightforward but crucial for establishing
our classification framework. For example, ‘The car must be red’ versus ‘The car must be blue’
showcases an apparent, uncomplicated contradiction.
Idem contradictions (from Latin ‘same’) involve identical conditions leading to contradictory
outcomes, presenting challenges due to their conditional nature. An example is ‘If the customer
wishes, the car must be red’ and ‘If the customer wishes, the car must be blue’, where the
same condition yields conflicting requirements.
Alius contradictions (from Latin ‘different’), distinguished by differing conditions that result in
incompatible conclusions, illustrate the complexity of engineering requirements. An instance
of this is ‘If the customer wishes, the car must be red’ versus ‘If the car has four doors, the car
must be blue’, demonstrating how different conditions can lead to contradictory outcomes.
These categories (Simplex, Idem, and Alius) were chosen to address the varied and complex
nature of contradictions in real-world requirements. By providing a clear structure for identifying
and analyzing these contradictions, our approach enhances the capability of automated tools
to manage these challenges.
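The distinction between the three subcategories depends only on the conditions of the two requirements, which the following minimal sketch illustrates. The function `condition_subtype` and its exact string comparison are hypothetical; in particular, treating a pair of one conditional and one unconditional requirement as Alius is an assumption made here for completeness, not a rule stated in the taxonomy.

```python
from typing import Optional


def normalize(text: Optional[str]) -> str:
    """Lower-case and collapse whitespace; a missing condition becomes ''."""
    return " ".join((text or "").lower().split())


def condition_subtype(condition_a: Optional[str], condition_b: Optional[str]) -> str:
    """Return the subcategory implied by the conditions of two requirements."""
    a, b = normalize(condition_a), normalize(condition_b)
    if not a and not b:
        return "Simplex"  # neither requirement has a condition
    if a == b:
        return "Idem"     # both requirements share the same condition
    return "Alius"        # conditions differ (assumption: also the mixed case)


print(condition_subtype(None, None))
print(condition_subtype("If the customer wishes", "if the customer wishes"))
print(condition_subtype("If the customer wishes", "If the car has four doors"))
```

For the three example pairs above, this yields `Simplex`, `Idem`, and `Alius`. Whether the effects of the two requirements actually conflict is a separate question that this check does not answer.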
Figure 5-2: Nine types of contradictions for RE (Gärtner et al. 2022)
Different types of contradictions occur in varying numbers, have varying levels of project
impact, and require different detection approaches. As a result, our previous study defines
questions that can be used to identify these types of contradictions. This standardized solution
is the basis for the fully automated contradiction detection presented in Section 5.5.
Guo et al. (2021) suggest classifying conflicts into three fundamental types: inconsistencies,
inclusions, and interlocks, each of which can be further subdivided into seven subcategories.
Although this approach shows potential, the authors do not provide a clear method for
identifying these categories.
Wu et al. (2022) categorize contradictions into six types: negation, antonym, replacement,
switch, scope, and latent. In this classification, negation, identified through negative words
like 'no,' 'not,' 'never,' and 'neither - nor', aligns with what we define as contradictory. Both
antonyms and replacements fit within our category contraries. The concept of scope, which
involves either narrowing or expanding the scope expressed in a sentence, corresponds to our
subset category. However, switch is not a contradiction in the context of RE, exemplified by
the sentences 'Sally sold a boat to John' and 'John sold a boat to Sally'. This scenario does
not necessarily indicate a contradiction, as it is possible for Sally to sell a boat to John while
John simultaneously sells his boat to Sally. Lastly, latent contradictions, which are not yet
developed, fall outside this paper's focus as we concentrate on explicit contradictions.
5.4.2 Natural Language Processing for Detecting Contradictions
On a general note, Zhao et al. (2021) report that most studies (67.08%) in the domain of NLP
combined with RE consist of solution proposals evaluated through laboratory experiments or
example applications. In contrast, only a tiny percentage (7%) of these studies undergo
evaluation in industrial settings, resulting in insufficient practical validation.
Marneffe et al. (2008) suggest a definition for contradictions in NLP tasks and analyze the
task's performance. However, their focus is on implicit rather than explicit contradictions,
aligning closely with the switch-contradictions delineated by Wu et al. in the preceding section.
Li et al. (2017) state that traditional context-based word embedding learning algorithms are
ineffective for mapping contrasting words. To address this issue, they developed a neural
network to learn contradiction-specific word embeddings that can separate antonyms.
However, their emphasis is not on explicit contradictions, e.g.: 'Some people and vehicles are
on a crowded street' and 'Some people and vehicles are on an empty street' (Li et al. 2017).
While these sentences contrast with each other, they might still simultaneously be correct.
Other approaches with the same goal as Li et al. utilize WordNet and Thesaurus to obtain
additional antonyms and synonyms as semantic constraints (Chen et al. 2015; Liu et al. 2015).
[Figure 5-2 diagram: the law of non-contradiction for RE branches into Contradictory, Contrary, and Subaltern, each subdivided into Simplex, Idem, and Alius.]
These methods might be helpful for fine-grained optimization, as we will mention in Section
5.7.
Ritter et al. (2008) present an approach for automatic contradiction detection. However, their
approach does not address explicit contradictions in the context of RE, so their technique
cannot be directly applied to our case. Nevertheless, it may prove helpful for future
optimization.
Heitmeyer et al. (1996) introduce a technique called consistency checking for detecting errors
in requirements specifications using formal logic. Their paper analyzes formal requirements,
called SCR, and addresses issues like type errors, nondeterminism, missing cases, and
circular definitions.
However, the requirements are not written using everyday language. Instead, they are
expressed in a highly structured manner, where each requirement component has a
predetermined category that it must be placed in during the writing process. Our method aims
to handle requirements written more intuitively and naturally while still being formal.
Gervasi and Zowghi (2005) explore using formal logic and natural language parsing techniques
to identify inconsistencies in requirements. The authors propose a method that automatically
discovers and addresses these logical contradictions using theorem-proving and model-
checking techniques. By translating the requirement into a set of logic formulae, they can
detect contradictions such as a formula and its negation. Hunter and Nuseibeh (1998) suggest
utilizing formal logic, which records and tracks the information, enabling the tracking of
inconsistent information by propagating labels and their associated data. While they do not
only focus on negating assumptions like Gervasi and Zowghi, Hunter and Nuseibeh can only detect conflicts
if at least a part of the argument is negated, such as and . Therefore, both Gervasi and Zowghi and
Hunter and Nuseibeh cover an essential part of the issue addressed in this paper, namely
contradictories (see Figure 5-2). However, they do not consider contraries and subalterns,
such as 'the car must be red' and 'the car must be blue' or 'it must be under 20' and 'it must be
under 10'.
Other research papers that use NLP focus on categorizing requirements, such as
distinguishing between functional and non-functional requirements (Kurtanovic and Maalej
2017), addressing security-related requirements (Jindal et al. 2016) or finding contradictions
in prose texts (Karlova-Bourbonus 2019; Sepúlveda-Torres et al. 2021b).
5.5 Automation Method
The combination of formal logic and LLMs leverages their strengths to identify contradictions
between statements. Formal logic offers a formal framework for reasoning about relationships
between statements. At the same time, the LLM's world knowledge can be
queried with specific prompts that guide ALICE in identifying contradictions. The prompt engineering
process involves designing and refining prompts to elicit the desired behavior from the
model. This combination enables the model to perform logical inference and identify
contradictions in a scalable and flexible manner. The resulting system can be used to evaluate the
consistency of arguments in natural language text. While we cannot provide a comprehensive
account of all the detailed procedures and optimizations used in the code, we will explain the
relevant steps in this context in the following sections.
5.5.1 Method Fundamentals
In this section, we will delve into the core components of our approach, including a decision
tree, formal logic, and GPT prompting.
5.5.1.1 Decision Tree
In our previous work, we implemented a classifier system to identify if and what type of
contradiction is present. In the present paper, we designed this system as a decision tree, as
shown in Figure 5-3. Here, seven questions must be answered to determine the specific type
of contradiction. The first four questions refer to the effects of the requirements, and the
following three questions refer to the condition, if any. One notable advantage of this method
is its modular nature, allowing easy adaptability to future advances in the field. Therefore, if
more efficacious methodologies for resolving the questions are developed subsequently, their
integration would be unproblematic. Based on these seven questions, we show how ALICE
leads to the desired goal and discuss the advantages of each approach (ALICE or LLMs-only)
in Section 5.6.
Figure 5-3: Modular decision tree for contradiction detection in RE
5.5.1.2 Formal Logic
To detect contradictions, we need the constituents, i.e., condition and effect, as well as variable
and action, as seen in Figure 5-4. This analysis is pivotal, establishing the core structure of our
method to address the seven questions proposed by our model effectively.
Take the requirement 'If the threshold is reached, the controller must limit the speed decay' as
an example. The segments 'If the threshold is reached' and 'the controller must limit the speed
decay' serve as the condition and the resultant effect, respectively. Within the condition, 'the
threshold' represents the variable, and 'is reached' the action. Similarly, within the effect, 'the
controller' is the variable and 'must limit the speed decay' the action. This is shown in Figure
5-4.
In terms of implementation, this step involves utilizing symbolic AI using a grammatical model
that incorporates grammatical rules, trigger words, and POS tags to detect conditions, as
shown in (Gärtner et al. 2023).
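As a rough illustration of this decomposition, the following sketch splits a requirement at a leading trigger word and the first comma, and then splits each part at the first verb-like token. It is a deliberately simplified stand-in: the grammatical model described above relies on POS tags and grammatical rules, whereas the word lists `TRIGGERS` and `VERB_STARTS` and all function names here are assumptions chosen only to cover the example sentence.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical word lists; a real grammatical model would use POS tags instead.
TRIGGERS = ("if", "when", "while")
VERB_STARTS = ("must", "shall", "is", "are", "exceeds", "has")


class Clause(NamedTuple):
    variable: str
    action: str


class Requirement(NamedTuple):
    condition: Optional[Clause]
    effect: Clause


def split_clause(text: str) -> Clause:
    """Split a clause into variable and action at the first verb-like token."""
    pattern = r"\b(" + "|".join(VERB_STARTS) + r")\b"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return Clause(variable=text.strip(), action="")
    return Clause(text[: match.start()].strip(), text[match.start():].strip())


def parse_requirement(sentence: str) -> Requirement:
    """Separate condition and effect, then variable and action in each."""
    sentence = sentence.strip().rstrip(".")
    trigger = r"^(" + "|".join(TRIGGERS) + r")\s+(.*?),\s*(.*)$"
    match = re.match(trigger, sentence, flags=re.IGNORECASE)
    if match is None:
        # No trigger word: the whole sentence is treated as the effect.
        return Requirement(condition=None, effect=split_clause(sentence))
    return Requirement(split_clause(match.group(2)), split_clause(match.group(3)))


req = parse_requirement(
    "If the threshold is reached, the controller must limit the speed decay"
)
print(req.condition)  # Clause(variable='the threshold', action='is reached')
print(req.effect)     # Clause(variable='the controller', action='must limit the speed decay')
```

Splitting at the first comma is fragile for nested or comma-rich sentences, which is one reason a rule set based on dependency trees is preferable in practice.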
It is important to note that conditional statements are not statements of causality. The presence
of oxygen, for example, is a condition for a match to light, whereas striking the match is the
cause. The lighting of the match is called the effect.
For the contradiction detection presented in this paper, we further extended this logic by
incorporating logical and mathematical rules, including string similarity comparisons and
mathematical operations.
Figure 5-4: Showcase of Condition and Effect, as well as Variable and Action in a formal requirement
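To make the role of such mathematical rules and string comparisons more concrete, the sketch below checks whether one numeric range lies inside another, which is the kind of comparison needed for pairs such as 'x must be between 10 and 20' and 'x must be between 12 and 15'. The parsing patterns and the function names (`parse_range`, `is_subrange`, `similarity`) are hypothetical and cover only the 'between' and 'less than' phrasings; they are not taken from the ALICE code. String similarity is illustrated with the standard library's `difflib.SequenceMatcher`, again as an assumption about one possible measure.

```python
import math
import re
from difflib import SequenceMatcher
from typing import Optional, Tuple

Range = Tuple[float, float]
NUMBER = r"(-?\d+(?:\.\d+)?)"


def parse_range(action: str) -> Optional[Range]:
    """Turn a numeric action phrase into a (lower, upper) interval."""
    between = re.search(rf"between {NUMBER} and {NUMBER}", action)
    if between:
        return float(between.group(1)), float(between.group(2))
    less = re.search(rf"less than {NUMBER}", action)
    if less:
        # Simplification: strict and non-strict bounds are not distinguished.
        return -math.inf, float(less.group(1))
    return None


def is_subrange(inner: Range, outer: Range) -> bool:
    """True if every value allowed by `inner` is also allowed by `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two variable names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


wide = parse_range("must be between 10 and 20")
narrow = parse_range("must be between 12 and 15")
print(is_subrange(narrow, wide))   # True
print(is_subrange(wide, narrow))   # False
print(similarity("the signal Chrg", "The signal Chrg"))  # 1.0
```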
5.5.1.3 GPT
To leverage the power of LLMs while addressing the unpredictability of their responses, we
adopted a frozen model approach using GPT3. Utilizing this version ensured no updates would
be applied to the model during our experiments, providing consistency in its behavior. The later
models, such as GPT3.5 and GPT4, are subject to constant updates by OpenAI (openai
2023a). For the sake of completeness, we have nevertheless tested and briefly evaluated the
newer models. In our implementation, we employed the following configuration settings: the
engine was set to text-davinci-003, the prompt was provided as the input, a temperature of 0
was used to ensure deterministic responses, the maximum number of tokens was set to 5, top-
p sampling with a probability of 1 was employed to avoid randomness, and both the frequency
penalty and presence penalty were set to 0 to encourage a neutral influence on the generation
process. This configuration allowed us to explore the capabilities of the frozen GPT3 model
and examine its outputs in a controlled and predictable manner.
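The settings listed above can be collected in a single request payload. The sketch below only builds that payload as a dictionary and sends nothing; the helper name `build_completion_request` and the example prompt are hypothetical, and the key names follow the parameter names used in the text rather than any particular client library version.

```python
from typing import Any, Dict


def build_completion_request(prompt: str) -> Dict[str, Any]:
    """Collect the generation settings described in the text for one prompt."""
    return {
        "engine": "text-davinci-003",  # frozen GPT3 model named in the text
        "prompt": prompt,
        "temperature": 0,              # no sampling randomness
        "max_tokens": 5,               # short answers only
        "top_p": 1,
        "frequency_penalty": 0,
        "presence_penalty": 0,
    }


# Hypothetical prompt, used only to show the payload.
request = build_completion_request("Is there a condition in: 'If it rains, x must be 3'?")
print(request["engine"], request["max_tokens"])  # text-davinci-003 5
```

Keeping the settings in one place makes it easier to rerun every question with an identical configuration, which is the point of the frozen-model setup.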
5.5.2 Preprocessing
In NLP tasks, preprocessing is critical in preparing unstructured textual data for analysis.
Preprocessing aims to convert raw text into a structured and meaningful format suitable for
further analysis. Our research builds on our prior work (Gärtner et al. 2023), making it part of
ALICE. It utilizes a fully automated preprocessing pipeline consisting of the following stages:
(1) data reduction to replace or eliminate special characters such as '<' with their corresponding
expressions; (2) text parsing to generate dependency trees and perform part-of-speech
tagging; and (3) identification of verbal expressions, including conditions, effects, variables,
and actions, collectively referred to as constituents, as explained in Section 5.5.1.2.
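Stage (1) can be pictured as a simple lookup-and-replace pass. The replacement table below is a hypothetical example: the text only names '<' as one character that is replaced by its corresponding expression, and the remaining entries as well as the function name are assumptions.

```python
import re

# Hypothetical replacement table; only '<' is mentioned in the text.
# Longer symbols come first so that '<=' is not consumed as '<' followed by '='.
REPLACEMENTS = [
    ("<=", "less than or equal to"),
    (">=", "greater than or equal to"),
    ("<", "less than"),
    (">", "greater than"),
]


def reduce_special_characters(text: str) -> str:
    """Replace comparison symbols by words and normalize whitespace."""
    for symbol, words in REPLACEMENTS:
        text = text.replace(symbol, f" {words} ")
    return re.sub(r"\s+", " ", text).strip()


print(reduce_special_characters("x must be <10"))
# x must be less than 10
print(reduce_special_characters("If y >= 4, x must be < 3."))
# If y greater than or equal to 4, x must be less than 3.
```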
Automated Requirement Contradiction Detection through formal logic and LLMs 47
5.5.3 Questions
The following section summarizes the seven key questions that form the basis of the ALICE
methodology for detecting contradictions within requirement specifications. Each question
targets a specific aspect of potential contradictions, enabling a structured and thorough
analysis. Detailed examples and GPT-based prompts for each question are provided in the
appendix.
1. Variable Identity: Identifies whether the two requirements' variables are identical or
subsets.
2. Effect Inclusivity: Investigates whether one effect encompasses another, which is
essential for identifying subaltern contradictions.
3. Mutual Action Exclusivity: Identifies directly opposing actions, indicating
contradictions where the truth of one negates the other.
4. Mutual Action Inconsistency: Assesses if actions are inherently contradicting
without being direct opposites, which is essential for identifying contraries.
5. Condition Presence: Identifies conditional clauses that might affect the interpretation
of requirements, helping eliminate Simplex contradictions.
6. Condition Equivalence: Compares conditions in two requirements to detect Idem
contradictions that require the same condition.
7. Condition Co-Occurrence: Evaluates the possibility of two conditions coinciding,
helping eliminate Alius contradictions.
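One way to picture how the seven answers combine into one of the nine contradiction types is the sketch below. The branching order is an assumption made for illustration (the exact layout of the decision tree in Figure 5-3 is not reproduced here), and the names `Answers` and `classify` are hypothetical. Each boolean would in practice be supplied by formal logic or by an LLM prompt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answers:
    """Yes/no answers to the seven questions for one requirement pair."""
    same_variable: bool          # Q1: variable identity
    effect_included: bool        # Q2: effect inclusivity
    actions_exclusive: bool      # Q3: mutual action exclusivity
    actions_inconsistent: bool   # Q4: mutual action inconsistency
    has_condition: bool          # Q5: condition presence
    same_condition: bool         # Q6: condition equivalence
    conditions_cooccur: bool     # Q7: condition co-occurrence


def classify(a: Answers) -> Optional[str]:
    """Return a label such as 'Idem contrary', or None for no contradiction."""
    if not a.same_variable:
        return None
    # Effect questions (Q2 to Q4) decide the kind of opposition (assumed order).
    if a.effect_included:
        kind = "subaltern"
    elif a.actions_exclusive:
        kind = "contradictory"
    elif a.actions_inconsistent:
        kind = "contrary"
    else:
        return None
    # Condition questions (Q5 to Q7) decide the scope.
    if not a.has_condition:
        return f"Simplex {kind}"
    if a.same_condition:
        return f"Idem {kind}"
    if a.conditions_cooccur:
        return f"Alius {kind}"
    return None


# 'If it rains, the car must stand.' vs. 'If it rains, the car must drive.'
pair = Answers(True, False, False, True, True, True, True)
print(classify(pair))  # Idem contrary
```

Because each question is an independent field, any single answer can later be produced by a better method without touching the rest of the tree, which mirrors the modularity argument made for the decision tree.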
5.6 Validation and Results
We adopt a structured empirical validation protocol based on established research
methodologies to ensure a rigorous and systematic evaluation of ALICE. Our approach is
delineated by specific research questions (RQs) designed to assess the effectiveness and
applicability of ALICE in identifying and resolving contradictions in engineering requirements.
RQ1: How does ALICE compare to existing LLM approaches regarding accuracy and recall in
detecting contradictions within complex specification sheets?
RQ2: How does employing ALICE affect the efficiency and scalability of contradiction detection
processes in real-world datasets?
5.6.1 Data Analysis and Methodology
We used three datasets in our study. Dataset 1 was compiled specifically for this study and
served as an initial test suite for developing our method. This dataset, see Section 5.11,
includes all possible types of contradicting requirements pairs as well as non-contradicting
pairs. Datasets 2 and 3 were obtained from a recent real-world electric bus project: A
comprehensive set of requirements was created to outline a modular system intended to
replace traditional bus powertrains with modern electric ones. The significance of requirements
in developing electric buses is discussed in the publication 'Design of urban electric bus
systems' (Göhlich et al. 2018). Usually, real-world specification sheets cannot be shared with
the public due to confidentiality concerns. Nevertheless, we have taken steps to anonymize
210 requirement pairs and have provided access to them, as detailed in Section 9.
Dataset 2 contains 1,071 requirement pairs, which have been manually checked and labeled
for contradictions. Hence, it can be used to validate our method. Dataset 3 contains 3,916
pairs, which were not manually checked. On the one hand, it was used to show that the method
can handle large datasets, and on the other hand, it served to compare ALICE and GPT3.
Our datasets are inherently unbalanced, reflecting the real-world scenario where the vast
majority of requirements do not contradict each other. This imbalance is not a flaw but a feature
of using authentic datasets, where contradictions are naturally rare when comparing each
requirement against others. Our analysis methods are tailored to recognize and effectively
handle such skewness, aiming to accurately identify the relatively few, yet critically important,
contradicting combinations. This approach underscores the practical applicability of our
method in real-world settings, where the objective is not to balance data artificially but to mirror
and navigate the complexities of actual requirement specifications.
Additionally, rather than employing machine learning techniques to train on the unbalanced
dataset, our methodology leverages formal logic and pre-trained LLMs, reinforcing the
method's adaptability and efficacy in handling the intricacies of requirement specifications
without the need for dataset balancing. Traditional machine learning techniques often
necessitate a significant quantity of labeled data to effectively capture the underlying features
and patterns within a specific problem domain. Due to the absence of a large dataset
specifically tailored for contradicting requirements and inherently unbalanced datasets,
incorporating such ML techniques becomes unfeasible.
We compared the results of ALICE to the results when using only LLMs. We conducted an
evaluation solely focused on detecting the presence of contradictions without classifying the
specific type of contradiction: While ALICE demonstrated the capability to identify specific
types of contradictions, LLMs are not equipped with this classification awareness.
Contradictions in specialized industrial domains hinge on subtle semantic details. However,
LLMs are trained on broad datasets, which limits their ability to specialize in such nuanced
tasks. Thus, we limited the evaluation to general contradiction detection to ensure
comparability.
Finally, our evaluation did not consider methods using only formal logic, see Section 5.4.2. As
mentioned, traditional context-based word embedding algorithms like Word2Vec or GloVe are
ineffective for mapping contrasting words (Li et al. 2017). For instance, they would both
struggle to detect that a car can drive and be red simultaneously, but it cannot drive and be in
the air simultaneously. The latter example represents an apparent contradiction that would
require real-world knowledge to identify.
5.6.2 Results for Dataset 1
We present the first dataset's results based on the method presented in Section 5.5. It
comprises 87 requirement pairs, of which 61 exhibit contradictions. The selection of the
contradictory pairs encompasses all nine types of contradictions and considers as many
variations in their formulations as possible.
An extract of the 87 pairs is shown in Table 5-1, with one example for each type of contradiction
and some examples that are not contradictions. Table 5-2 displays the confusion matrices as
tabular representations of ALICE and GPT3 performance. In these matrices, True Positives
and True Negatives indicate correct predictions for contradictions and non-contradictions,
respectively. False Positives and False Negatives represent errors where non-contradictions
were mistaken for contradictions and vice versa. From these figures, we derive key metrics
like accuracy, precision, and recall, which reflect the model’s performance in identifying true
contradictions among the dataset (Shultz et al. 2010).
Table 5-1: Extract from Dataset 1
| Expected Type | Requirement 1 | Requirement 2 |
|---|---|---|
| Simplex subaltern | x must be less than 12. | x must be less than 10. |
| Simplex contrary | The playground must be fun for adults and children. | The playground must only be fun for children. |
| Simplex contradictory | x must be equal to three. | x must not be equal to three. |
| Idem subaltern | If it rains, x must be between 10 and 20. | If it rains, x must be between 12 and 15. |
| Idem contrary | If it rains, the car must stand. | If it rains, the car must drive. |
| Idem contradictory | If it rains, x must be equal to 3. | If it rains, x must be unequal to 3. |
| Alius subaltern | If it rains, x must be between 10 and 20. | If I sing, x must be between 12 and 15. |
| Alius contrary | If the value of the signal LapVeh_FueCur exceeds the value 0 (A), the signal Chrg must be set to TRUE. | If the parameter Chrg_SubVal is set to TRUE, the signal Chrg corresponds to the parameterizable value Chrg_SubValChrg. |
| Alius contradictory | If y is equal to 4, x must be equal to 3. | If z is equal to 4, x must be unequal to 3. |
| 0 | The application must be able to run on multiple platforms and operating systems. | The application must integrate easily with other existing software systems. |
| 0 | If the signal Pfx is equal to three, y must be equal to 5. | If the signal Pfx is equal to three, z must be equal to 5. |
| 0 | If the signal Pfx is equal to three, y must be equal to 5. | If the signal Pfx is equal to 4, y must be equal to 6. |
Table 5-2: Results for Reference Dataset in the form of confusion matrices
| | ALICE: 0 expected | ALICE: 1 expected | GPT3: 0 expected | GPT3: 1 expected |
|---|---|---|---|---|
| 0 calculated | 17 | 15 | 21 | 41 |
| 1 calculated | 9 | 45 | 5 | 20 |
The LLM-only prompts were formulated as shown in Figure 5-10. The return value is then
checked for the string 'yes'; if it is present, the answer is counted as positive.
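The prompt construction and the answer check can be sketched as follows. The template wording is the one given in this chapter's LLM prompt pseudo code; the function names are hypothetical, and matching 'yes' case-insensitively is an assumption, since the text only states that the response is searched for that string.

```python
def build_prompt(req1: str, req2: str) -> str:
    """Fill the LLM-only prompt template with two requirements."""
    return (
        "Do the following sentences contradict each other, yes or no:\n"
        f"1. {req1}\n"
        f"2. {req2}"
    )


def is_contradiction(response: str) -> bool:
    """Count the answer as positive if it contains 'yes' (assumed case-insensitive)."""
    return "yes" in response.lower()


prompt = build_prompt("x must be equal to three.", "x must not be equal to three.")
print(prompt)
print(is_contradiction("Yes, they contradict each other."))  # True
print(is_contradiction("No."))                               # False
```

A plain substring check is easy to audit, but it would also accept any longer word that happens to contain 'yes'; with answers capped at a few tokens this is unlikely to matter.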
ALICE achieved an accuracy of 72%, a recall of 75%, and a precision score of 83%. On the
other hand, the LLM-only method achieved an accuracy of 47%, a recall of only 32%, and a
precision score of 80%. Attempts to guide the model’s behavior using n-prompting proved
unsuccessful. Several attempts were made, mainly using examples of contradictions and the
seven questions described in Section 5.5.3. N-prompting, or chain of thought prompting,
involves guiding a language model through a series of logical steps to improve its problem-
solving accuracy. This technique mimics human reasoning by breaking down complex
problems into simpler, articulated steps before concluding.
ALICE flags more pairs as contradictions (i.e., 9+45=54) than the LLM (i.e., 5+20=25).
This is even clearer in the following datasets.
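The reported metrics can be re-derived from the confusion-matrix counts. The sketch below uses the counts as listed in Table 5-2 and the standard definitions of accuracy, precision, and recall; the function name `metrics` is an assumption. The results agree with the percentages reported above to within about one percentage point.

```python
from typing import Dict


def metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Counts as listed in Table 5-2 (1 = contradiction, 0 = no contradiction).
alice = metrics(tp=45, fp=9, fn=15, tn=17)
gpt3 = metrics(tp=20, fp=5, fn=41, tn=21)

for name, m in (("ALICE", alice), ("GPT3", gpt3)):
    print(name, {key: round(value, 3) for key, value in m.items()})
```

The guards against empty denominators matter for the later datasets, where a model may flag no pair correctly at all.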
Prompt: Do the following sentences contradict each other, yes or no:
1. {Req1}
2. {Req2}
Figure 5-5: LLM prompt - pseudo code
5.6.3 Results for Dataset 2
In this section, we analyzed 1,071 combinations to validate our method. ALICE achieved an
accuracy of 99%, a recall of 60%, and a precision score of 94%. On the other hand, the LLM-
only method achieved an accuracy of 97%, a recall of 0%, and a precision score of 0%. Because
the data are unbalanced, accuracy alone is not a very informative metric. The recall, which
measures the proportion of actual contradictions that were correctly identified, provides a better
understanding of the results. Accordingly, ALICE detected 60% of all contradictions, whereas the
LLM method did not detect any. For a more comprehensive analysis of the results, the confusion
matrices are shown in Table 5-3.
Table 5-3: Confusion matrices for Dataset 2
| | ALICE: 0 expected | ALICE: 1 expected | GPT3: 0 expected | GPT3: 1 expected |
|---|---|---|---|---|
| 0 calculated | 1044 | 9 | 1041 | 26 |
| 1 calculated | 1 | 17 | 4 | 0 |
An example of an accurately identified contradicting pair (true positive) by ALICE is as follows:
1. If the value of the signal B_LdnTrvOfSin3Load_fKlLdngr exceeds the value of 0, the
value of the segment output signal LdnProcTrvtv must be set to True.
Formal: $\text{if } \mathit{condition}_1 \rightarrow \mathit{signal} \overset{!}{=} \mathit{True}$
2. If the Parameter Load_DomWrtTgtLdnProcTrvtvToSW is set to True, the signal
LdnProcTrvtv corresponds to the configurable value Load_DomWrtLdnProcTrvtv.
Formal: $\text{if } \mathit{condition}_2 \rightarrow \mathit{signal} \overset{!}{=} \mathit{config.}$
This is a contradiction of the type Alius contrary since it does not exclude that
$\mathit{condition}_1$ and $\mathit{condition}_2$ may occur at the same time, while the effects
($\mathit{signal} \overset{!}{=} \mathit{True}$ and $\mathit{signal} \overset{!}{=} \mathit{config.}$)
contradict each other.
It is a potential contradiction because it is theoretically feasible that the conditions have been
defined in another place so that parallel occurrence is excluded. Automated detection of such
scenarios is not within the scope of this paper but is conceivable for the future. Nonetheless,
a manual verification was conducted on a selective basis, where the absence of corresponding
exclusions was confirmed for the prior example: The first requirement belongs to the Function
section, while the subsequent requirement belongs to the Manual Output section. It could be
assumed that the Manual Output supersedes all other requirements due to its later placement;
however, this assumption lacks clarity and unambiguity. Sequential processing of
requirements must be explicitly defined and is not inherently evident. On the contrary, in cases
where sequentially overwriting actions are present, they must be distinctly indicated.
Therefore, this is indeed a real contradiction.
An example of a requirement pair falsely identified as Alius contrary (false positive) by ALICE
is as follows:
1. If the value of the signal B_Mt[...]Mngt is above the value of Load_DebVwTgtInpSig == 1s
for longer than Load_ThdOfLdnTrvMtl == 0.8, the value of the segment output signal
LdnProcTrvtv must be set to True.
Formal: $\text{if } \mathit{condition}_1 > C \rightarrow \mathit{signal} \overset{!}{=} \mathit{True}$
2. If none of the input signals is above the value of 0 and the value of the signal
B_Mt[...]Mngt falls below the value of Load_DebVwTgtInpSig == 5s for longer than
Load_ThdOfLdnTrvMtl == 0.8, the value of the segment output signal LdnProcTrvtv must
be set to False.
Formal: $(\text{if } \mathit{condition}_2) \land (\text{if } \mathit{condition}_3) \rightarrow \mathit{signal} \overset{!}{=} \mathit{False}$
The issue, in this case, traces back to the response provided for question 7. ALICE’s approach
fails to distinguish between 'above x' (condition_1) and 'below x' (condition_3) as mutually
exclusive, leading it to believe that both conditions can co-occur. This would result in
contradicting effects, thereby rendering the requirements contradictory. It is important to note
that, in the current configuration, question 7 relies on GPT3 as the LLM. It is possible that more
advanced LLMs could yield better results in the future.
An example of a falsely identified contradicting pair (false positive) when relying solely on
GPT3 (LLM-only) and not using ALICE is as follows:
1. If the CAN status bit B_NchPsst12333 has the value False, an error must be written to
the segment-internal error bus.
Formal: $\text{if } \mathit{condition}_1 \rightarrow x \overset{!}{=} \mathit{error}$
2. If the CAN status bit BVwLoad_sOfNch12333 has the value True, an error must be
written on the segment-internal error bus.
Formal: $\text{if } \mathit{condition}_2 \rightarrow x \overset{!}{=} \mathit{error}$
When asked to justify its answer, GPT explains, 'Since these are different CAN message
status bits, both one and the other cannot be fulfilled at the same time. Therefore, the
requirements contradict each other.' ALICE correctly identified this as not contradicting.
5.6.4 Application to large datasets
The third dataset consists of 3,916 combinations of requirements, which ALICE took approximately
four hours to process. GPT-3 took 0.9 seconds on average per requirement pair, resulting in
approximately one hour for the whole dataset. ALICE achieved a precision score of 86%,
meaning that most of the pairs it flagged were actual contradictions. In contrast, GPT3's
precision score was 0%. Our
focus during this analysis was not on optimizing for speed or code efficiency; thus, not all
combinations could be thoroughly analyzed within the scope of this research. We did not
employ parallel processing techniques, indicating a potential for future performance
enhancements. Nevertheless, the analysis successfully showcased the method's capability to
process large datasets and underlined the combined method's advantage over solely using
GPT-3, as shown in Table 5-4.
Table 5-4: Results for Dataset 3 in the form of confusion matrices
| | ALICE: 0 expected | ALICE: 1 expected | GPT3: 0 expected | GPT3: 1 expected |
|---|---|---|---|---|
| 0 calculated | 3848 | | 3881 | |
| 1 calculated | 6 | 38 | 5 | 0 |
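The runtime and precision figures for this dataset follow from simple arithmetic on the numbers given in the text and in Table 5-4. The sketch below redoes that arithmetic; the four-hour total is the approximate figure stated above, so the per-pair time for ALICE is only a rough estimate, and the variable names are chosen for illustration.

```python
PAIRS = 3916                 # requirement pairs in Dataset 3
GPT3_SECONDS_PER_PAIR = 0.9  # average reported for GPT-3
ALICE_TOTAL_HOURS = 4        # approximate total reported for ALICE

gpt3_total_minutes = PAIRS * GPT3_SECONDS_PER_PAIR / 60
alice_seconds_per_pair = ALICE_TOTAL_HOURS * 3600 / PAIRS

# Precision from the '1 calculated' row of Table 5-4 for ALICE.
alice_precision = 38 / (38 + 6)

print(f"GPT-3 total: {gpt3_total_minutes:.1f} min")       # GPT-3 total: 58.7 min
print(f"ALICE per pair: {alice_seconds_per_pair:.2f} s")  # ALICE per pair: 3.68 s
print(f"ALICE precision: {alice_precision:.1%}")          # ALICE precision: 86.4%
```

Recall cannot be computed for this dataset because the pairs were not manually labeled, which is why only precision is reported.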
5.6.5 LLM Comparison
We evaluated the datasets using not only GPT3 but also GPT-3.5 turbo and GPT-4 as well as
LLaMA by Meta. Our findings indicate that GPT-3.5 turbo had a limited performance, detecting
only one contradiction. On the other hand, GPT-4 detected five contradictions, marginally
underperforming when compared to GPT-3. LLaMA detected three contradictions. The
difference between GPT-3 and GPT-3.5 is particularly evident in question 4, 'Are the actions
mutually inconsistent?', which involves two questions. Question 1: Does 'must be set to true'
mean the same as 'corresponds to the parameterisable value Ldn_C'? Question 2: Could
the following statements potentially contradict each other? 'it must be set to true', 'it must
correspond to the parameterisable value Ldn_C.'
Table 5-5 depicts the divergent responses obtained from GPT-3, GPT-3.5, GPT-4, and LLaMA,
with GPT-3.5 yielding a less satisfactory answer than the other two models. While GPT-4's
response is akin to that of GPT-3, its computational time is notably higher, and thus, we opted
to employ GPT-3. LLaMA comprehends problems or inquiries accurately but frequently derives
erroneous conclusions, generating increasingly imaginative responses.
Table 5-5: Different answers generated by GPT3, GPT3.5, GPT4 (22.03.2023) and LLaMA
| | GPT3: text davinci | ChatGPT: GPT3.5 Turbo | ChatGPT: GPT4 | LLaMA |
|---|---|---|---|---|
| A1 | No, they do not mean the same thing. | Yes, 'must be set to true' can mean the same as 'corresponds to the parameterisable value Ldn_C' if that specific value requires the parameter to be set to true. | No, 'must be set to true' means that the value must be set as true, while 'corresponds to the parameterisable value Ldn_C' means that it should match the given parameter value. | no. LdN_F is a switch in the SIM that sets the simulator to either use the local directory name (Ldn_F), or not (Ldn_T) […] |
| A2 | Yes, these statements could potentially contradict each other. | No, the two statements do not potentially contradict each other. | Yes, these statements could potentially contradict each other if the parameterisable value Ldn_C is not set to true. If Ldn_C is set to true, then there is no contradiction. | no. We cannot have more than one such statement in any given model, but if both are present in a particular model they could easily refer to different parameters. |
This outcome is likely not attributable to a lack of quality in the model but instead stems from
an alternative objective for which the model was intended. An illustrative example of LLaMA's
propensity to stray further from the target with increasing response length is provided in Table
5-5. The first answer was accurate; however, subsequent elaboration led to irrelevant content.
The initial response to the second question was inaccurate, but the ensuing explanation was
correct. Therefore, one could even argue that the second answer is an oxymoron, as it
contradicts itself. Notably, no definitive comparison between LLaMA and GPT can be drawn
from this study. It is plausible that the model's performance may vary depending on the nature
of the input stimuli. Future investigations utilizing different input conditions may yield more
informative insights into the comparative performance of the two models.
5.6.6 Criticality Assessment
ALICE allows conclusions to be drawn about the criticality of requirements and, to some extent,
about the development costs. Contradictions falling under the types Idem (e.g.,
$A \rightarrow x \overset{!}{=} c$ and $A \rightarrow x \overset{!}{=} k$) and Simplex (e.g.,
$x \overset{!}{=} c$ and $x \overset{!}{=} k$) are always considered real contradictions
because either there is no condition, or the conditions are identical. However, contradictions
under the Alius type (e.g., $A \rightarrow x \overset{!}{=} c$ and $B \rightarrow x \overset{!}{=} k$)
are potential contradictions.
As noted in Section 5.6.3, such a pair is only a potential contradiction because it is
theoretically possible for the conditions to have been defined in a manner that precludes their
parallel occurrence. While this paper does not address the automated detection of such scenarios,
it remains a possibility for future exploration.
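The distinction between real and potential contradictions depends only on the first part of the type label, as the following sketch shows. The function name `criticality` and the label format ('Idem contrary', 'Alius subaltern', and so on) are assumptions for illustration.

```python
def criticality(contradiction_type: str) -> str:
    """Map a contradiction label to 'real' or 'potential'."""
    scope = contradiction_type.split()[0].lower()
    if scope in ("simplex", "idem"):
        # No condition, or identical conditions: always a real contradiction.
        return "real"
    if scope == "alius":
        # Different conditions: real only if they can actually co-occur.
        return "potential"
    raise ValueError(f"unknown contradiction type: {contradiction_type!r}")


print(criticality("Idem contrary"))   # real
print(criticality("Alius contrary"))  # potential
```

Such a mapping could be used to sort findings so that reviewers see the real contradictions first and check the potential ones against the rest of the specification afterwards.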
Furthermore, question 5, 'Is there a condition?', provides additional insight, as noted by Frattini
et al. (2022b), regarding the correlation between the existence of conditions, lead times, and
volatility of requirements. Fischbach's findings suggest that conditions with a strict semantic
structure can lead to a more understandable requirement, which can subsequently facilitate
the translation of the requirement into downstream artifacts, such as code or test cases.
5.6.7 Conclusion
After conducting a thorough evaluation, we now return to our initial research questions to
contextualize our findings within the broader objectives of our study.
RQ1: How does ALICE compare to existing LLM approaches regarding accuracy and recall in
detecting contradictions within complex specification sheets?
Our analysis reveals that ALICE demonstrates a notable improvement in both accuracy and
recall rates over LLM-only approaches. As evidenced by the results in Dataset 1, ALICE
achieved an accuracy of 72%, a recall of 75%, and a precision score of 83%, significantly
outperforming the LLM-only method, which registered an accuracy of 47%, a recall of 32%,
and a precision score of 80%. For Dataset 2, ALICE achieved an accuracy of 99%, a recall of
60%, and a precision score of 94%. On the other hand, the LLM-only method achieved an
accuracy of 97%, a recall of 0%, and a precision score of 0%. These findings underscore ALICE's
ability to detect contradictions accurately, highlighting the effectiveness of integrating formal
logic with LLMs.
RQ2: How does employing ALICE affect the efficiency and scalability of contradiction detection
processes in real-world datasets?
The approximately 4-hour-long evaluation of Dataset 3, comprising 3,916 requirement pairs,
demonstrates ALICE's proficiency in managing large datasets efficiently with a precision score
of 86%. Despite the inherently unbalanced nature of real-world datasets, ALICE's methodology
was adept at identifying the relatively few but critically important contradicting combinations.
These results have been critically evaluated and validated as actual contradictions by the
second author, who brings over two decades of experience at a major automotive OEM in
Germany. His last role involved responsibility for concept design and component integration of
various vehicles, underscoring his requirements engineering expertise (Knothe et al. 2006;
Göhlich 2008). This showcases ALICE's practical applicability and scalability, marking a
significant step in automating contradiction detection in engineering requirements.
5.7 Limits
Several limits to the method must be considered: on the one hand, limits of the LLM, and on the
other, limits of formal logic.
5.7.1 Limits of LLMs for Contradiction Detection
LLMs lack adequate data to facilitate the identification of specific types of contradictions.
Therefore, we consider the limits of general contradiction detection with LLMs.
Detecting fundamental contradictions is a straightforward task, and the corresponding
reasoning can be effectively deduced, as exemplified in Figure 5-6:
Prompt: Are the following sentences contradictory?
1: The car shall stand.
2: When it rains, the car shall drive.
Response: Yes, these sentences are contradictory.
Follow-up prompt: Elaborate.
Response: Sentence 1 implies that the car should always stand, no matter what
the weather is like. Sentence 2 implies that when the car shall drive when it
rains, which contradicts the first sentence.
Figure 5-6: GPT3 has enough knowledge to detect and explain the present contradiction
Nevertheless, LLMs do not perform accurately in detecting contradictions according to the
definition in Section 5.3.1. It is worth noting that the model's response is not necessarily
incorrect but rather inappropriate in the RE context. It is plausible that a human, unaware of
the context, might respond similarly to GPT3, as shown in Figure 5-7.
Figure 5-7: GPT3 is not able to properly detect the present contradictions. Firstly, the sentences are not
conditions but mere sentences, and secondly, although the conditions can indeed be true at the same time, this
would lead to contradicting effects.
Another example of LLM shortcomings was mentioned in Section 5.6.3: the false positive
example in which the opposite words below and above were not correctly classified.
One way to address such shortcomings is to add a dedicated machine learning model, such as
the approach to detecting opposing words by Li et al. (2017).
Furthermore, an inconsistent usage of the words must and shall can lead to issues with
LLMs. In some cases, the model did not recognize these words as synonyms and
labeled the effects differently. In general, however, one should refrain from using both words
in the same document when writing requirements (Sophist GmbH 2016).
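Where a specification nevertheless mixes both words, one conceivable mitigation is to normalize them before the requirements are passed to the LLM. The sketch below is not part of ALICE as described here; it is a hypothetical preprocessing helper that rewrites 'shall' as 'must' using a regular expression.

```python
import re


def normalize_modals(requirement: str) -> str:
    """Replace 'shall' by 'must' so both requirements use one modal verb."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        # Preserve a leading capital letter if the original word had one.
        return "Must" if word[0].isupper() else "must"

    return re.sub(r"\bshall\b", swap, requirement, flags=re.IGNORECASE)


print(normalize_modals("When it rains, the car shall drive."))
# When it rains, the car must drive.
```

The word boundaries in the pattern keep words such as 'shallow' untouched.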
Also, LLMs have trouble with strings that are too long. For example, a parameter name such
as OpfsOpfsCnvrGfgsddhFTiOutWRTEForHGFoUftToGhfS cannot be analyzed as it is a 'long
and complex string that doesn't provide any meaningful information to perform a character
comparison' according to GPT. The string is analyzed correctly after inserting an underscore
within the signal name, such as OpfsOpfsCnvrGfgs_ddhFTiOutWRTEForHGFoUftToGhfS.
Finally, the exact wording matters. For example, using the phrase
'it must be set equal to TRUE'
in one requirement and the phrase
'it must be not be manipulated'
in another does not result in a detected contradiction. However, substituting the second phrase with
'it must be not be set'
produces a correct contradiction detection. Although this may not be immediately
apparent, employing precise and consistent language in technical writing is critical to prevent
errors and misinterpretations. Thus, adhering to formal language conventions when dealing
with NLP tasks in technical documents is highly recommended.
5.7.2 Limits of Formal Logic
Filler words can introduce errors in natural language processing tasks. For instance, consider
the sentence 'Amy and I both have to fight him,' where the term both is treated as the variable
instead of Amy and I. On the other hand, when comparing it to a second sentence, 'Neither
Amy nor I have fought against him,' Amy and I would appropriately be identified as the
variables. No contradiction would be observed in this case because the variables are
considered distinct. Eliminating the filler word both enables accurate identification of the
contradiction in the sentence. These findings suggest that filler words should be cautiously
treated in natural language processing applications to ensure accurate results.
Passive sentence constructions are prone to misinterpretation and should be avoided. For
instance, the statement 'There must be no manipulation of Com_Batt' is better expressed as
'The function must prevent manipulation of Com_Batt.' Using active sentence structures
enhances clarity and enables the algorithm to identify the variable performing the action. This
aligns with general rules for writing requirements that aim to reduce ambiguity (Sophist GmbH
2016).
Content of Figure 5-7:
Prompt: Are the following sentences contradictories?
1: If the actual heater stage „Cbn[...]“ is greater than 0, the requested pump
power „Cbn_Sp[...]“ must be limited by the parameter „Cbn_Tri[...]“.
2: If „Batt[...]“ is false, the requested pump power „Cbn_Sp[...]“ must be
limited to 20%
Response: No, these sentences are not contradictories. They are both conditions
that could be true at the same time.
Improper placement of commas in complex sentences can result in erroneous condition
detection, leading to inaccurate identification of variables and actions. As such, it is crucial to
ensure that commas are placed correctly in technical writing. Failure to do so may introduce
flawed interpretations and subsequent errors. Consequently, it is advisable to exercise caution
when using complex sentence structures and to verify that commas are placed accurately to
enhance clarity and precision.
The creation of neologisms should be minimized in technical writing. In cases where they
cannot be avoided, adding them to a dictionary is crucial to avoid being detected as out-of-
vocabulary terms. Therefore, it is advisable to employ established terminology whenever
possible.
5.7.3 Threats to Validity
In conducting this research, we have identified and addressed several potential threats to the
validity of our findings, guided by established frameworks in empirical research (Wohlin et al.
2012).
External Validity: The study does not provide sufficient grounds to make claims about the
generalizability of the results for non-formal requirements, as only formal requirements were
analyzed. Focusing on formal requirements was driven by their prevalence and criticality in the
domains studied. However, it should be noted that the basis of our investigation is generally
applicable grammatical rules. Also, the analyzed data sets follow rules automakers generally
apply when writing formal requirements (Ahmad et al. 2020; Dick 2017; Sophist GmbH 2016).
Further investigation and comparison with additional data sets are necessary to address
contextual factors such as the domain, development process, requirements engineering
techniques, and technologies used.
Internal Validity: We have minimized researcher and confirmation bias through a
standardized methodology based on validated grammatical rules, reducing subjective
interpretation. Future enhancements could benefit from practitioner insights and further
validation studies.
Construct Validity: In our study, we have taken careful steps to ensure that the operational
definitions used closely align with the theoretical constructs we aim to investigate, a critical
component for maintaining construct validity and ensuring our measurements reflect what we
intend to measure.
Conclusion Validity: We have not further detailed the statistical methods employed to affirm
the reliability of our conclusions. Robust statistical analysis is essential to verify that the
relationships observed in our study are statistically significant and not due to chance, thus
underpinning the integrity of our study's conclusions. The potential for overfitting the
methodology to the data is a legitimate concern, given the limited availability of a sizable
dataset upon which the approach was originally devised.
By acknowledging these limitations and areas for improvement, we underscore our
commitment to advancing automated contradiction detection in engineering requirements.
5.8 Conclusion and Outlook
This paper contributes ALICE, a fully automated contradiction detection method for requirements
written in controlled natural language that combines formal logic and LLMs. This hybrid
method involves prompt engineering, which entails designing and fine-tuning prompts to elicit
the desired behavior from the model, such as identifying contradictions. The seven-step
process is scalable and can be adapted to future advancements in the field, providing a
modular way to evaluate the consistency of arguments.
The performance of ALICE and a purely LLM-based method for automated contradiction
detection was evaluated in this study. ALICE achieved a higher accuracy (99%), recall (60%),
and precision (94%) than the LLM-only method (97%, 0%, and 0%). These findings show that
ALICE is more effective than LLM approaches in identifying contradicting formal requirements.
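The gap between accuracy and recall reported above is typical for heavily imbalanced data, in which contradicting pairs are rare. The following minimal sketch illustrates the effect with hypothetical counts (they are not the counts of this study): a classifier that never reports a contradiction still reaches a high accuracy, while its recall and precision are zero. The function name and the convention for empty denominators are assumptions made for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, recall, precision) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Assumed convention: a ratio with an empty denominator is reported as 0.0.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision


# Hypothetical counts: 100 requirement pairs, 3 of them contradicting,
# and a classifier that always answers "no contradiction".
accuracy, recall, precision = classification_metrics(tp=0, fp=0, tn=97, fn=3)
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}, precision={precision:.2f}")
```

This is why accuracy alone says little about contradiction detection and why recall and precision are reported alongside it.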
However, the limitations of both LLMs and formal logic in detecting contradictions in technical
documents were also highlighted. LLMs may produce false positives due to the use of opposite
words and lack adequate data to accurately detect specific types of contradictions. On the
other hand, formal logic may be misled by filler words and improper placement of commas,
and neologisms should be avoided to ensure accurate condition detection. To ensure precision
and reduce ambiguity, it is recommended to use active sentence structures and established
terminology in technical writing.
Combining different machine learning models, such as the detection of opposing words by Li
et al. (2017), could be a solution to address these shortcomings. Investigating tailored models
should be a priority for future research, as our method is predestined to be customized. On the
one hand, various approaches to answering the seven questions should be explored. On the
other hand, experimenting with different thresholds for string comparisons might prove
beneficial, especially when dealing with datasets that adhere to different formulation rules.
Also, we validated the method on requirements at lower, more detailed system levels. Although
we believe it would also work on higher levels, a thorough validation with actual requirements
should be conducted since the reference dataset included such requirements.
Our study focuses on detecting contradictions between pairs of requirements due to their
significant impact. We recognize the potential for contradictions among multiple requirements.
Addressing these would require advanced analysis methods. Future work could explore
extending our methodology to tackle these more complex scenarios, enhancing contradiction
detection in requirements engineering for complex system development.
Furthermore, using the presented method, it is now possible to create labeled datasets which
do not yet exist, as explained in Section 5.6. This could enable the potential use of individual
ML models for contradiction detection in the future.
Alius contradictions are potential contradictions due to theoretically infeasible conditions, as
explained in Sections 5.6.3 and 5.6.6. An automated analysis of such conditions was not within
the scope of this paper; however, it is conceivable for the future.
A subsequent area of exploration involves the integration of ALICE into the product
development lifecycle. In another work (Gärtner and Göhlich 2024b), we have detailed a
comprehensive overview for embedding ALICE into product development processes, offering
developers and product owners a practical tool for identifying and resolving contradictions in
the development phase. This integration streamlines the requirements engineering process
and significantly reduces the risk of costly reworks and project delays associated with
unresolved contradictions. Future work will continue to refine these implementation strategies,
further elucidating how industry professionals can seamlessly adopt ALICE to bolster the
development of error-free, high-quality products.
In conclusion, advanced AI systems can perform sophisticated reasoning tasks by combining
formal logic and LLMs. ALICE provides a promising approach to detecting contradictions in
natural language text requirements specifications.
5.9 Appendix
Question 1
The first question determines whether the variables of effect 1 and effect 2 are identical or if
one set of variables is a subset of the other. This determines if two statements can exhibit an
LNC contradiction; see Section 5.3.1. A contradiction can only occur when the variables are
the same or when one set of variables is a part of the other, such as comparing:
1. table
2. table leg
First, we check if both variable 1 and variable 2 are considered as out-of-vocabulary labels,
identifiers, naming conventions, or newly coined terms. If they are, we verify if the characters
match exactly using formal logic. If there is a 100% match, the answer is 'yes, they are
identical.' If the match is not exact, the variables may still have the same meaning, like 'car'
and 'automobile.' In this case, we query GPT-3 to assess whether the variables are
synonymous, as shown in Figure 5-8.
Figure 5-8: Prompt for the first question
If the LLM confirms the variables are synonymous, the answer to the first question is saved as
'yes, they are identical.' By explicitly asking to answer with 'Yes or No,' the response can be
framed to include 'Yes' or 'No' in the output.
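The two-stage check for the first question can be sketched as follows. This is a simplified illustration, not the implementation used in the study: it omits the out-of-vocabulary test and the subset case (e.g., 'table' and 'table leg'), and `ask_llm` and `fake_llm` are hypothetical stand-ins for the GPT-3 call. Only the prompt text is taken from Figure 5-8.

```python
from typing import Callable

# Prompt wording as given in Figure 5-8.
PROMPT_Q1 = 'are the words "{}" similar to words "{}"? Yes or No?'


def variables_identical(variable1: str, variable2: str,
                        ask_llm: Callable[[str], str]) -> bool:
    """Question 1: exact string match first, LLM synonym check as fallback."""
    # Formal-logic step: a 100% character match needs no LLM call.
    if variable1 == variable2:
        return True
    # LLM step: ask whether the two variables are synonymous.
    response = ask_llm(PROMPT_Q1.format(variable1, variable2))
    # The prompt asks for 'Yes or No', so the answer is read from the response text.
    return response.strip().lower().startswith("yes")


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a GPT-3 call, only to make the sketch runnable."""
    if '"car"' in prompt and '"automobile"' in prompt:
        return "\n\nYes"
    return "\n\nNo"


print(variables_identical("Com_Batt", "Com_Batt", fake_llm))
print(variables_identical("car", "automobile", fake_llm))
```

Checking the exact match first keeps identifiers and signal names out of the LLM query, so that only the remaining cases depend on the model's answer.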
Question 2
As explained below, the second question was not processed in this implementation. It
examines the potential inclusivity between the actions, determining whether one condition
subsumes the other, with one effect serving as the superaltern and the other as the subaltern.
For example, in the sentence 'It must be [action],' the actions
1. between 15 m and 30 m
2. between 20 m and 22 m
demonstrate this relationship, as the range of the second action is wholly encompassed within
the range of the first action.
This step is only relevant when it is necessary to identify a precise contradiction type (i.e.,
subalterns). If identifying the general contradiction type suffices, this question can be omitted.
Current LLMs cannot accurately answer this question due to GPT-3's limited mathematical and
logical abilities (openai 2023b; Friedman 08.2020). In our case, all contradictions belonging to
the subaltern category are automatically designated as contrary-type contradictions.
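For illustration only, the inclusion relationship from the example above can be checked deterministically once the numeric bounds are extracted. The sketch below is not part of the described implementation, which skips this question; it assumes actions of the exact form 'between A m and B m', and the pattern and function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed input shape: "between <number> m and <number> m".
RANGE_PATTERN = re.compile(
    r"between\s+(\d+(?:\.\d+)?)\s*m\s+and\s+(\d+(?:\.\d+)?)\s*m"
)


def parse_range(action: str) -> Optional[Tuple[float, float]]:
    """Extract (lower, upper) bounds in metres, or None if the pattern is absent."""
    match = RANGE_PATTERN.search(action)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def subsumes(outer: str, inner: str) -> bool:
    """True if the range in `inner` lies wholly within the range in `outer`."""
    outer_range = parse_range(outer)
    inner_range = parse_range(inner)
    if outer_range is None or inner_range is None:
        return False
    return outer_range[0] <= inner_range[0] and inner_range[1] <= outer_range[1]


print(subsumes("between 15 m and 30 m", "between 20 m and 22 m"))
```

Such a rule covers only one phrasing; requirements that express ranges differently would need further patterns.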
Question 3
The third question identifies mutual exclusivity between two actions, detecting contradictory
opposing actions such as
1. the table must be blue
2. the table must not be blue
When one statement is true, the other is false, and vice versa, indicating mutually exclusive
actions. To answer this question, we explored a solution with formal logic and a solution with
GPT-3.
Formal logic approach: First, we identify negation terms like 'not,' 'neither,' 'not equal,' 'none,'
'never,' 'only,' 'nowhere,' or 'without,' which hint at a potential contradiction. Next, we compare
the actions, disregarding negation words, e.g.:
1. must be blue
2. must not be blue
and evaluate their similarity. As in question 1, this is done by comparing the strings. We
achieved the best results with a similarity factor of 90%.
Content of Figure 5-8 (prompt for the first question):
Prompt: "are the words \"{}\" similar to words \"{}\"? Yes or No?".format(variable1, variable2)
Exemplary Response: '\n\nYes'
GPT-3 approach: The GPT-3 prompt is shown in Figure 5-9. This solution yielded better results
than formal logic. Therefore, it was used for the evaluation in Section 5.6. If a similarity is
observed, the answer to the third question is saved as 'yes.'
Figure 5-9: Prompt for the third question
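The formal logic approach to the third question can be sketched as follows. Two points are assumptions, since the text does not specify them: the similarity measure (here `difflib.SequenceMatcher` from the Python standard library) and the rule that exactly one of the two actions must contain a negation term. The term list follows the text; 'not equal' is covered by 'not'.

```python
import re
from difflib import SequenceMatcher

# Negation terms from the text ('not equal' is covered by 'not').
NEGATION_TERMS = ("not", "neither", "none", "never", "only", "nowhere", "without")
_NEGATION_RE = re.compile(r"\b(?:" + "|".join(NEGATION_TERMS) + r")\b", re.IGNORECASE)


def strip_negations(action):
    """Return the action without negation terms and whether any were found."""
    stripped, count = _NEGATION_RE.subn(" ", action)
    return " ".join(stripped.split()).lower(), count > 0


def mutually_exclusive(action1, action2, threshold=0.90):
    """Question 3 (formal logic): same action, negated in exactly one statement."""
    core1, negated1 = strip_negations(action1)
    core2, negated2 = strip_negations(action2)
    # Assumed similarity measure: ratio of matching characters (0.0 to 1.0).
    similarity = SequenceMatcher(None, core1, core2).ratio()
    return negated1 != negated2 and similarity >= threshold


print(mutually_exclusive("must be blue", "must not be blue"))
print(mutually_exclusive("must be blue", "must be red"))
```

The threshold of 0.90 corresponds to the similarity factor reported above; the evaluation in Section 5.6 used the GPT-3 variant instead.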
Question 4
The fourth question assesses mutual inconsistency between two actions, identifying contrary
(not contradictory, see Section 5.3.1) opposing actions like
1. the table is red
2. the table is blue
To answer this question, we evaluate whether the actions have the potential to contradict each
other using GPT-3. If the model output yields a 'yes' as response, the answer to the fourth
question is saved as 'yes', as shown in Figure 5-10.
Figure 5-10: Prompt for the fourth question
Question 5
The fifth question determines the existence of a condition, such as
1. If I am in France, …
to eliminate potential Simplex-contradictions. We apply our previous model (Gärtner et al.
2023) using NLP methods based on grammatical rules, as explained in Section 5.5.1.
Although LLMs might seem suitable for resolving this query, their reliability is questionable, as
shown in Figure 5-11. Obviously, the sentence does not contain a condition; in fact, it is a
simple statement. Consequently, we rely on the grammatical model for greater assurance and
100% replicability.
Figure 5-11: Condition Detection with GPT-3
Content of Figure 5-9:
Prompt: "Are the following sentences similar to each other?:\n- {}\n- {}?".format(action_tokens1, action_tokens2)
Exemplary Response: '\n\nYes, they'
Content of Figure 5-10:
Prompt: "could the following statements potentially contradict each other?\n- \"it must {}\"\n- \"it must {}\"\n\nAnswer 1: ".format(action1, action2)
Exemplary Response: 'Yes, the two statements could potentially contradict each other.'
Content of Figure 5-11:
Prompt: Does the following sentence contain a condition: The car must stand.
Response: 'Yes, the sentence "The car must stand" contains a condition. This condition is that the car must remain in a standing position.'
Question 6
The sixth question identifies if the first condition is equivalent to the second condition to detect
Idem-contradictions that require the same condition, such as
1. If I am in France, (the table must be blue)
2. If I am in France, (the table must be red)
This question is answered with formal logic using string comparisons. If similarity exceeds a
predefined threshold, the answer is 'yes.' We achieved the best results with a threshold of 95%.
Question 7
The seventh question determines if the first and second conditions can co-occur. The objective
of this question is to assess whether two statements have the potential to contradict each other
based on the theoretical possibility of both conditions happening at the same time. For
example, the statements
1. If I am in France, (the table must be blue)
2. If I am in Germany, (the table must be red)
cannot co-occur and, therefore, are not contradicting regarding RE. However, the statements
1. If I am in France, (the table must be blue)
2. If it is hot outside, (the table must be red)
could theoretically contradict each other as they can occur at the same time.
To answer this question, we use GPT-3, as shown in Figure 5-12. If the model output yields a
'yes' as response, the answer to the seventh question is saved as 'yes.'
Figure 5-12: Prompt for the seventh question
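Questions 5 to 7 together determine how the conditions of two requirements relate. The following minimal sketch combines them under stated assumptions: conditions are passed in already extracted (`None` if absent), the string similarity is computed with `difflib.SequenceMatcher` (the text does not name the measure), and `can_co_occur` is a hypothetical stand-in for the GPT-3 query of question 7. The case in which only one requirement has a condition is not covered here.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional


def condition_relation(condition1: Optional[str], condition2: Optional[str],
                       can_co_occur: Callable[[str, str], bool],
                       threshold: float = 0.95) -> Optional[str]:
    """Return 'Simplex', 'Idem', 'Alius', or None if the conditions cannot co-occur."""
    # Question 5: is there a condition at all?
    if condition1 is None and condition2 is None:
        return "Simplex"
    if condition1 is None or condition2 is None:
        raise ValueError("only one condition given: not covered by this sketch")
    # Question 6: are the conditions equivalent (string comparison, 95% threshold)?
    similarity = SequenceMatcher(None, condition1.lower(), condition2.lower()).ratio()
    if similarity >= threshold:
        return "Idem"
    # Question 7: can the two different conditions occur at the same time?
    return "Alius" if can_co_occur(condition1, condition2) else None


def fake_co_occurrence(condition1: str, condition2: str) -> bool:
    """Hypothetical stand-in for the LLM: being in two countries at once is impossible."""
    countries = ("France", "Germany")
    mentioned = {c for c in countries if c in condition1 or c in condition2}
    return len(mentioned) < 2


print(condition_relation("If I am in France", "If it is hot outside", fake_co_occurrence))
```

A result of `None` means that the pair cannot contradict on the basis of its conditions, regardless of the answers to questions 1 to 4.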
5.10 Author Contributions
Conceptualization, A.E.G., and D.G.; methodology, A.E.G.; implementation, A.E.G.; validation,
A.E.G.; resources, A.E.G.; data curation, A.E.G.; writing (original draft preparation), A.E.G.;
writing (review and editing), D.G.; supervision, D.G. All authors have read and agreed to the
published version of the manuscript.
5.11 Data Availability
Dataset 1 is available. Datasets 2 and 3 are partially available, as described in Section 5.6.
The data is available in the attachments.
5.12 Acknowledgments
We thank IAV GmbH for providing us with the requirements specifications for electric buses.
5.13 Competing interests
The authors declare no conflict of interest.
Content of Figure 5-12:
Prompt: "can the following states occur at the same time? Yes or no?:\n1. {} \n2. {}".format(condition1, condition2)
Exemplary Response: '\n\nYes'
Discussion 61
6 Discussion
This dissertation primarily centers on automating an aspect of requirements engineering (RE),
specifically detecting contradictions in requirements. To assess the validity of the findings, this
section compares them to the outcomes of similar studies. Following this, Section 6.2 delves
into the potential development of a software tool named ALICE, which stands for Automated
Logic for the Identification of Contradictions in Engineering. It is a hypothetical software tool
that integrates the insights and discoveries presented in this dissertation, referred to as the
"hybrid model" in the third paper. Two fundamental aspects are explored: firstly, the viability of
developing such a tool, and secondly, its potential utility in the development processes. Finally,
the discussion addresses the method's limitations and reflects on the methodology employed.
6.1 Comparison of the Results with the Literature
Contradiction extraction has been widely utilized in various domains, particularly in processing
news and rumors texts (Lendvai and Reichel; Sepúlveda-Torres et al. 2021a; Marques 2015).
However, directly applying these existing models to the domain of requirements engineering
presents challenges due to the generic type of negation commonly found in standard text. The
language used to articulate requirements in the RE field often exhibits a distinct characteristic,
abounding in mechanical semantics and symbols such as signal names, thus introducing
an added layer of complexity. Furthermore, the sentence structures employed in requirements
documentation can be non-intuitive, further complicating the adaptation of existing models.
Consequently, adapting previous models directly to the RE domain is challenging.
Remarkably, the specific area of analyzing contradictions within requirements engineering has
received limited research attention despite the widespread use of contradiction extraction
techniques in other domains.
Before comparing the literature, it is essential to provide an overview of this dissertation's
research methodology and procedures (see Figure 6-1).
Figure 6-1: Procedure, based on Figure 2-2: DIKW and Contradiction Detection
1. A taxonomy is a systematic and hierarchical classification system that organizes and
categorizes various elements or concepts based on their inherent characteristics and
relationships. This dissertation's taxonomy serves as a structured framework for
classifying different types of contradictions. This hierarchical arrangement allows for a
more comprehensive and structured approach to detecting contradictions. In this
section, taxonomy is referred to as classification in order to discuss it compared to
existing literature.
2. Information represents data. It is often unprocessed and lacks context. Information
becomes meaningful when it is organized, structured, and contextualized. For example,
a list of numbers is Information. When transforming raw data into meaningful
Information, a preprocessing step is essential. Therefore, in subsequent discussions,
this stage will be referred to as 'preprocessing' in order to discuss it compared to
existing literature.
3. Knowledge is derived from information through organizing, analyzing, and
understanding data. It involves the application of reasoning, experience, and expertise
to interpret Information. Since interpretation plays a pivotal role, acquiring Knowledge
will be denoted as 'Interpretation Part 1' in the following sections, which will be explored
and discussed.
4. Wisdom is the highest level of the DIKW pyramid and involves the application of
Knowledge in a way that demonstrates judgment, discernment, and ethical
considerations. It goes beyond knowing facts and includes the ability to make wise and
principled decisions. Wisdom is often built upon a foundation of Knowledge and
experience. As interpretation remains crucial in this context, acquiring Wisdom will be
referred to as 'Interpretation Part 2' in the following sections, which will be examined
and discussed further.
The main technical distinction of this dissertation from most existing methods lies in the
approach employed, which does not rely on training. Instead, a general methodology is
adopted that is not tailored exclusively to a specific objective. The approach combines symbolic
AI and LLMs (large language models). Symbolic AI draws upon principles of logic, grammar,
mathematics, and similarity measurements. Conversely, LLMs undergo training, but it is a one-
time process, and they are generally designed for general tasks rather than specifically for
detecting contradictions. In the case of this research, employing machine learning is
impractical due to the absence of available labeled training data specifically focused on
conflicting requirements.
However, there are rule-based methods documented in the literature that do not require
additional training. While these methods are often tailored to specific datasets, there are
approaches comparable to the one presented in this dissertation, which are dataset-agnostic
but focused on a specific topic. It is most reasonable to consider the medical field as a detailed
point of comparison, as medical texts are also scientific and technical, unlike, for example,
newspaper articles.
The following three studies will be discussed and evaluated as their methodologies align with
this dissertation’s approach, presented in Figure 6-1:
Preum et al. (2017) present Preclude, a semantic rule-based system that identifies
conflicts in wellness advice obtained from online health forums. The system utilizes a
polarity lexicon created from verbs in the training set and their synonyms from WordNet.
WordNet is a lexical database and semantic network that organizes words, including
nouns, verbs, adjectives, and adverbs, into sets of synonyms and interlinks them
through conceptual-semantic and lexical relations. This lexicon enables labeling
actions mentioned in the text as positive or negative, facilitating the detection of
conflicting advice.
Tawfik and Spruit (2018) proposed an automated two-phase contradiction detection
model for medical literature. The model integrates semantic properties and a Learning-
to-Rank framework to identify key findings accurately. It incorporates negation,
antonyms, and similarity measures to detect contradictions.
Sarafraz (2011) introduced a method for extracting conflicting molecular events using
lexical, syntactic, and semantic features. The study explores both rule-based and
machine-learning approaches.
6.1.1 Taxonomy
Preum et al. propose the categorization shown in Table 6-1:
Table 6-1: Possible cases of conflict (Preum et al. 2017)
Cases | Advice 1 | Advice 2
1 Opposite polarity (actions) | Eat citrus fruits and green leafy vegetables as they are rich in Vitamin C. | Be careful about green leafy vegetables if you are on Coumadin or ACE Inhibitors.
2 Opposite polarity (effects) | Pate made from meats may carry the listeria bacteria and cause listeriosis. Avoid eating it while pregnant. | Consume red meat at least two to three times a week to fight anemia.
3 Temporal | Do stretching exercises when you wake up. | Avoid stretching or similar exercises after the end of week 12 of your pregnancy.
4 Conditional | Alcohol may severely affect your baby’s development. Avoid alcohol if pregnant or trying to conceive. | Small amounts of alcohol increase the body's metabolic rate, causing more calories to be burned.
5 Sub-typical | Eat calcium-rich foods like milk, cheese and green vegetables. | Use skimmed milk instead of whole milk as dairy products often cause bloating and gas.
6 Quantitative | Limit your caffeine intake to less than 200 milligrams per day during pregnancy. | Up to 400 milligrams (mg) of caffeine a day appears to be safe for most healthy adults.
7 Cumulative effect | Run for at least 30 minutes a day. | Take Salmeterol 1 inhalation (50 mcg) twice daily.
What Preum et al. call ‘Opposite polarity (effects)’ is the same as what this dissertation calls
‘Contradictories.’ The other cases by Preum et al., apart from ‘Conditionals’, are equal
to what is called ‘Contraries’ in this dissertation. However, the ‘Conditionals’ from Preum et al.
seem to combine the lexical definition of conditions and causes. These are not the same, as
explained in the second paper in section 4.3 Terminology:
‘The strike of a match is a cause of the match lighting. The presence of
oxygen is a condition for the match lighting (Broadbent 2008). Confusion
often arises, as both conditional and causal statements can be introduced by
If …, then: If I strike a match, then the match lights (causal statement), and
If oxygen is present, then the match can light (conditional statement).’
Another distinction is that in this dissertation, contradicting requirements with a condition have
the prefix ‘Alius’ as seen in Table 6-2. This opens the possibility of contradictory and
simultaneously conditional contradictions, i.e., ‘Alius Contradictory’. In Preum et al.’s
taxonomy, this is not clearly defined. For example, the following sentences would be opposite
polarities as well as conditional conflicts: ‘If you are in hospital, avoid eating meat’ and ‘If you
are under 30, consume red meat at least two to three times a week’.
Table 6-2: Possible cases of contradictions.
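Contradictions | Requirement 1 | Requirement 2
Simplex Contradictory | $x \overset{!}{=} k$ | $x \overset{!}{=} \neg k$
Idem Contradictory | $A \rightarrow x \overset{!}{=} k$ | $A \rightarrow x \overset{!}{=} \neg k$
Alius Contradictory | $A \rightarrow x \overset{!}{=} k$ | $B \rightarrow x \overset{!}{=} \neg k$
Simplex Contrary | $x \overset{!}{=} c$ | $x \overset{!}{=} k$
Idem Contrary | $A \rightarrow x \overset{!}{=} c$ | $A \rightarrow x \overset{!}{=} k$
Alius Contrary | $A \rightarrow x \overset{!}{=} c$ | $B \rightarrow x \overset{!}{=} k$
Simplex Subaltern | $x \overset{!}{<} c + k$ | $x \overset{!}{<} c$
Idem Subaltern | $A \rightarrow x \overset{!}{<} c + k$ | $A \rightarrow x \overset{!}{<} c$
Alius Subaltern | $A \rightarrow x \overset{!}{<} c + k$ | $B \rightarrow x \overset{!}{<} c$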
Legend: Contradictory opposites are mutually exhaustive and mutually inconsistent. Contrary opposites are also mutually inconsistent but not
exhaustive. The statement ‘some people are sick’ is the subaltern of ‘everybody is sick’ (superaltern). Simplex (lat. = simple): without conditions.
Idem (lat. = same): with same conditions. Alius (lat. = different): with different conditions.
Sarafraz has no explicit taxonomy but divides the contradictions into one of nine predefined
medical topics. Tawfik’s detection categorizes them as either entailment or contradictory
relations.
6.1.2 Preprocessing and Part-of-Speech Tagging
Preprocessing refers to cleaning and transforming raw data to prepare it for further analysis.
Preum et al., Sarafraz, and Tawfik employed traditional techniques as a preliminary step in their
research.
Following the preprocessing stage, they proceeded with feature extraction and part of speech
tagging. Feature extraction involves extracting meaningful and relevant information from the
preprocessed data to capture syntactic and semantic information. Part of speech tagging is a
specific technique within feature extraction that assigns a part of speech (e.g., noun, verb,
adjective) to each word in a text.
Similarly, this dissertation incorporates preprocessing and feature extraction methods,
indicating that the researchers followed similar steps to prepare their data.
6.1.3 Interpretation Part 1
Preum et al. categorize the sentences as ‘Action Clauses,’ ‘Conditional Clauses,’ ‘Temporal
Clauses,’ and ‘Cause / Effect Clauses.’ They can then be further divided into subcategories
called tokens, as depicted in Figure 6-2(1). Thus, a clause can be expressed as a tuple of
multiple semantic tokens. The detection process of Preum et al. also relies on dependency
relationships, similar to this dissertation’s approach.
Figure 6-2: Different semantic decompositions: (1) - Preum et al. (2017), (2) - dissertation at hand, (3) - Sarafraz
(2011)
In comparison, this dissertation contains similar, partly identical elements. For example, a
sentence by Preum could be decomposed using the semantic objects of Conditional Clauses,
Effects, Actions, and Objects, which is similar to the concept of this dissertation. This is
illustrated in Figure 6-2(2), which shows a typical semantic decomposition from this dissertation.
Sarafraz defines an event representational model, which follows another logic. Here, the
different parts of the clause represent certain events, as seen in Figure 6-2(3). The study uses a
hybrid of machine learning and rule-based methods to extract a set number of events.
Tawfik and Spruit do not break clauses down into constituents or other categories. Instead, they
proceed directly to the contradiction detection of Interpretation Part 2.
6.1.4 Interpretation Part 2
The following critical aspects can be identified for the actual contradiction detection: negation
detection, antonym detection, and implicitly negated formulations. This dissertation will now be
compared to the literature based on these aspects:
1. Negation:
Tawfik and Sarafraz use a model called NegEx (Chapman et al. 2001). NegEx locates
trigger terms indicating a clinical condition is negated or possible and determines which
text falls within the scope of the trigger terms. This is similar to the method used by
Preum et al. and in this dissertation. Here, a predefined list of trigger words is also
employed.
Furthermore, Sarafraz employed ML classifiers (support vector machines) that classify
a sentence as ‘affirmative’ or ‘negative.’ This led to further improvement in the results
of the rule-based method.
2. Antonymy:
Due to the low number of described antonyms in lexicons, Tawfik decided to utilize
additional lexical sources such as WordNet. Preum et al. also use WordNet to detect
opposite words. Sarafraz, however, does not seem to consider antonyms.
In this dissertation, antonym detection was not explicitly performed. By directly applying
large language models (LLMs), these detections are automatically considered since
LLMs can draw such inferences. Furthermore, LLMs do not require maintenance and
offer a one-size-fits-all solution, as seen in the next paragraph. The drawback of LLMs is
their low certainty regarding the correctness of the results. In contrast, the relationships and
associations in lexicons are transparent and verifiable.
3. Implicitly negated formulations:
Implicitly negated formulations were not considered in the literature. Implicit negations
are not to be confused with implicit contradictions. Instead, they refer to statements
such as ‘the signal x should be equal to 1’ and ‘the signal x should be equal to 2.’ The
contradiction arises not from negations or antonyms but from different, conflicting
instructions on handling the signal x.
6.1.5 Conclusion
The comparison of the literature’s methods and this dissertation’s approach reveals a notable
similarity in the basic approach, lending credibility to all the methods. This congruence implies
that the methodology adopted in this work aligns with established practices and demonstrates
its soundness in addressing the research objectives.
It is essential to evaluate the observed differences in the literature with a value-neutral
perspective, recognizing that they all present advantages and disadvantages when compared
to our method. These differences may originate from variations in data sources, domain-
specific considerations, or underlying assumptions. It is worth noting that the implementations
discussed in the literature are tailored to the medical field, limiting their direct applicability to
other domains. Therefore, while the findings and techniques presented in the reviewed
literature contribute to the broader understanding of contradiction detection, they must be
adapted and contextualized for relevance in non-medical contexts.
Examining these approaches serves as a valuable exercise, providing insights into alternative
perspectives and potential areas for improvement. It deepens our understanding of the
challenges and nuances in handling contradicting statements, offering potential avenues for
refining and advancing the current methodology.
6.2 From Method to Mechanism: Envisioning ALICE as a Tool
At the beginning of the discussion, a potential software tool named ALICE was introduced,
incorporating the insights of the hybrid model developed in the third paper. This section looks
closer at the operational workflow of the proof-of-concept and its usability.
6.2.1 Operational Workflow
The swimlane diagram in Figure 6-3 visualizes the proof-of-concept workflow. The aim is to
identify conditions and contradictions within requirements.
Figure 6-3: ALICE Workflow: Identifying conditions and contradictions in requirements engineering
The process starts in the Main lane by selecting the dataset for analysis and configuring the
global parameters, class definitions, and any additional parameters the script will require.
Moving to the Preprocessing lane, data reduction is the first step to cleaning the data by
replacing or eliminating special characters with their corresponding expressions, ensuring that
the parsers can effectively process the text. Project-specific preprocessing may follow, dealing
with unique terms or structures within the specific dataset.
The Text Parsing lane is where the cleaned text is submitted to parsers, such as Stanford's
Stanza parser, to generate dependency trees and perform part-of-speech tagging. These
parsers require well-structured sentences to interpret and tag the text accurately.
Next, the Action & Variable lane is where the structure of sentences is examined to identify
main verbs or roots. If Stanza fails to find the root, the workflow switches to Stanford’s CoreNLP
parser for an alternate parsing strategy.
Upon successfully identifying the root, the process seeks out sub-clauses and adverbial
clauses that indicate conditions within the sentences in the Condition Detection lane.
The second paper (Section 4) only aimed at finding conditions. The process may conclude at
this juncture unless contradiction detection is also desired. In the latter case, the identified
conditions are then examined in the Action/Variable lane to extract actions and variables
within both conditions and effects.
Finally, the Contradiction Detection lane involves evaluating the constituents (conditions,
effects, actions, and variables) to detect contradictions. This evaluation is the culmination of
the process, where the analysis is completed, and results indicating the presence or absence
of contradictions are produced.
This workflow is marked by its modularity, with individual stages operating independently and
synergistically. Moreover, an iterative parsing methodology is employed, wherein a series of
parsers are utilized in sequence. This redundancy is a strategic measure to bolster the
robustness of the text analysis, mitigating any single parser's limitations.
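The fallback idea described above can be sketched as a chain of parsers that is tried in order. This is only a minimal illustration: the `RootFinder` interface, `find_root`, and the two stand-in parsers are hypothetical names, and the stand-ins do not call the real Stanza or CoreNLP APIs.

```python
from typing import Callable, Optional, Sequence

# A parser is modeled as a callable that returns the root (main verb) of a
# sentence, or None if it cannot find one. Real parsers such as Stanza or
# CoreNLP would have to be wrapped behind this hypothetical interface.
RootFinder = Callable[[str], Optional[str]]


def find_root(sentence: str, parsers: Sequence[RootFinder]) -> Optional[str]:
    """Try each parser in order and return the first root that is found."""
    for parse in parsers:
        try:
            root = parse(sentence)
        except Exception:
            # A failing parser should not abort the run; try the next one.
            continue
        if root:
            return root
    return None


# Stand-in parsers for illustration only.
def primary_parser(sentence: str) -> Optional[str]:
    return None  # pretend the first parser found no root


def fallback_parser(sentence: str) -> Optional[str]:
    # Toy rule: the word following a modal verb is treated as the root.
    words = sentence.rstrip(".").split()
    for i, word in enumerate(words[:-1]):
        if word.lower() in {"must", "shall", "should"}:
            return words[i + 1]
    return None


root = find_root("The controller must limit the speed decay.",
                 [primary_parser, fallback_parser])
print(root)  # limit
```

Because each parser sits behind the same interface, further parsers could be appended to the list without changing the calling code.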
6.2.2 Usability and Integration
The following section attempts to answer three questions that should provide information about
the implementation of ALICE: (1) how the requirements for the algorithm are provided, (2)
computing times, and (3) integration possibilities in classic requirements management tools.
Question 1: How must the requirements be passed to ALICE?
Transferring requirements to ALICE takes three steps, as shown in Figure 6-4.
Figure 6-4: ALICE's structure (Gärtner and Göhlich 2023)
Step 1: Preprocessing
Preprocessing uses established methods from Natural Language Processing (NLP). Images
and non-textual content should be deleted or replaced with appropriate textual descriptions.
Preprocessing also includes identifying and cleaning up special symbols and characters that
cannot be interpreted. Although most Large Language Models (LLMs) should recognize
symbols such as equal signs, it is recommended to replace these symbols with plain textual
phrases, for example, "=" with "is equal to." This minimizes the possible loss of unknown
characters when transferring the requirements and ensures compatibility with classic text parsers
(see next step), which usually only handle plain alphabetic characters reliably.
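A minimal sketch of such a symbol replacement is given below. Only the mapping of "=" to "is equal to" comes from the text; the other table entries and the function name `replace_symbols` are assumptions chosen for illustration.

```python
import re

# Hypothetical replacement table; only the "=" entry is taken from the text.
SYMBOL_PHRASES = {
    "=": "is equal to",
    "<": "is less than",
    ">": "is greater than",
    "%": "percent",
}


def replace_symbols(requirement: str) -> str:
    """Replace selected symbols with plain-text phrases and tidy whitespace."""
    pattern = re.compile("|".join(re.escape(s) for s in SYMBOL_PHRASES))
    # Pad each phrase with spaces so it does not fuse with adjacent tokens.
    text = pattern.sub(lambda m: f" {SYMBOL_PHRASES[m.group(0)]} ", requirement)
    # Collapse the extra whitespace introduced by the padding.
    return re.sub(r"\s+", " ", text).strip()


print(replace_symbols("The voltage = 12 V."))  # The voltage is equal to 12 V.
```

Multi-character operators such as ">=" are not covered by this table and would need their own entries, matched before the single-character ones.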
Step 2: Constituent analysis
The second step is to analyze and prepare requirements in text form. This involves checking
whether the requirements only have an effect or consist of a condition and an effect. After this
identification, the action and the variable are identified for the effect and the condition (if any).
This is done, among other things, with the help of text parsers. The requirements were
extracted from a *.txt document in the second paper and analyzed line by line. In theory,
however, the analysis is independent of the origin of the requirements, whether from a text
document, an Excel spreadsheet, or other sources, as long as ALICE can recognize where
one requirement ends and the next begins.
Step 3: Contradiction detection
In the third step, the actual contradiction detection takes place, in which all requirements
concerning their properties are compared, as determined in step 2. This is done according to
the decision tree presented in the third paper. In the validation of the paper, this procedure
was implemented using a combination of formal logic and LLMs. All requirements were stored
together with their properties in a list-like structure. All unique combinations of requirements
were then generated and analyzed individually according to the decision tree. The results of
these comparisons can be saved, for example, in an Excel spreadsheet. This procedure can
lead to many comparisons if there are many requirements, with each pair of requirements
being checked. For example, if there are 20 requirements, $\frac{20(20-1)}{2} = 190$ combinations
must be analyzed one by one.
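The pair count can be reproduced with Python's standard library. The sketch below uses placeholder requirement identifiers (`REQ-1` to `REQ-20`), which are assumptions and not taken from the evaluated specifications.

```python
from itertools import combinations
from math import comb


def requirement_pairs(requirements):
    """Return every unique, unordered pair of requirements."""
    return list(combinations(requirements, 2))


# Placeholder identifiers standing in for 20 real requirements.
requirements = [f"REQ-{i}" for i in range(1, 21)]
pairs = requirement_pairs(requirements)

print(len(pairs))   # 190
print(comb(20, 2))  # 190, the closed form n(n-1)/2
```

Since the number of pairs grows quadratically with the number of requirements, doubling the size of a specification roughly quadruples the number of comparisons.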
Question 2: What is the average computation time?
The algorithm is implemented in Python, focusing on the results' correctness rather than
maximizing computational speed. During the analysis, unprocessable characters may occur
(see question 1, step 1), resulting in a program abort and increasing the overall computing time.
Also, the OpenAI servers may be responsible for a program abort since a connection to OpenAI
must be established, and the servers may be overloaded depending on the general request volume.
Without aborts, the computing time varies depending on the structure of the requirements. In the
case of a simple requirement without conditions, the analysis is performed quickly. It is particularly
accelerated if the decision tree's first question already hints that the requirement pair is not
conflicting.
A complete analysis of a pair of requirements can sometimes take more than 12 seconds,
especially if all seven questions have to be calculated. However, this case does not occur very
often in practice. Usually, the analysis time ranges between 3 and 5 seconds.
For larger data sets of, for example, 4000 combinations, corresponding to about 90
requirements, the analysis could theoretically take up to 13 hours. However, a more realistic
estimate is between 3 and 5 hours, as most pairs do not require further analysis beyond the first
question of the decision tree, as was the case for the specification sheets used in the third
paper.
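The arithmetic behind these estimates can be checked with a few lines of Python. The function name `estimated_hours` is an assumption; the per-pair times of 12, 3, and 5 seconds and the figure of 4000 pairs are the ones stated above.

```python
from math import comb


def estimated_hours(n_pairs: int, seconds_per_pair: float) -> float:
    """Sequential analysis time in hours for a given number of pairs."""
    return n_pairs * seconds_per_pair / 3600


# About 90 requirements yield roughly 4000 unique pairs.
print(comb(90, 2))  # 4005

worst_case = estimated_hours(4000, 12)    # every pair needs the full analysis
typical_low = estimated_hours(4000, 3)    # lower end of the usual range
typical_high = estimated_hours(4000, 5)   # upper end of the usual range

print(round(worst_case, 1), round(typical_low, 1), round(typical_high, 1))
```

The worst case comes out at roughly 13.3 hours, and the typical range at roughly 3.3 to 5.6 hours, which is consistent with the rounded figures given in the text.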
The research was conducted using conventional, single-threaded hardware configurations.
However, ALICE's calculations are well-suited for optimization, especially parallel processing.
Parallelization refers to breaking down a computational task into smaller, independent parts
that can be executed simultaneously on multiple processing units or cores, thereby speeding
up the overall computation.
As explained earlier in this section, each requirement pair is analyzed separately. The pairs
are individual tasks that do not need to be executed strictly one after the other. This suggests that
parallelization techniques hold substantial promise for reducing computational time.
The extent to which parallelization can speed up the entire process depends on several factors,
including the number of processing units available, the efficiency of the parallelization strategy,
and the specific computational tasks being performed. In ALICE's case, it would lead to
substantial speed improvements, potentially reducing analysis times from hours to minutes or
even seconds. However, the exact gains must be assessed through practical implementation
and testing.
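Since the pairs are independent, one possible parallelization is a worker pool that processes several pairs at once. The sketch below is not the implementation used in the papers: `analyze_pair` is a hypothetical placeholder for the per-pair analysis, and the choice of threads assumes that much of the per-pair time is spent waiting for a remote language model, which would have to be confirmed by measurement.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def analyze_pair(pair):
    """Hypothetical placeholder for the per-pair decision-tree analysis."""
    first, second = pair
    # A real implementation would evaluate the decision tree here.
    return (first, second, False)


def analyze_all(requirements, max_workers=8):
    """Analyze all unique requirement pairs using a pool of worker threads."""
    pairs = list(combinations(requirements, 2))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns the results in the same order as the input pairs.
        return list(pool.map(analyze_pair, pairs))


results = analyze_all(["REQ-A", "REQ-B", "REQ-C"])
print(len(results))  # 3
```

In practice, the achievable speed-up would also be bounded by external factors such as rate limits of the language model service.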
Question 3: How could an integration of ALICE into classic RM tools be achieved?
The answer to this question leads to two approaches:
The first approach is to develop a stand-alone application that provides interfaces to various
RM tools. This approach offers high flexibility, as ALICE would be a versatile solution for
projects with varying RM tools. For example, it could be used with DOORS and Codebeamer
without significantly customizing it.
However, there are some drawbacks to this approach. Developing interfaces to different RM
tools can be challenging, as each tool may have different APIs and data structures. Even if a
standard import/export format such as ReqIF is used, manual requirements transfer between
tools remains error-prone. Another disadvantage is certification. In highly regulated industries
such as automotive or aerospace, the certification of a tool such as ALICE is difficult:
companies usually only allow validated and thoroughly tested software products. Therefore,
independent, new market entrants might have difficulties due to high regulatory hurdles and
company guidelines.
The second approach is to develop ALICE as a plugin for existing RM tools. This option offers
the advantage of integrating it into certified RM tools, bypassing the certification challenge: if
a plugin is developed for an already certified tool, it typically does not require separate
certification.
Another advantage of this option is that integration with the RM tool allows for seamless work
with requirements, as users do not have to switch between different applications and manually
transfer requirements. Also, such a plugin can adopt the design and user interface of the RM
tool, which makes it more intuitive for users.
Overall, both approaches have pros and cons, and the choice between them depends on
several factors, including the specific requirements and challenges of the target project.
Careful consideration of these aspects is critical to choosing the best possible integration
strategy into RM tools.
6.3 Development Processes Integration
The following section reviews the product requirements specification process (PRS process)
and relevant standards and guidelines to discuss how ALICE could be integrated into
development processes.
6.3.1 Product Requirements Specification Process
In the industrial context, a well-established procedure for translating customer requirements
into technical product specifications has gained broad acceptance, known as the product
requirements specification process (Göhlich and Fay 2021a); see Figure 6-5. In some projects,
development involves multiple parties, including clients and contractors. When clients place
orders, they typically provide requirements specifications to the contractor, which address the
project's general goal. The contractor then formulates a specification describing
how these requirements are to be realized. Clients often compile their
requirements by drawing from various specifications from past projects (Bender and Gericke
2021). Consequently, requirements from different, partially incompatible projects may be
merged, resulting in contradictions. Therefore, it is essential to check for completeness and
consistency.
According to current practice, expertise with the product, the application area, and other
relevant framework conditions is required to work in this context. On the one hand, the client
could use ALICE to scan the initial specification before handing it over to the contractor to
improve the document's quality and reduce unnecessary confusion on the contractor's side.
On the other hand, the contractor could also use the software to get a quick overview of the
complexity of the specification, as more contradictions would hint at a more complex project.
Figure 6-5: Product Requirements Specification Process based on (Bender 2020)
However, it is important to note that fully identifying conflicting goals can be challenging at this
preliminary stage. In later product development phases described in the following section, the
development goals are elaborated in more detail, and contradictions typically become
transparent (Göhlich and Fay 2021b).
6.3.2 Process Models and Guidelines for Product Development
According to Bender and Gericke (2021), each development project is bound to three targets:
deadline targets, cost targets, and product-related targets. This dissertation only focuses on
product-related targets, which we will refer to exclusively in the following. However, it is
conceivable that future methods developed with the help of our findings will also be able to
evaluate cost- and schedule-related targets.
Various design process models exist to tackle these targets, aimed at supporting engineers in
planning, documenting, finding solutions, and decision-making within their work. Despite
variations in terminology and detail, these models share typical stages in the design process
(Gericke and Blessing 2012; Wynn and Clarkson 2018).
One crucial phase common to these models is the early development stage, where the design
task is clarified, creating a requirements list, sometimes referred to as requirements
specification (Eisenbart et al. 2011; Gericke and Blessing 2012). This list encompasses
essential functionality, influences, constraints, and dependencies derived from stakeholder
demands, market conditions, and other factors. While various methods and checklists aid in
identifying requirements, they often prioritize completeness over integrity, making it
challenging to assess consistency and conflicting design goals due to limited solution
principles or details (Göhlich et al. 2021).
Many design process models suggest that this task is complete once the initial requirements
list is formed, despite emphasizing the need for continuous revision and refinement (Göhlich
et al. 2021). Requirements are dynamic and evolve alongside understanding the design
problem, necessitating ongoing engagement with the requirements list throughout the design
process (Maher and Poon 1996; Gericke et al. 2013).
The diagram, as depicted in Figure 6-6, presents a structured yet adaptable approach to the
design process, informed by the revised VDI 2221 guideline (VDI 2221 Blatt 1:2019-11)
and integrated into the updated Pahl/Beitz method, as noted by Bender and Gericke (2021).
The process model is categorized into four phases: Task Clarification, Conceptual Design,
Embodiment Design, and Detail Design. Each phase serves as a fundamental milestone in the
design journey. However, while these phases remain constant, the intrinsic activities within
each phase demand a more dynamic approach. These activities are not set in stone; instead,
they necessitate individual assessment and potential adaptation based on the specific
circumstances and requirements of a given company or design team.
By allowing this kind of adaptability in the activities while maintaining the integrity of the
overarching phases, the model ensures that the essence of the design process remains intact.
Figure 6-6: Process model according to VDI 2221 (Göhlich et al. 2021)
In the Task Clarification phase, the starting point of the design process, the project's objectives
or problems are clearly outlined. However, the requirements are often not detailed or
thoroughly fleshed out during this early stage. They might be broad, general, or even just
described through use cases. This suggests that while there is a basic understanding of what
needs to be achieved, the specifics, nuances, or detailed criteria might not have been
determined yet, making an application of ALICE difficult.
The result of the phases of Conceptual Design and, to some extent, the Embodiment Design
can be summarized as the initial target system. It serves all stakeholders in all development
phases as a benchmark for assessing the project's success (Bender and Gericke 2021).
Consistency and conflicting design goals can be challenging to assess due to missing solution
principles or details at this stage (Göhlich et al. 2021). While a general contradiction tool would
be beneficial, ALICE primarily focuses on identifying contradictions within technical
specifications.
The Detailed Design phase elaborates on the product's technical specifications, materials,
components, and manufacturing processes (DIN ISO 9000:2015). Although our tool
cannot enhance drawings and schematics, it can uncover contradictions in the specifications,
potentially saving manual work. It may even detect contradictions in material and component
specifications, although this aspect was not examined in this dissertation.
This phase can be further deconstructed into distinct categories. This subdivision yields two
primary categories: Developing Requirements (also referred to as requirements engineering)
and Working with Requirements (also referred to as requirements management), each of
which can be further delineated into specific activities, as shown in Figure 6-7, according to
Bender and Gericke (2021).
Figure 6-7: Tasks when developing requirements (own illustration based on Bender and Gericke (2021))
This dissertation focuses on Requirements Engineering and its Elicitation, Specification,
Structuring, and Analysis categories. They represent the critical phases where ALICE is most
effectively applied to identify and resolve contradictions in requirements. While Requirements
Engineering forms the cornerstone for utilizing ALICE, the tool also significantly influences
Requirements Management. The integration of ALICE necessitates robust documentation to
ensure a documented state of requirements is available for analysis. Additionally, ALICE
profoundly impacts traceability by precisely tracking requirement origins and their relationships
throughout the project lifecycle, thus ensuring that any changes made due to contradiction
detection are accurately reflected and traceable. In terms of versioning, ALICE facilitates the
management of different versions of requirements documents, helping teams maintain a clear
record of how requirements evolve in response to identified contradictions. Lastly, ALICE
affects change management by enhancing the responsiveness to necessary adjustments in
requirements, enabling quicker and more informed decision-making. This ensures that all
modifications are thoroughly documented and integrated into the project workflow, aligning
with the best practices in Requirements Management and supporting continuous improvement
in project outcomes.
In the following sections, we will delve into the four constituent activities of Requirements
Engineering and their relevance to ALICE:
1. Elicitation: The tool could augment the requirement elicitation process by proactively
identifying contradictions and inconsistencies early in development. This proactive
detection aids in more precise and comprehensive requirement elicitation. However,
findings obtained during the elicitation are often documented informally and not in clean
requirements. It would, therefore, be necessary to check the elicitation results before
deciding on the application of the tool.
2. Specification: By identifying and resolving contradictions, the tool promises to improve
requirement specification, ensuring that requirements are unambiguous, concise, and
devoid of conflicts, thereby enhancing their quality. Nevertheless, specifying
requirements is often a dynamic process in which they are subject to constant
correction and editing. Since the specification continues to undergo regular changes,
the tool's implementation might be premature at this stage.
3. Structuring: Applying the tool in this context could facilitate more effective requirements
structuring. By clarifying conflicting facets and supporting their reorganization into a
coherent structure, the tool could assist in achieving improved requirement
organization. However, the tool's results would not directly lead to automated structural
insights. Therefore, it is not immediately applicable in this step. Further research might
be valuable.
4. Analysis: Among the various activities in requirement development, the Analysis phase
emerges as a natural and highly relevant domain for applying the software tool. This
alignment is especially evident as it involves a detailed analysis of requirements to
detect contradictions. The tool's analytical capabilities are specifically designed to
uncover contradictions within requirements. Examining the requirement set offers a
systematic means to pinpoint these issues in the development process. This proactive
identification serves as the foundation for resolving contradictions, ensuring that the
requirement set remains coherent, feasible, and aligned with project objectives.
Another important aspect during the Detailed Design phase is that complex development
projects are mainly carried out in interdisciplinary collaboration within cooperation networks.
Typically, the requirements for each level are managed in different documents for the overall
product in product specification books and the subsystems in component specification books
(Göhlich and Fay 2021b). The partial results must be combined according to their logical and
temporal dependencies to form the overall solution. 'For complex systems in particular, this
process involves a risk of error with regard to the consistency of the goals among themselves
and thus also for the consistency […] of the overall solution.' (Bender and Gericke 2021).
ALICE could facilitate this process by identifying potential conflicts across multiple
specifications and assisting engineers in recognizing cross-references spanning various
domains, thereby promoting effective
Changes to the requirements specification close to the implementation should be avoided, as
the complexity of the overall system and their interactions with each other can result in
disproportionately high change costs and efforts in all areas of the company (Giffin et al. 2009;
Ehrlenspiel and Meerkamm 2017; Göhlich et al. 2021). Nevertheless, so-called Change
Requests (Lashin 2021) cannot always be avoided due to unexpected changes in objectives
or new technical findings. ALICE has not been validated for performing impact analyses to
identify potential issues arising from changes due to contradictions. However, it represents a
step towards that objective: Hein et al. (2021) use machine learning to categorize requirements
based on their change behavior and aim to provide insights into managing requirement
changes effectively. Classifying contradicting requirements could be another criterion for Hein
et al.'s ML algorithm.
After discussing when the method can be applied in the development process, it is equally
important to explore how it can be employed within specific steps. Three potential methods are
described below:
1. Real-time contradiction detection during requirement authoring: ALICE could conduct
real-time contradiction checks as requirements are authored. This functionality would
ensure that conflicting statements are promptly identified and resolved during the
requirement formulation process.
2. Periodic or triggered specification checks: ALICE could periodically perform
comprehensive checks on the entire specification, such as after each baseline or at
predefined project milestones. This approach would ensure a systematic review of
requirements to identify and address contradictions or inconsistencies at strategic
points in the development process.
3. Automated checks: ALICE's utility could extend to automated processes like nightly
checks. In this context, automated checks provide continuous monitoring of evolving
requirements. This proactive approach aids in the timely detection and correction of
contradictions, even when manual review processes are not actively engaged.
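For real-time checks during authoring, one conceivable simplification (an assumption, not something examined in this dissertation) is to compare only the newly written requirement against the existing ones instead of re-checking every pair. In the sketch below, `check_new_requirement` and `toy_contradicts` are hypothetical names, and the toy predicate merely stands in for the actual contradiction analysis.

```python
from typing import Callable, List


def check_new_requirement(existing: List[str], new: str,
                          contradicts: Callable[[str, str], bool]) -> List[str]:
    """Return the existing requirements that conflict with the new one."""
    # Only len(existing) comparisons are needed, not all unique pairs.
    return [old for old in existing if contradicts(old, new)]


def toy_contradicts(a: str, b: str) -> bool:
    # Toy stand-in: flags sentences that differ only by the word "not".
    return a != b and a.replace(" not", "") == b.replace(" not", "")


existing = ["The valve must open.", "The pump must run."]
conflicts = check_new_requirement(existing, "The valve must not open.",
                                  toy_contradicts)
print(conflicts)  # ['The valve must open.']
```

Periodic or nightly runs, by contrast, would typically re-check the full set of pairs.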
In summary, ALICE appears most suitable for application in the Detailed Design phase,
specifically during the Analysis step. The choice of the specific method should be determined
based on the particular project at hand.
6.4 Limitations and threats to validity
The following sections will address the constraints and potential sources of error that should
be considered when interpreting this dissertation's findings.
6.4.1 Limitations of the Methodology
Due to the intricate nature of natural language requirements, a trade-off had to be made
between effective contradiction detection and the broad applicability of the methodology to any
requirement. Consequently, the methodology is restricted to requirements written in a
controlled natural language. The specifics are elaborated upon below (Dick 2017; Sophist
GmbH 2016; Wiegers and Beatty 2013; Robertson and Robertson 2013):
Utilizing single-sentence requirements that adhere to the principle of atomicity is compulsory
for this methodology. These requirements cannot be further divided or decomposed into
smaller components. The principle of atomicity is widely recognized as a best practice in
requirements engineering. It ensures that requirements remain concise, unambiguous, and
focused on specific functionalities or characteristics of the system being developed.
The presence of filler words in natural language processing tasks can introduce errors. For
example, consider the sentence ‘Amy and I both have to fight him’ where the term ‘both’ is
treated as a variable. If the to-be-compared sentence is ‘Neither Amy nor I have fought against
him,’ the variables would be identified as ‘Amy’ and ‘I.’ Therefore, no contradiction would be
detected, as the variables of both requirements do not match.
According to standard guidelines, passive sentence constructions are prone to
misinterpretation and must be avoided to ensure high accuracy in this method (VDA 2007;
Sophist GmbH 2016). For instance, ‘There must be no manipulation of Com_Batt’ is better
expressed as ‘The function must prevent manipulation of Com_Batt.’ Using active sentence
structures enhances clarity and enables the AI to identify the variable performing the action.
Furthermore, incorrect placement of commas can lead to erroneous condition detection,
resulting in inaccurate identification of variables and actions. Therefore, ensuring proper
comma placement in technical writing is crucial to enhance clarity and precision. Failure to do
so may introduce flawed interpretations and subsequent errors. Complex sentence structures
must be avoided, and accurate comma placement should be verified.
All these considerations emphasize the importance of controlling natural language as much as
possible. For example, by utilizing standard sentence templates for requirements, it can be
ensured that the set limits of the methodology are not exceeded. Such templates are commonly
used in various fields and sectors where hardware, software, or system development occurs.
This approach helps maintain consistency, reduces the likelihood of errors introduced by
linguistic nuances, and facilitates accurate condition detection and variable/action extraction.
Finally, it is advisable to minimize the creation of neologisms in technical writing. If their use is
unavoidable, they should be added to a dictionary so that they are not flagged as
out-of-vocabulary terms during the analysis, as explained in the third paper.
6.4.2 Threats to Validity
Further investigation and comparison with additional data sets are necessary to account for
contextual factors such as the domain, development process, requirements engineering
techniques, and technologies used. These additional studies would help ensure a
comprehensive understanding and broader applicability of the methodology.
Given the limited availability of a sizable and labeled dataset on which the approach was
initially devised, there is a concern regarding the potential for overfitting the methodology to
the data. Considering this limitation and striving for a more diverse and representative dataset
in future studies is essential.
7 Summary and Outlook
The following section will concisely recap the key findings and offer insights into future
directions and potential developments.
7.1 Summary
In conclusion, this research focused on automating requirements engineering (RE) by
detecting contradicting requirements.
The first paper proposes a formal logic-based method to identify and classify contradictions in
requirements documents. In contrast to other papers found in the literature, contradictions
were not classified according to a specific, available data set but rather according to a generally
accepted, well-tested systematic model. Subsequently, a classification tailored to requirements
engineering was developed to identify specific types of contradictions using straightforward
questions.
This method identified 49 contradictions out of around 3,500 requirements. According to the
Aristotelian Law of non-contradiction (LNC), three main contradiction types were defined:
Contradictories, Contraries, and Subalterns, each of which was subdivided into Simplex,
Idem, and Alius. A set of questions, referred to as building blocks, needed to be answered to
detect the specific category. Most of the identified contradictions were Alius Contraries and
were found at the detailed design requirements level. This aligns with expectations, as higher-
level requirements are often less concrete and describe the general functionality of the product,
resulting in fewer opportunities for contradictions to occur, as explained in Section 6.3. It is
important to note that these were not considered in this research.
The method developed in the first paper serves as a classifier, demonstrating its potential to
support manual contradiction detection and hint at future automated identification of
contradictions. The method requires the requirements to be classified into condition, effect,
variable, and action. In the following formalization, these constituents should be replaced by
symbols and formulas. Once the formalization is completed, answering the questions allows
for identifying specific contradictions.
By applying this method, requirements reviewers can recognize contradictions without
requiring extensive familiarity with requirements or the document’s subject matter. Apart from
the implication for automated contradiction detection, implications can also be derived for
automated quality analysis. The classification of different contradiction types could play a vital
role in quantifying the quality of requirements documents. Assessing the criticality of
contradictions based on this classification could facilitate the determination of meaningful key
performance indicators.
The second paper focuses on automatically detecting conditional sentences in requirements,
providing the solution for one of the building blocks of the method from the first paper. It is
important to note that conditional statements do not imply causality. For instance, the presence
of wheels on a car is a condition for it to move, while exerting torque on them is the cause.
The resulting movement is referred to as the effect.
The study found that grammatical models incorporating grammatical rules, trigger words, and
Part-Of-Speech tags are more suitable for identifying conditions than machine learning
methods. Another aspect addressed in this paper is the identification of verbal expressions
associated with the condition and the effect: variable and action. These are called constituents,
which can be individual words or groups of words that function as a single unit. To illustrate,
consider the sentence, ‘If the threshold is reached, the controller must limit the speed decay.’
Here, the condition is: ‘If the threshold is reached.’ The effect is: ‘the controller must limit the
speed decay.’ In the condition, the variable is ‘the threshold,’ and the action is ‘is reached.’ In
the effect, the variable is ‘the controller,’ and the action is ‘must limit the speed decay.’ In other
words, the variable represents the central entity involved, while the action denotes what is
expected to occur.
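The split into condition and effect in this example can be illustrated with a strongly simplified sketch that relies on a leading trigger word and the first comma only. It is not the grammatical model from the second paper, which also uses grammatical rules and Part-Of-Speech tags; the trigger word list and the function name `split_condition_effect` are assumptions.

```python
from typing import Optional, Tuple

# Hypothetical trigger words; the real model uses a richer rule set.
TRIGGER_WORDS = ("if", "when", "while", "as soon as")


def split_condition_effect(requirement: str) -> Tuple[Optional[str], str]:
    """Return (condition, effect); condition is None without a leading trigger."""
    text = requirement.strip()
    lowered = text.lower()
    for trigger in TRIGGER_WORDS:
        # Only sentences that start with a trigger and contain a comma are split.
        if lowered.startswith(trigger + " ") and "," in text:
            condition, effect = text.split(",", 1)
            return condition.strip(), effect.strip()
    return None, text


condition, effect = split_condition_effect(
    "If the threshold is reached, the controller must limit the speed decay."
)
print(condition)  # If the threshold is reached
print(effect)     # the controller must limit the speed decay.
```

The sketch also shows why comma placement matters: a missing or misplaced comma would shift the boundary between condition and effect.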
A sample of 1,861 requirements was evaluated, and the performance of a grammatical model
was compared against that of two machine learning models. The findings suggest that the
grammatical model is more effective for identifying conditional requirements and their
constituents. Moreover, labeling datasets for machine learning training can be laborious and
susceptible to human errors.
The third paper completes the proposed method using formal logic and large language models
(LLMs) for fully automated contradiction detection. It is presented as a decision tree based on
the building blocks developed in the first paper:
1. Question 1: This question determines whether the variables of effect one and effect
two are identical or if one set of variables is a subset of the other, establishing the
potential for an LNC contradiction. Formal logic was used to answer this question.
2. Question 2: This question examines the inclusivity between actions, determining if one
condition subsumes the other. However, this question was not processed in the
implementation due to limitations in the mathematical abilities of the current language
models.
3. Question 3: This question identifies mutual exclusivity between two actions, detecting
contradictory opposing actions. Formal logic and GPT-3 were used to assess similarity
and determine if the actions were mutually exclusive.
4. Question 4: The fourth question assesses mutual inconsistency between two actions,
identifying contrary opposing actions. GPT-3 was utilized to determine if the actions
have the potential to contradict each other.
5. Question 5: This question determines the existence of a condition, aiming to eliminate
potential Simplex-contradictions. The model of the second paper was employed for this
condition detection.
6. Question 6: The sixth question identifies if the first condition is equivalent to the second
one to detect Idem-contradictions requiring the same condition. Formal logic and string
comparison were utilized to assess similarity.
7. Question 7: The seventh question determines if the first and second conditions can
co-occur. GPT-3 was used to assess whether both conditions can theoretically occur at
the same time.
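The seven questions can be read as a chain of checks on a pair of requirements. The following Python sketch is illustrative only: the branching order, the `Requirement` fields, the result labels, and all four predicate functions are assumptions made for this example and are not taken from the ALICE implementation. The toy predicates merely stand in for the formal-logic and GPT-3 components, and Question 2 is skipped, as in the implementation described above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Requirement:
    """Hypothetical building blocks of one requirement."""
    variable: str
    action: str
    condition: Optional[str] = None


Check = Callable[[Requirement, Requirement], bool]


def classify_pair(a: Requirement, b: Requirement,
                  same_variable: Check, actions_oppose: Check,
                  conditions_equal: Check, conditions_can_cooccur: Check) -> str:
    """Walk an assumed question order and return a contradiction label."""
    if not same_variable(a, b):                       # Question 1
        return "none"
    if not actions_oppose(a, b):                      # Questions 3 and 4
        return "none"
    if a.condition is None and b.condition is None:   # Question 5
        return "simplex"
    if conditions_equal(a, b):                        # Question 6
        return "idem"
    if conditions_can_cooccur(a, b):                  # Question 7
        return "alius"
    return "none"


# Toy stand-ins (assumptions); a real system would use formal logic and an LLM.
OPPOSITES = {frozenset({"open", "close"}), frozenset({"activate", "deactivate"})}


def same_variable(a: Requirement, b: Requirement) -> bool:
    return a.variable.lower() == b.variable.lower()


def actions_oppose(a: Requirement, b: Requirement) -> bool:
    return frozenset({a.action.lower(), b.action.lower()}) in OPPOSITES


def conditions_equal(a: Requirement, b: Requirement) -> bool:
    return a.condition is not None and a.condition == b.condition


def conditions_can_cooccur(a: Requirement, b: Requirement) -> bool:
    return True  # placeholder for an LLM-based judgement


CHECKS = (same_variable, actions_oppose, conditions_equal, conditions_can_cooccur)
r1 = Requirement("valve", "open", "pressure exceeds 5 bar")
r2 = Requirement("valve", "close", "pressure exceeds 5 bar")
print(classify_pair(r1, r2, *CHECKS))
```

Passing the predicates as arguments keeps the sketch modular: each question can be swapped for a stronger implementation without touching the chain itself.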
This method is named ALICE (Automated Logic for the Identification of Contradictions in
Engineering). The study compared the performance of two methods: the hybrid method and a
purely LLM-based method. Results showed that ALICE outperformed the LLM-only method in
terms of accuracy and recall. ALICE achieved an accuracy of 99% and a recall of 60%, while
the LLM-only method showed an accuracy of 97% and a recall of 0%. This demonstrates
that relying solely on LLMs is not feasible for detecting contradictions in requirements
engineering.
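A high accuracy combined with a recall of 0% is typical for strongly imbalanced data, where contradictory pairs are rare. The short sketch below illustrates this with the standard definitions of accuracy and recall. The confusion-matrix counts are invented for illustration and are not the counts from the study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all pairs that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp: int, fn: int) -> float:
    """Share of actual contradictions that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical data set: 1,000 requirement pairs, 30 of them contradictory.
# A detector that never reports a contradiction misses all 30 ...
never_flags = dict(tp=0, tn=970, fp=0, fn=30)
# ... while a detector that finds 18 of the 30 misses only 12.
finds_some = dict(tp=18, tn=970, fp=0, fn=12)

for name, m in (("never flags", never_flags), ("finds some", finds_some)):
    acc = accuracy(**m)
    rec = recall(m["tp"], m["fn"])
    print(f"{name}: accuracy={acc:.1%}, recall={rec:.1%}")
```

Because the non-contradictory pairs dominate, accuracy alone hardly separates the two detectors, whereas recall does.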
The study also identified the limitations of LLMs and symbolic AI in detecting contradictions in
technical documents. LLMs may produce errors when they lack sufficient data to analyze and
understand requirements accurately. On the other hand, symbolic AI may be deceived by filler
words, improper placement of commas, and neologisms. Using active sentence structures and
established terminology in technical writing is recommended to ensure precision and reduce
ambiguity in the results.
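The sensitivity of string-based comparison to filler words and commas can be illustrated with a small normalization step applied before two conditions are compared. This is a minimal sketch under stated assumptions: the filler-word list and the `normalize` function are hypothetical and not part of ALICE.

```python
import re

# Hypothetical filler-word list; a real list would need domain review.
FILLER_WORDS = {"basically", "generally", "actually", "really", "simply"}


def normalize(text: str) -> str:
    """Lower-case the text, drop commas/semicolons and filler words."""
    text = re.sub(r"[,;]", " ", text.lower())
    tokens = [t for t in text.split() if t not in FILLER_WORDS]
    return " ".join(tokens)


a = "If the engine is, basically, running"
b = "if the engine is running"
print(a == b)                        # plain string comparison
print(normalize(a) == normalize(b))  # comparison after normalization
```

Such a step only masks surface variation; it does not replace the recommendation to write requirements in active sentences with established terminology.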
In summary, ALICE, consisting of seven questions, can successfully detect formal
contradictions between requirements and shows better results than relying on AI alone,
especially Large Language Models. The method is scalable and can be adapted to future
advancements in the field, providing a modular way to evaluate the consistency of
requirements specifications.
7.2 Outlook
The research presented in this thesis has made significant contributions to requirements
engineering, but there is still room for further exploration and improvement. In addressing the
limitations identified in Section 6.4.1, this part of the dissertation outlines potential
enhancements and future research directions that could significantly advance the capabilities
of ALICE.
Developing advanced parsing and semantic analysis methodologies is crucial to expanding
ALICE's applicability beyond controlled natural language. These improvements would enable
it to process a broader range of natural language requirements, reflecting the varied
documentation styles encountered in practice. Additionally, by exploring algorithms capable of
accurately handling multi-sentence and compound requirements, ALICE could better manage
documents that mirror real-world complexities, thus enhancing its utility and precision in
contradiction detection.
Another significant area for improvement involves refining ALICE’s handling of linguistic
nuances. Implementing sophisticated natural language processing models would improve the
tool’s ability to interpret context and semantics, particularly in managing filler words and
variable mismatches. This would enable robust contradiction analysis within complex sentence
structures.
The accuracy of technical document interpretation can also be significantly improved by
enhancing grammatical correction features within ALICE, focusing on technical writing norms
such as comma placement and sentence complexity. Standardizing sentence templates for
requirement documentation across various sectors would aid in maintaining consistency and
reducing errors introduced by linguistic variations.
Another area for future exploration is the development of custom models for specific domains
or requirements. By training machine learning models on a specific domain, such as healthcare
or finance, researchers could create highly tailored models optimized for detecting
contradictions and other issues specific to that domain. Developing a dynamic dictionary
management system would allow ALICE to integrate and update new technical terms
seamlessly, ensuring the tool remains relevant and effective in adapting to industry-specific
terminologies and innovations.
Further work can be derived from discoveries made in the publications. The first paper
mentioned other types of contradictions besides LNC contradictions, which were not discussed
in this research: dialectic contradictions and antinomies. Future research could address these
types by drawing on the world knowledge of LLMs.
The third paper explained that Alius contradictions are potential contradictions due to
theoretically infeasible conditions. While automated analysis of such conditions was not within
the scope of that paper, it could be a potential area for future research.
Finally, a compelling opportunity exists to bridge the gap between academic research and
practical industry applications. Developing a user-friendly and robust tool based on the insights
and methodologies presented in this thesis could revolutionize how engineers in various
industries approach requirements engineering. Such a tool could be designed to integrate
seamlessly with existing engineering workflows, enabling engineers to detect and resolve
contradictions in their requirement documents quickly. Parallelization is a promising route to
faster computation. Since each requirement pair is analyzed independently, the pairs can be
processed in parallel, which could reduce analysis times from hours to minutes or even
seconds. The exact gains should be assessed through practical implementation and testing.
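Because each pair is checked independently, the pairwise analysis maps naturally onto a worker pool. The sketch below is one possible arrangement and not the thesis implementation: `check_pair` is a placeholder for the real analysis, and a thread pool is assumed because calls to a remote LLM service would be I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]
Result = Tuple[str, str, bool]


def check_pair(pair: Pair) -> Result:
    """Placeholder analysis: always reports 'no contradiction'."""
    first, second = pair
    return (first, second, False)


def analyze_all(requirements: Sequence[str],
                check: Callable[[Pair], Result],
                max_workers: int = 8) -> List[Result]:
    """Run the check on every unordered pair of requirements in parallel."""
    pairs = list(combinations(requirements, 2))  # n * (n - 1) / 2 pairs
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input pairs.
        return list(pool.map(check, pairs))


results = analyze_all(["R1", "R2", "R3", "R4"], check_pair)
print(len(results))
```

The number of pairs grows quadratically with the number of requirements, so parallel execution reduces wall-clock time but not the total amount of work.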
In conclusion, the future of requirements engineering is filled with possibilities, from exploring
new contradiction types to optimizing computational speed and developing specialized models.
The path forward also includes practical solutions that bring the benefits of academic research
directly into the hands of industry professionals, promising a more efficient and practical
approach to requirements engineering.
8 Publication bibliography
Acheampong, Francisca Adoma; Nunoo-Mensah, Henry; Chen, Wenyu (2021): Transformer
models for text-based emotion detection: a review of BERT-based approaches. In Artif Intell
Rev 54 (8), pp. 5789–5829. DOI: 10.1007/s10462-021-09958-2.
Ackoff, Russell L. (1989): From data to wisdom. In : Journal of applied systems analysis, vol.
16.1, pp. 3–9.
Águeda, Cristina Puente; Olivas, José A. (2008): Analysis, detection and classification of
certain conditional sentences in text documents: ResearchGate. Available online at
https://www.researchgate.net/publication/228985516_Analysis_detection_and_classification_
of_certain_conditional_sentences_in_text_documents, checked on 9/16/2022.
Ahmad, Arshad; Justo, José Luis Barros; Feng, Chong; Khan, Arif Ali (2020): The Impact of
Controlled Vocabularies on Requirements Engineering Activities: A Systematic Mapping
Study. In Applied Sciences 10 (21), p. 7749. DOI: 10.3390/app10217749.
Al-Aswadi, Fatima N.; Chan, Huah Yong; Gan, Keng Hoon (2020): Automatic ontology
construction from text: a review from shallow to deep learning trend. In Artif Intell Rev 53 (6),
pp. 3901–3928. DOI: 10.1007/s10462-019-09782-9.
Aristoteles (1986): Metaphysik. Schriften zur Ersten Philosophie. Übertr. u. hrsg. v. Franz F.
Schwarz. With assistance of Franz F. Schwarz: Reclam (Reclams Universal-Bibliothek, 7913).
Asghar, Nabiha (2016): Automatic Extraction of Causal Relations from Natural Language
Texts: A Comprehensive Survey.
Awad, Elias (2003): Knowledge Management. 1st edition: Pearson India. Available online at
https://permalink.obvsg.at/.
Babcock, Jonathan (2007): GOOD REQUIREMENTS ARE MORE THAN JUST ACCURATE.
practicalanalyst.com. Available online at https://practicalanalyst.com/good-requirements-are-
more-than-just-accurate/, checked on 5/20/2022.
Bender, Beate (2020): Requirements Engineering in the Context of IDE. In Sándor Vajna (Ed.):
Integrated Design Engineering. Cham: Springer International Publishing, pp. 587–614.
Bender, Beate; Gericke, Kilian (2021): Entwickeln der Anforderungsbasis: Requirements
Engineering. In Beate Bender, Kilian Gericke (Eds.): Pahl/Beitz Konstruktionslehre. Berlin,
Heidelberg: Springer Berlin Heidelberg, pp. 169–209.
Biagetti, Maria Teresa (2020): Ontologies as knowledge organization systems. In : Knowledge
Organization, vol. 2, p. 48. Available online at https://www.isko.org/cyclo/ontologies, checked
on 4/5/2023.
Bojić, D.; Bojović, M. (2017): A Streaming Dataflow Implementation of Parallel
Cocke–Younger–Kasami Parser. In : Creativity in Computing and DataFlow SuperComputing, vol.
104: Elsevier (Advances in Computers), pp. 159–199.
Bowman, Samuel R. (2023): Eight Things to Know about Large Language Models. Available
online at http://arxiv.org/pdf/2304.00612v1.
Bramer, Max (2013): Introduction to Classification: Naïve Bayes and Nearest Neighbour. In
Max Bramer (Ed.): Principles of Data Mining. London: Springer London (Undergraduate Topics
in Computer Science), pp. 21–37.
Britannica, The Editors of Encyclopaedia (2023): syntax. Edited by Encyclopedia Britannica.
Available online at https://www.britannica.com/topic/syntax, checked on 4/5/2023.
Broadbent, Alex (2008): The Difference Between Cause and Condition. In Proceedings of the
Aristotelian Society (Hardback) 108 (1pt3), pp. 355–364. DOI:
10.1111/j.1467-9264.2008.00250.x.
Chapman, W. W.; Bridewell, W.; Hanbury, P.; Cooper, G. F.; Buchanan, B. G. (2001): A simple
algorithm for identifying negated findings and diseases in discharge summaries. In Journal of
biomedical informatics 34 (5), pp. 301–310. DOI: 10.1006/jbin.2001.1029.
Chen, Zhigang; Lin, Wei; Chen, Qian; Chen, Xiaoping; Wei, Si; Jiang, Hui; Zhu, Xiaodan
(2015): Revisiting Word Embedding for Contrasting Meaning. In Chengqing Zong, Michael
Strube (Eds.): Proceedings of the 53rd Annual Meeting of the Association for Computational
Linguistics and the 7th International Joint Conference on Natural Language Processing.
Beijing, China: Association for Computational Linguistics, pp. 106–115.
Chomsky, Noam (1957): Syntactic structures. Berlin: Mouton & Co.
VDI VDI 2221 Blatt 1:2019-11, 2019: Design of technical products and systems - Model of
product design.
VDI-Guideline VDI 2221 Blatt 1:2019-11, 2019: Design of technical products and systems -
Model of product design.
Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (2018): BERT: Pre-training
of Deep Bidirectional Transformers for Language Understanding.
Dick, Jeremy (2017): Requirements Engineering. 4th ed. 2017. Cham: Springer (Springer
eBook Collection Computer Science).
Dimitrieski, Vladimir; Petrović, Gajo; Kovačević, Aleksandar; Luković, Ivan; Fujita, Hamido
(2016): A Survey on Ontologies and Ontology Alignment Approaches in Healthcare. In Hamido
Fujita, Moonis Ali, Ali Selamat, Jun Sasaki, Masaki Kurematsu (Eds.): Trends in Applied
Knowledge-Based Systems and Data Science, vol. 9799. Cham: Springer International
Publishing (Lecture Notes in Computer Science), pp. 373–385.
Duden (Ed.) (2006): Die Grammatik. Das Standardwerk zur deutschen Sprache. Duden.
Überarb. Neudr. der 7., völlig neu erarb. und erw. Aufl. Mannheim: Dudenverl. (Der Duden/
hrsg. vom Wiss. Rat der Dudenred, Bd. 4).
Duden (2023a): Semantik. Available online at
https://www.duden.de/rechtschreibung/Semantik, checked on 2/5/2023.
Duden (2023b): Taxonomie. Available online at
https://www.duden.de/rechtschreibung/Taxonomie, checked on 4/5/2023.
Ehrlenspiel, Klaus; Meerkamm, Harald (2017): Integrierte Produktentwicklung. Denkabläufe,
Methodeneinsatz, Zusammenarbeit. 6., überarbeitete und erweiterte Auflage. München: Carl
Hanser Verlag (Hanser eLibrary). Available online at
http://dx.doi.org/10.3139/9783446449084.
Eisenbart, B.; Gericke, K.; Blessing, L. (2011): A framework for comparing design modelling
approaches across disciplines. In Culley, S. J, Hicks, B. J, et al. (Ed.): Proceedings of the 18th
International Conference on Engineering Design (ICED11), pp. 344–355.
Eisenberg, Peter (2016): Duden - die Grammatik. Unentbehrlich für richtiges Deutsch. 9.,
vollständig überarbeitete und aktualisierte Auflage. Edited by Angelika Wöllstein. Berlin:
Dudenverlag (Der Duden, Band 4).
Ersoy, Pinar (2021): Naive Bayes Classifiers for Text Classification. Types of Naive Bayes
Classifiers for Different Text Processing Cases. Edited by Towards Data Science. Available
online at https://towardsdatascience.com/naive-bayes-classifiers-for-text-classification-
be0d133d35ba, checked on 10/26/2022.
Estival, Dominique; Nowak, Chris; Zschorn, Andrew (2004): Towards ontology-based natural
language processing. In Nancy Ide, Laurent Romary (Eds.): Proceeedings of the Workshop on
NLP and XML (NLPXML-2004): RDF/RDFS and OWL in Language Technology on - NLPXML
'04. Proceeedings of the Workshop on NLP and XML (NLPXML-2004): RDF/RDFS and OWL
in Language Technology. Not Known, 01.06.2004 - 01.06.2004. Morristown, NJ, USA:
Association for Computational Linguistics, pp. 59–66.
Fischbach, Jannik; Frattini, Julian; Spaans, Arjen; Kummeth, Maximilian; Vogelsang, Andreas;
Mendez, Daniel; Unterkalmsteiner, Michael (2021a): Automatic Detection of Causality in
Requirement Artifacts: the CiRA Approach. Available online at
http://arxiv.org/pdf/2101.10766v1.
Fischbach, Jannik; Frattini, Julian; Vogelsang, Andreas; Mendez, Daniel; Unterkalmsteiner,
Michael; Wehrle, Andreas et al. (2022): Automatic Creation of Acceptance Tests by Extracting
Conditionals from Requirements: NLP Approach and Case Study. Available online at
http://arxiv.org/pdf/2202.00932v1.
Fischbach, Jannik; Hauptmann, Benedikt; Konwitschny, Lukas; Spies, Dominik; Vogelsang,
Andreas (2020): Towards Causality Extraction from Requirements.
Fischbach, Jannik; Springer, Tobias; Frattini, Julian; Femmer, Henning; Vogelsang, Andreas;
Mendez, Daniel (2021b): Fine-Grained Causality Extraction From Natural Language
Requirements Using Recursive Neural Tensor Networks. Available online at
http://arxiv.org/pdf/2107.09980v2.
Foote, Keith D. (2019): A Brief History of Natural Language Processing (NLP). Edited by
Dataversity. Available online at https://www.dataversity.net/a-brief-history-of-natural-
language-processing-nlp/#.
Frattini, Julian; Fischbach, Jannik; Mendez, Daniel; Unterkalmsteiner, Michael; Vogelsang,
Andreas; Wnuk, Krzysztof (2022a): Causality in requirements artifacts: prevalence, detection,
and impact. In Requirements Eng. DOI: 10.1007/s00766-022-00371-x.
Frattini, Julian; Fischbach, Jannik; Mendez, Daniel; Unterkalmsteiner, Michael; Vogelsang,
Andreas; Wnuk, Krzysztof (2022b): Causality in requirements artifacts: prevalence, detection,
and impact. In Requirements Eng. DOI: 10.1007/s00766-022-00371-x.
Fridman, Lex (2020): Grant Sanderson and Lex Fridman. Math, Manim, Neural Networks &
Teaching with 3Blue1Brown (118), 08.2020. Available online at
https://www.youtube.com/watch?v=TMxAbNAVrzI&t=14s.
García, Salvador; Luengo, Julián; Herrera, Francisco (2014): Data Preprocessing in Data
Mining. 1. Aufl. s.l.: Springer-Verlag (Intelligent Systems Reference Library, v.72). Available
online at http://gbv.eblib.com/patron/FullRecord.aspx?p=1968256.
Gärtner, A. E.; Göhlich, D.; Fay, T.-A. (2023): Automated Condition Detection in Requirements
Engineering. In : ICED23 Proceedings, vol. 3, pp. 707–716.
Gärtner, Alexander Elenga (2023): condition detection. Edited by Swarm Engineer. Available
online at https://swarm-engineer.me/2023/condition-detection/, updated on 2/7/2023, checked
on 2/7/2023.
Gärtner, Alexander Elenga; Fay, Tu-Anh; Göhlich, Dietmar (2022): Fundamental Research on
Detecting Contradictions in Requirements: Taxonomy and Semi-Automated Approach. In
Applied Sciences 12 (15), p. 7628. DOI: 10.3390/app12157628.
Gärtner, Alexander Elenga; Göhlich, Dietmar (2023): Automated Requirement Contradiction
Detection through Formal Logic and LLMs.
Gärtner, Alexander Elenga; Göhlich, Dietmar (2024a): Automated requirement contradiction
detection through formal logic and LLMs. In Autom Softw Eng 31 (2). DOI: 10.1007/s10515-
024-00452-x.
Gärtner, Alexander Elenga; Göhlich, Dietmar (2024b): Towards an Automatic Contradiction
Detection in Requirements Engineering. In International Design Conference (Ed.): Design 24.
Gericke, Kilian; Blessing, L. (2012): An analysis of design process models across disciplines.
In D. Marjanovic, M. Storga, N. Pavkovic, N. Bojcetic (Eds.): DESIGN 2012. Proceedings of
the 12th International Design Conference, May 21 - 24, 2012, Dubrovnik, Croatia. Zagreb: Fac.
of Mechanical Engineering and Naval Architecture (DS, 3), pp. 171-180.
Gericke, Kilian; Qureshi, A. J.; Blessing, Lucienne (2013): Analyzing Transdisciplinary Design
Processes in Industry: An Overview. In : Volume 5: 25th International Conference on Design
Theory and Methodology; ASME 2013 Power Transmission and Gearing Conference. ASME
2013 International Design Engineering Technical Conferences and Computers and Information
in Engineering Conference. Portland, Oregon, USA, 04.08.2013 - 07.08.2013: American
Society of Mechanical Engineers.
Gervasi, Vincenzo; Zowghi, Didar (2005): Reasoning about inconsistencies in natural
language requirements. In ACM Trans. Softw. Eng. Methodol. 14 (3), pp. 277–330. DOI:
10.1145/1072997.1072999.
Giffin, Monica; Weck, Olivier de; Bounova, Gergana; Keller, Rene; Eckert, Claudia; Clarkson,
P. John (2009): Change Propagation Analysis in Complex Technical Systems. In Journal of
Mechanical Design 131 (8), Article 081001, p. 1. DOI: 10.1115/1.3149847.
Gillani Andleeb, Saira (2015): From text mining to knowledge mining: An integrated framework
of concept extraction and categorization for domain ontology.
Göhlich, Dietmar (2008): Innovationen der Fahrzeugtechnik am Beispiel der Mercedes-Benz
S-Klasse. In Volker Schindler (Ed.): Forschung für das Auto von Morgen. Berlin, Heidelberg:
Springer Berlin Heidelberg, pp. 129–154.
Göhlich, Dietmar; Bender, Beate; Fay, Tu-Anh; Gericke, Kilian (2021): Product requirements
specification process in product development. In Proc. Des. Soc. 1, pp. 2459–2470. DOI:
10.1017/pds.2021.507.
Göhlich, Dietmar; Fay, Tu-Anh (2021a): Arbeiten mit Anforderungen: Requirements
Management. In Beate Bender, Kilian Gericke (Eds.): Pahl/Beitz Konstruktionslehre.
Methoden und Anwendung erfolgreicher Produktentwicklung, vol. 25. 9. Auflage. Berlin,
Heidelberg: Springer Vieweg (Springer eBook Collection), pp. 211–229.
Göhlich, Dietmar; Fay, Tu-Anh (2021b): Arbeiten mit Anforderungen: Requirements
Management. In Beate Bender, Kilian Gericke (Eds.): Pahl/Beitz Konstruktionslehre.
Methoden und Anwendung erfolgreicher Produktentwicklung, vol. 25. 9. Auflage. Berlin,
Heidelberg: Springer Vieweg (Springer eBook Collection), pp. 211–229.
Göhlich, Dietmar; Fay, Tu-Anh; Jefferies, Dominic; Lauth, Enrico; Kunith, Alexander; Zhang,
Xudong (2018): Design of urban electric bus systems. In Des. Sci. 4. DOI:
10.1017/dsj.2018.10.
Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron (2016): Deep learning. Cambridge,
Massachusetts, London, England: The MIT Press (Adaptive computation and machine
learning).
Gruber, Thomas R. (1993): A translation approach to portable ontology specifications. In
Knowledge Acquisition 5 (2), pp. 199–220. DOI: 10.1006/knac.1993.1008.
Guarino, Nicola; Oberle, Daniel; Staab, Steffen (2009): What Is an Ontology? In Steffen Staab,
Rudi Studer (Eds.): Handbook on Ontologies, vol. 5. Berlin, Heidelberg: Springer Berlin
Heidelberg, pp. 1–17.
Gudivada, Venkat N. (2018): Natural Language Core Tasks and Applications. In :
Computational Analysis and Understanding of Natural Languages: Principles, Methods and
Applications, vol. 38: Elsevier (Handbook of Statistics), pp. 403–428.
Guo, Weize; Zhang, Li; Lian, Xiaoli (2021): Automatically detecting the conflicts between
software requirements based on finer semantic analysis. Available online at
http://arxiv.org/pdf/2103.02255v1.
Hauser, Christopher (2019): Aristotle’s Explanationist Epistemology of Essence. In
Metaphysics 2 (1), pp. 26–39. DOI: 10.5334/met.24.
Hein, Phyo Htet; Kames, Elisabeth; Chen, Cheng; Morkos, Beshoy (2021): Employing machine
learning techniques to assess requirement change volatility. In Res Eng Design 32 (2),
pp. 245–269. DOI: 10.1007/s00163-020-00353-6.
Heitmeyer, Constance L.; Jeffords, Ralph D.; Labaw, Bruce G. (1996): Automated consistency
checking of requirements specifications. In ACM Trans. Softw. Eng. Methodol. 5 (3),
pp. 231–261. DOI: 10.1145/234426.234431.
Horn, Laurence R. (2018): Contradiction. Edited by The Stanford Encyclopedia of Philosophy.
The Metaphysics Research Lab. Available online at
https://plato.stanford.edu/archives/win2018/entries/contradiction/, checked on 4/21/2022.
Hunter, Anthony; Nuseibeh, Bashar (1998): Managing inconsistent specifications. In ACM
Trans. Softw. Eng. Methodol. 7 (4), pp. 335–367. DOI: 10.1145/292182.292187.
IEEE/ISO/IEC 29148-2018, 2018: ISO/IEC/IEEE International Standard - Systems and
software engineering -- Life cycle processes -- Requirements engineering. Available online at
https://standards.ieee.org/ieee/29148/6937/, checked on 3/15/2023.
Jang, Amy; Uzsoy, Ana Sofia; Culliton, Phil (2020): Contradictory, My Dear Watson. Edited by
Kaggle. Available online at https://kaggle.com/competitions/contradictory-my-dear-watson,
checked on 3/13/2023.
Jindal, Rajni; Malhotra, Ruchika; Jain, Abha (2016): Automated classification of security
requirements. In : 2016 International Conference on Advances in Computing, Communications
and Informatics (ICACCI). 2016 International Conference on Advances in Computing,
Communications and Informatics (ICACCI). Jaipur, India, 21.09.2016 - 24.09.2016: IEEE,
pp. 2027–2033.
Johnson-Laird, Philip Nicholas (2006): How we reason. 1. publ. New York, NY: Oxford Univ.
Press.
Jun, Yennie (2023): Exploring Creativity in Large Language Models: From GPT-2 to GPT-4.
Analyzing the evolution of creative processes in large language models through creativity tests.
Edited by Towards Data Science. Available online at
https://towardsdatascience.com/exploring-creativity-in-large-language-models-from-gpt-2-to-
gpt-4-1c2d1779be57, checked on 5/13/2023.
Jurafsky, Dan; Martin, James H. (2019): Speech and Language Processing. Available online
at https://web.stanford.edu/~jurafsky/slp3/, checked on 4/8/2023.
Kant, Immanuel (2011): Kritik der reinen Vernunft. Vollst. Ausg. nach der 2., hin und wieder
verb. Aufl. 1787, vermehrt um die Vorrede zur 1. Aufl. 1781. Köln: Anaconda.
Karlova-Bourbonus, Natali (2019): Automatic detection of contradictions in texts. Gießen:
Universitätsbibliothek. Available online at http://geb.uni-giessen.de/geb/volltexte/2019/14447/.
Kasneci, Enkelejda; Sessler, Kathrin; Küchemann, Stefan; Bannert, Maria; Dementieva,
Daryna; Fischer, Frank et al. (2023): ChatGPT for good? On opportunities and challenges of
large language models for education. In Learning and Individual Differences 103, p. 102274.
DOI: 10.1016/j.lindif.2023.102274.
Kesselring, Thomas (2013): Formallogischer Widerspruch, dialektischer Widerspruch,
Antinomie. Reflexionen über den Widerspruch. In Stefan Müller (Ed.): Jenseits der Dichotomie.
Wiesbaden: Springer Fachmedien Wiesbaden, pp. 15–38.
Khoo, C. S. G.; Kornfilt, J.; Oddy, R. N.; Myaeng, S. H. (1998): Automatic Extraction
of Cause-Effect Information from Newspaper Text Without Knowledge-based Inferencing. In
Literary and Linguistic Computing 13 (4), pp. 177–186. DOI: 10.1093/llc/13.4.177.
Kim, Haeng Kon (Ed.) (2018): 2018 IEEE/ACIS 19th International Conference on Software
Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD).
June 27-29, 2018, Busan, Korea : proceedings. Institute of Electrical and Electronics
Engineers; IEEE Computer Society; International Association for Computer and Information
Science. Piscataway, NJ: IEEE. Available online at
http://ieeexplore.ieee.org/servlet/opac?punumber=8422066.
Kingston, John; Schafer, Burkhard; Vandenberghe, Wim (2004): Towards a Financial Fraud
Ontology: A Legal Modelling Approach. In Artif Intell Law 12 (4), pp. 419–446. DOI:
10.1007/s10506-005-4163-0.
Knothe, Frank; Mast, Jürgen; Böttger, Matthias; Pfeiffer, Peter; Futschik, Hans-Peter;
Hutzenlaub, Holger et al. (2006): Die neue CL-Klasse von Mercedes-Benz. In ATZ
Automobiltech Z 108 (10), pp. 800–813. DOI: 10.1007/BF03221821.
Kurtanovic, Zijad; Maalej, Walid (2017): Automatically Classifying Functional and Non-
functional Requirements Using Supervised Machine Learning. In : 2017 IEEE 25th
International Requirements Engineering Conference (RE). 2017 IEEE 25th International
Requirements Engineering Conference (RE). Lisbon, Portugal, 04.09.2017 - 08.09.2017:
IEEE, pp. 490–495.
Lamsweerde, Darimont, Letier (1998): Managing conflicts in goal-driven requirements
engineering. In : IEEE Transactions on Software Engineering, vol. 24, no. 11, pp. 908–926.
Landhäußer, Mathias; Körner, Sven J. (2017): Artificial Intelligence in Requirements
Engineering.
Lashin, Gamal (2021): Technisches Änderungsmanagement. In Beate Bender, Kilian Gericke
(Eds.): Pahl/Beitz Konstruktionslehre. Berlin, Heidelberg: Springer Berlin Heidelberg,
pp. 919–942.
Lendvai, Piroska; Reichel, Uwe D.: Contradiction Detection for Rumorous Claims. Available
online at http://arxiv.org/pdf/1611.02588v2.
Li, Luyang; Qin, Bing; Liu, Ting (2017): Contradiction Detection with Contradiction-Specific
Word Embedding. In Algorithms 10 (2), p. 59. DOI: 10.3390/a10020059.
Liu, Chun; Zhao, Zhengyi; Zhang, Lei; Li, Zheng (2021): Automated Conditional Statements
Checking for Complete Natural Language Requirements Specification. In Applied Sciences 11
(17), p. 7892. DOI: 10.3390/app11177892.
Liu, Quan; Jiang, Hui; Wei, Si; Ling, Zhen-Hua; Hu, Yu (2015): Learning Semantic Word
Embeddings based on Ordinal Knowledge Constraints. In Chengqing Zong, Michael Strube
(Eds.): Proceedings of the 53rd Annual Meeting of the Association for Computational
Linguistics and the 7th International Joint Conference on Natural Language Processing.
Beijing, China: Association for Computational Linguistics, pp. 1501–1511.
Loucopoulos, Pericles (2005): Requirements engineering. In John Clarkson, Claudia Eckert
(Eds.): Design process improvement. London: Springer London, pp. 116–139.
Luisa, Mich; Mariangela, Franch; Pierluigi, Novi Inverardi (2004): Market research for
requirements analysis using linguistic tools. In Requirements Eng 9 (1), pp. 40–56. DOI:
10.1007/s00766-003-0179-8.
Lutz, Robyn R. (1993): Targeting safety-related errors during software requirements analysis.
In SIGSOFT Softw. Eng. Notes 18 (5), pp. 99–106. DOI: 10.1145/167049.167069.
Maedche, Alexander; Motik, Boris; Silva, Nuno; Volz, Raphael (2002): MAFRA – A MApping
FRAmework for Distributed Ontologies. In G. Goos, J. Hartmanis, J. van Leeuwen, Asunción
Gómez-Pérez, V. Richard Benjamins (Eds.): Knowledge Engineering and Knowledge
Management: Ontologies and the Semantic Web, vol. 2473. Berlin, Heidelberg: Springer Berlin
Heidelberg (Lecture Notes in Computer Science), pp. 235–250.
Maher, Mary Lou; Poon, Josiah (1996): Modeling Design Exploration as Co-Evolution. In
Computer-Aided Civil and Infrastructure Engineering 11 (3), pp. 195–209. DOI:
10.1111/j.1467-8667.1996.tb00323.x.
Manning, Christopher D. (2022): Human Language Understanding & Reasoning. In Daedalus
151 (2), pp. 127–138. DOI: 10.1162/daed_a_01905.
Mäntylä, Mika V.; Graziotin, Daniel; Kuutila, Miikka (2018): The evolution of sentiment
analysis – A review of research topics, venues, and top cited papers. In Computer Science
Review 27, pp. 16–32. DOI: 10.1016/j.cosrev.2017.10.002.
Marneffe, Rafferty, Manning (2008): Finding Contradictions in Text. Edited by Stanford
University. USA. Available online at https://nlp.stanford.edu/pubs/contradiction-acl08.pdf,
checked on 4/13/2022.
Marques, Ricardo (2015): Detecting Contradictions in News Quotations.
Maxwell, Joseph (1992): Understanding and Validity in Qualitative Research. In Harvard
Educational Review 62 (3), pp. 279–301. DOI: 10.17763/haer.62.3.8323320856251826.
Merriam-Webster: condition. Edited by Merriam-Webster.com. Available online at
https://www.merriam-webster.com/dictionary/condition, checked on 10/4/2022.
Merritt, Rick (2022): What Is a Transformer Model? Edited by NVIDIA Blog. Available online at
https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/, checked on 10.2023.
Miller, Tim (2019): Explanation in artificial intelligence: Insights from the social sciences. In
Artificial Intelligence 267, pp. 1–38. DOI: 10.1016/j.artint.2018.07.007.
Montgomery, Lloyd; Fucci, Davide; Bouraffa, Abir; Scholz, Lisa; Maalej, Walid (2022a):
Empirical research on requirements quality: a systematic mapping study. In Requirements Eng
27 (2), pp. 183–209. DOI: 10.1007/s00766-021-00367-z.
Montgomery, Lloyd; Fucci, Davide; Bouraffa, Abir; Scholz, Lisa; Maalej, Walid (2022b):
Empirical research on requirements quality: a systematic mapping study. In Requirements Eng
29 (4), p. 315. DOI: 10.1007/s00766-021-00367-z.
MP: 1005b, 19f.: Aristoteles (1831fI.): Metaphysik (hrsg. v. d. Preußischen Akademie der
Wissenschaften), Band 1, Berlin.
Narkhede, Sarang (2018): Understanding Confusion Matrix. Edited by Towards Data
Science. Available online at https://towardsdatascience.com/understanding-confusion-matrix-
a9ad42dcfd62, checked on 10/11/2022.
Naz, Tabbasum; Akhtar, Muhammad; Shahzad, Syed Khuram; Fasli, Maria; Iqbal, Muhammad
Waseem; Naqvi, Muhammad Raza (2020): Ontology-driven advanced drug-drug interaction.
In Computers & Electrical Engineering 86, p. 106695. DOI:
10.1016/j.compeleceng.2020.106695.
Newell, A.; Simon, H. (1956): The logic theory machine--A complex information processing
system. In IEEE Trans. Inform. Theory 2 (3), pp. 61–79. DOI: 10.1109/TIT.1956.1056797.
Ontology learning and population. Bridging the gap between text and knowledge (2008).
Amsterdam: Ios Press (Frontiers in artificial intelligence and applications, v. 167). Available
online at
https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=
221155.
openai (2020): Models. GPT-3. Edited by openai. Available online at
https://platform.openai.com/docs/models/gpt-3, checked on 3/15/2023.
openai (2023a): ChatGPT Release Notes. The latest update for ChatGPT. Available online
at https://help.openai.com/en/articles/6825453-chatgpt-release-notes, updated on 9/19/2023.
openai (2023b): GPT-4 Technical Report, p. 7. Available online at
https://cdn.openai.com/papers/gpt-4.pdf.
Parsons, Terence (2021): The Traditional Square of Opposition. Edited by The Stanford
Encyclopedia of Philosophy. Available online at
https://plato.stanford.edu/archives/fall2021/entries/square/, checked on 4/21/2022.
Pawlik, V. (2022): Pro-Kopf-Stromverbrauch in Deutschland in den Jahren 1995 bis 2022.
Edited by Statista. Available online at
https://de.statista.com/statistik/daten/studie/240696/umfrage/pro-kopf-stromverbrauch-in-
deutschland/, checked on 10/13/2023.
Pohl, Klaus; Rupp, Chris (2021): Basiswissen Requirements Engineering. Aus- und
Weiterbildung nach IREB-Standard zum Certified Professional for Requirements Engineering
Foundation Level. 5., überarbeitete und aktualisierte Auflage. Heidelberg: dpunkt.verlag.
Preum, Sarah Masud; Mondol, Abu Sayeed; Ma, Meiyi; Wang, Hongning; Stankovic, John A.
(2017): Preclude: Conflict detection in textual health advice. In : 2017 IEEE International
Conference on Pervasive Computing and Communications (PerCom). 2017 IEEE International
Conference on Pervasive Computing and Communications (PerCom). Kona, Big Island, HI,
USA, 13.03.2017 - 17.03.2017: IEEE, pp. 286–296.
DIN DIN 69901-5:2009-01, 2009: Project management - Project management systems - Part
5: Concepts.
DIN ISO DIN ISO 9000:2015, 2015: Qualitätsmanagementsysteme.
Ritter, Alan; Soderland, Stephen; Downey, Doug; Etzioni, Oren (2008): It’s a Contradiction --
No, it’s Not: A Case Study using Functional Relations. In: Proceedings of the 2008 Conference
on Empirical Methods in Natural Language Processing. With assistance of Association for
Computational Linguistics, pp. 11-20.
Robertson, Suzanne; Robertson, James (2013): Mastering the requirements process. Getting
requirements right. 3. ed. Upper Saddle River, NJ, Munich: Addison-Wesley (Always learning).
Rowley, Jennifer (2007): The wisdom hierarchy: representations of the DIKW hierarchy. In
Journal of Information Science 33 (2), pp. 163-180. DOI: 10.1177/0165551506070706.
Russell, Stuart J.; Norvig, Peter (2016): Artificial intelligence. A modern approach. With
assistance of Ernest Davis, Douglas Edwards. Third edition, Global edition. Boston, Columbus,
Indianapolis, New York, San Francisco, Upper Saddle River, Amsterdam, Cape Town, Dubai,
London, Madrid, Milan, Munich, Paris, Montreal, Toronto, Delhi, Mexico City, Sao Paulo,
Sydney, Hong Kong, Seoul, Singapore, Taipei, Tokyo: Pearson (Always learning).
Sack, Harald; Alam, Mehwish (2020): Knowledge Graphs. Edited by Hasso-Plattner-Institut.
Fiz Karlsruhe - Leibniz Institute for Information Infrastructure & Karlsruhe Institute of
Technology. Hasso-Plattner-Institut. Available online at
https://open.hpi.de/courses/knowledgegraphs2020, checked on 4/6/2023.
Saenko, Kate (2023): A Computer Scientist Breaks Down Generative AI’s Hefty Carbon
Footprint. Edited by Scientific American. Available online at
https://www.scientificamerican.com/article/a-computer-scientist-breaks-down-generative-ais-
hefty-carbon-footprint/, checked on 10/13/2023.
Sandhu, Geet; Sikka, Sunil (2015): State-of-art practices to detect inconsistencies and
ambiguities from software requirements. In: International Conference on Computing,
Communication & Automation. 2015 International Conference on Computing, Communication
& Automation (ICCCA). Greater Noida, India, 15.05.2015: IEEE, pp. 812-817.
Sarafraz, Farzaneh (2011): Finding Conflicting Statements in the Biomedical Literature.
Scheuermann, Andreas; Leukel, Joerg (2014): Supply chain management ontology from an
ontology engineering perspective. In Computers in Industry 65 (6), pp. 913-923. DOI:
10.1016/j.compind.2014.02.009.
Schwitter, Rolf (2010): Controlled Natural Languages for Knowledge Representation. In Coling
2010 Organizing Committee (Ed.): Coling 2010: Posters, pp. 1113-1121. Available online at
https://aclanthology.org/C10-2128.
Sennrich, Rico; Volk, Martin; Schneider, Gerold (2013): Exploiting Synergies Between Open
Resources for German Dependency Parsing, POS-tagging, and Morphological Analysis: s.n.
Sepúlveda-Torres, Robiert; Bonet-Jover, Alba; Saquete, Estela (2021a): “Here Are the Rules:
Ignore All Rules”: Automatic Contradiction Detection in Spanish. In Applied Sciences 11 (7),
p. 3060. DOI: 10.3390/app11073060.
Sepúlveda-Torres, Robiert; Bonet-Jover, Alba; Saquete, Estela (2021b): “Here Are the Rules:
Ignore All Rules”: Automatic Contradiction Detection in Spanish. In Applied Sciences 11 (7),
p. 3060. DOI: 10.3390/app11073060.
Shultz, Thomas R.; Fahlman, Scott E.; Craw, Susan; Andritsos, Periklis; Tsaparas, Panayiotis;
Silva, Ricardo et al. (2010): Confusion Matrix. In Claude Sammut, Geoffrey I. Webb (Eds.):
Encyclopedia of Machine Learning. Boston, MA: Springer US, p. 209.
Simperl, Elena Paslaru Bontas; Tempich, Christoph (2006): Ontology Engineering: A Reality
Check. In David Hutchison, Takeo Kanade, Josef Kittler, Jon M. Kleinberg, Friedemann
Mattern, John C. Mitchell et al. (Eds.): On the Move to Meaningful Internet Systems 2006:
CoopIS, DOA, GADA, and ODBASE, vol. 4275. Berlin, Heidelberg: Springer Berlin Heidelberg
(Lecture Notes in Computer Science), pp. 836-854.
Sophist GmbH (2016): Schablonen für alle Fälle. Available online at
http://www.sophist.de/MASTeR-Broschuere/, checked on 3/28/2023.
Standish Group (1995): The CHAOS Report 1995.
Surana, S.; Dembla, S.; Bihani, P. (2022): Identifying Contradictions in the Legal Proceedings
Using Natural Language Models. In SN Comput. Sci. 3. Springer. Available online at
https://doi.org/10.1007/s42979-022-01075-3.
Tawfik, Noha S.; Spruit, Marco R. (2018): Automated Contradiction Detection in Biomedical
Literature. In Petra Perner (Ed.): Machine Learning and Data Mining in Pattern Recognition,
vol. 10934. Cham: Springer International Publishing (Lecture Notes in Computer Science),
pp. 138-148.
The Standish Group (Ed.) (2014): Chaos Report.
Touvron, Hugo; Lavril, Thibaut; Izacard, Gautier; Martinet, Xavier; Lachaux, Marie-Anne;
Lacroix, Timothée et al. (2023): LLaMA: Open and Efficient Foundation Language Models.
Twain, Mark (1880): The Awful German Language. In Mark Twain: A Tramp Abroad. Toronto:
Belford & Co., pp. 879-897.
Uschold, Mike; Gruninger, Michael (1996): Ontologies: principles, methods and applications.
In The Knowledge Engineering Review 11 (2), pp. 93-136. DOI:
10.1017/S0269888900007797.
VDA (2007): Automotive VDA-Standardvorlage Komponentenlastenheft (1st ed.). With
assistance of Druck Henrich, Medien GmbH & Co. KG. Frankfurt am Main.
Wiegers, Karl; Beatty, Joy (2013): Software requirements. 3. ed. [fully updated and expanded].
Redmond, Wash.: Microsoft Press (Best practices).
Wohlin, Claes; Runeson, Per; Höst, Martin; Ohlsson, Magnus C.; Regnell, Björn; Wesslén,
Anders (2012): Experimentation in Software Engineering. Berlin, Heidelberg: Springer Berlin
Heidelberg.
Wong, Wilson; Liu, Wei; Bennamoun, Mohammed (2012): Ontology learning from text. In ACM
Comput. Surv. 44 (4), pp. 1-36. DOI: 10.1145/2333112.2333115.
Wu, Xiangcheng; Niu, Xi; Rahman, Ruhani (2022): Topological Analysis of Contradictions in
Text. In Enrique Amigo, Pablo Castells, Julio Gonzalo, Ben Carterette, J. Shane Culpepper,
Gabriella Kazai (Eds.): Proceedings of the 45th International ACM SIGIR Conference on
Research and Development in Information Retrieval. SIGIR '22: The 45th International ACM
SIGIR Conference on Research and Development in Information Retrieval. Madrid, Spain,
11.07.2022 - 15.07.2022. New York, NY, USA: ACM, pp. 2478-2483.
Wynn, David C.; Clarkson, P. John (2018): Process models in design and development. In Res
Eng Design 29 (2), pp. 161-202. DOI: 10.1007/s00163-017-0262-7.
Zhao, Liping; Alhoshan, Waad; Ferrari, Alessio; Letsholo, Keletso J.; Ajagbe, Muideen A.;
Chioasca, Erol-Valeriu; Batista-Navarro, Riza T. (2021): Natural Language Processing for
Requirements Engineering. In ACM Comput. Surv. 54 (3), pp. 1-41. DOI: 10.1145/3444689.
Zodhya (2023): How much energy does ChatGPT consume? Edited by medium.com. Available
online at https://medium.com/@zodhyatech/how-much-energy-does-chatgpt-consume-
4cba1a7aef85, checked on 10/13/2023.