AUTOMATIC DETECTION OF CONTRADICTIONS IN REQUIREMENTS SPECIFICATIONS

submitted by

M.Sc. Alexander Elenga Gärtner

to Faculty V – Mechanical Engineering and Transport Systems
of Technische Universität Berlin

to obtain the academic degree

Doktor der Ingenieurwissenschaften
- Dr.-Ing. -

approved dissertation

Doctoral committee:

Chair: Prof. Dr.-Ing. Rainer Stark
First reviewer: Prof. Dr.-Ing. Dietmar Göhlich
Second reviewer: Prof. Dr.-Ing. Beate Bender

Date of the scientific defense: 10 June 2024

Berlin 2024

Acknowledgments

This cumulative dissertation was written during my employment at IAV GmbH and my doctoral studies at the Chair of Methods for Product Development and Mechatronics (MPM) at TU Berlin.

First, I would like to express my sincere thanks to Professor Göhlich, my doctoral supervisor, for supervising this work. His trust in me, his technical expertise, and his attentive mentoring decisively shaped my doctorate.

My heartfelt thanks also go to Professor Bender for agreeing to act as second reviewer of this dissertation, and to Professor Stark for chairing the examination committee.

Special thanks are due to my mother and Cansu, who always stood by me during my doctoral studies and supported me through the double workload. Without their understanding, I could never have sustained this workload.

I would also like to expressly thank Grigorii Gerdzhikov for our lively and regular exchange. Our conversations spurred me on, week after week, to deliver results continuously. Dr. Felix Matthies deserves my thanks for his personal support and the valuable tips he gave me.

I thank Dr. Tu-Anh Fay for her comprehensive introduction to scientific ways of working and thinking.

Furthermore, I would like to thank all my friends and my family. You always kept me motivated.

Berlin, June 2024 Alexander Elenga Gärtner

Abstract

Alexander Elenga Gärtner

Automatic Detection of Contradictions in Requirements Specifications

Technische Universität Berlin – Faculty of Mechanical Engineering and Transport Systems

Chair of Methods for Product Development and Mechatronics

June 2024

Defining a complete, unambiguous, and contradiction-free target system is essential in product development. When a product or system is developed, comprehensive requirements specifications detail the desired functionalities and characteristics. However, these documents often contain errors in the form of contradictions. This can be attributed to the inherent complexity of requirements specifications, which encompass several thousand individual requirements derived through interdisciplinary collaboration. Furthermore, the requirements are written in natural language, which introduces an additional layer of complexity due to potential ambiguities and inconsistencies arising from the diverse perspectives and interpretations of the stakeholders involved.

Identifying and rectifying these contradictions manually presents considerable challenges. Manual correction is time-consuming and costly because each requirement must be inspected and analyzed to detect potential contradictions. Moreover, contradictions that are not addressed in time can give rise to complications in subsequent stages of product development. Unresolved contradictory requirements can lead to misunderstandings, conflicts, and inefficiencies during the design, implementation, and testing phases, ultimately hindering the successful realization of the intended product or system.

Thus, the detection and resolution of contradictions in complex requirements documents is a critical area of research and development within the field of product development. Automated methods and tools that identify and address these contradictions can mitigate the costs, complexities, and risks associated with manual error correction. Automation also enables contradictions to be identified in time and resolved during the early stages of product development, thereby preventing potential complications and improving the overall efficiency and effectiveness of the development process.

This work addresses this issue by discussing scientific theories regarding contradictions and providing a comprehensive definition of what they are and how they can occur in requirements documents. Drawing on the Aristotelian law of non-contradiction, specific subtypes of contradictions within the requirements engineering context are defined, each with a different level of criticality. The proposed methodology formalizes requirement sentences by extracting conditions and effects, as well as variables and actions, which results in a taxonomy. Analyzing these constituent elements with specific yet simple questions makes it possible to detect the subtypes of contradictions.
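The idea of formalizing requirements into conditions and effects, each made up of a variable and an action, can be illustrated with a small sketch. This is only a toy illustration under stated assumptions, not the taxonomy or the ALICE implementation: the class names, the two yes/no questions, and the example requirements are all hypothetical, and the check merely flags candidate pairs for review rather than classifying contradiction subtypes.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Clause:
    variable: str  # what the clause is about, e.g. "doors"
    action: str    # what happens to it, e.g. "lock"


@dataclass(frozen=True)
class Requirement:
    rid: str
    condition: Clause  # "if ..." part of the requirement
    effect: Clause     # "then ..." part of the requirement


def same_condition(a: Requirement, b: Requirement) -> bool:
    # Hypothetical question 1: do both requirements apply under the same condition?
    return a.condition == b.condition


def conflicting_effect(a: Requirement, b: Requirement) -> bool:
    # Hypothetical question 2: do they demand different actions on the same variable?
    return (a.effect.variable == b.effect.variable
            and a.effect.action != b.effect.action)


def candidate_contradictions(reqs):
    # Compare every pair once and keep pairs for which both questions are answered "yes".
    return [(a.rid, b.rid) for a, b in combinations(reqs, 2)
            if same_condition(a, b) and conflicting_effect(a, b)]


# Invented example requirements (assumption, not taken from the datasets of this work).
requirements = [
    Requirement("R1", Clause("vehicle speed", "exceeds 10 km/h"), Clause("doors", "lock")),
    Requirement("R2", Clause("vehicle speed", "exceeds 10 km/h"), Clause("doors", "unlock")),
    Requirement("R3", Clause("ignition", "is off"), Clause("doors", "unlock")),
]

print(candidate_contradictions(requirements))
```

Here only R1 and R2 are flagged, because they share a condition but demand different actions on the same variable. Exact string comparison is a deliberate simplification: real requirements would need the linguistic preprocessing that the following paragraph refers to.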

To accomplish this objective, natural language processing methods based on grammatical rules, machine learning, and deep learning models have been applied and thoroughly analyzed. Consequently, a new method, ALICE (Automated Logic for the Identification of Contradictions in Engineering), has been proposed. ALICE combines the logical analysis capabilities of symbolic AI with the data-driven approach of large language models (LLMs), leading to the accurate identification and classification of contradictions. The findings also demonstrate that this approach outperforms the sole use of either symbolic AI or LLMs. As a result, ALICE contributes to improved product development outcomes.

In conclusion, this work paves the way for future research in automated requirements engineering by offering a solution and proof of concept for detecting contradictions within complex requirements documents. By enabling the automated identification of contradictions, this research aims to enhance the overall quality of requirements, thereby fostering more precise communication, increased consistency, and reduced conflicts in product development.

Zusammenfassung (German Summary)

Alexander Elenga Gärtner

Automatic Detection of Contradictions in Requirements Specifications (Automatische Detektion von Widersprüchen in Anforderungsspezifikationen)

Technische Universität Berlin – Faculty of Mechanical Engineering and Transport Systems

Chair of Methods for Product Development and Mechatronics

June 2024

In product development, defining a complete, unambiguous, and contradiction-free target system is of crucial importance. During product development, extensive requirements documents are created to describe the desired functionalities and characteristics in detail. However, these documents often contain errors in the form of contradictions. This is due to the inherent complexity of the documents, which comprise thousands of individual requirements and result from interdisciplinary collaboration. Formulating them in natural language introduces additional complexity through potential ambiguities and inconsistencies arising from the different perspectives and interpretations of the stakeholders involved.

Identifying and resolving these contradictions manually is time-consuming and costly. The correction process requires a comprehensive review and analysis of every requirement to detect potential contradictions. Unresolved contradictions can lead to complications in later phases of product development, which can cause misunderstandings, conflicts, and inefficiencies and ultimately hinder the successful realization of the product or system.

The detection and resolution of contradictions in complex requirements documents is therefore a critical field of research and development. Introducing automated methods and tools for identifying and resolving these contradictions makes it possible to minimize the costs, complexities, and risks associated with manual error correction. In addition, automating this task enables contradictions to be identified in time and resolved in the early phases of product development. This avoids potential complications and improves the overall efficiency and effectiveness of the development process.

This work discusses scientific theories of contradiction and offers a comprehensive definition of the types of contradictions in requirements documents. With reference to the Aristotelian law of contradiction, specific subtypes of contradictions with differing criticality are identified for requirements engineering. The proposed methodology comprises formalizing requirements by extracting conditions, effects, variables, and actions, which leads to the development of a taxonomy. Analyzing these constituents by means of simple questions enables the detection of contradictions.

To achieve this goal, natural language processing methods based on grammatical rules, machine learning, and deep learning models were analyzed and applied. This led to the development of a new approach called ALICE (Automated Logic for the Identification of Contradictions in Engineering) for detecting contradictions in requirements. ALICE combines the analytical capabilities of symbolic AI with the data-driven approach of LLMs, which leads to accurate identification and classification of contradictions. The results show that this approach outperforms the sole use of either symbolic AI or LLMs for detecting contradictions. ALICE consequently contributes to improving the product development process.

In summary, this work paves the way for future research activities in requirements engineering by offering an automated solution and a proof of concept for detecting contradictions in complex requirements documents. By automating the identification of contradictions, this research aims to improve the overall quality of requirements, leading to more effective communication, increased consistency, and reduced conflicts in product development.

Table of Contents 
ABSTRACT
ZUSAMMENFASSUNG
TABLE OF CONTENTS
LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
1 INTRODUCTION AND MOTIVATION
1.1 MOTIVATION
1.2 CONCEPT AND HOW-TO-READ
2 THEORETICAL EMBEDDING
2.1 KNOWLEDGE: FROM KANT TO DIKW AND NLP
2.2 WISDOM: ONTOLOGIES OR LLMS?
2.2.1 Ontologies
2.2.2 Large Language Models in Knowledge Representation
2.2.3 Conclusion
2.3 PROPOSITIONAL LOGIC
3 FUNDAMENTAL RESEARCH ON DETECTING CONTRADICTIONS IN REQUIREMENTS: TAXONOMY AND SEMI-AUTOMATED APPROACH
3.1 ABSTRACT
3.2 INTRODUCTION
3.2.1 Problem
3.2.2 Contribution
3.3 FUNDAMENTALS
3.3.1 Formulation and building blocks
3.3.2 Contradictions
3.4 RELATED WORK
3.4.1 Classification of conflicts
3.4.2 Natural Language Processing for detecting conflicts
3.4.3 Ontologies for detecting conflicts
3.5 METHOD FOR DETECTING CONTRADICTIONS
3.5.1 Nomenclature
3.5.2 Contradictions – Subcategories
3.5.3 Process
3.6 MATERIALS AND RESULTS
3.6.1 Materials
3.6.2 Results
3.7 DISCUSSION
3.8 CONCLUSION
3.9 OTHER
4 AUTOMATED CONDITION DETECTION IN REQUIREMENTS ENGINEERING
4.1 ABSTRACT
4.2 INTRODUCTION
4.3 TERMINOLOGY
4.4 RELATED WORK
4.5 STUDY
4.5.1 Method
4.5.2 Data
4.6 RESULTS
4.6.1 Dataset 1
4.6.2 Dataset 2
4.6.3 Detection of constituents
4.6.4 Discussion of Validity
4.7 SUMMARY, CONCLUSION AND OUTLOOK
4.8 AUTHOR CONTRIBUTIONS
4.9 ACKNOWLEDGMENTS
5 AUTOMATED REQUIREMENT CONTRADICTION DETECTION THROUGH FORMAL LOGIC AND LLMS
5.1 ABSTRACT
5.2 INTRODUCTION
5.3 FUNDAMENTALS
5.3.1 Contradictions
5.3.2 Theoretical Background for NLP
5.4 RELATED WORK
5.4.1 Classifying Conflicts
5.4.2 Natural Language Processing for Detecting Contradictions
5.5 AUTOMATION METHOD
5.5.1 Method Fundamentals
5.5.2 Preprocessing
5.5.3 Questions
5.6 VALIDATION AND RESULTS
5.6.1 Data Analysis and Methodology
5.6.2 Results for Dataset 1
5.6.3 Results for Dataset 2
5.6.4 Application to large datasets
5.6.5 LLM Comparison
5.6.6 Criticality Assessment
5.6.7 Conclusion
5.7 LIMITS
5.7.1 Limits of LLMs for Contradiction Detection
5.7.2 Limits of Formal Logic
5.7.3 Threats to Validity
5.8 CONCLUSION AND OUTLOOK
5.9 APPENDIX
5.10 AUTHOR CONTRIBUTIONS
5.11 DATA AVAILABILITY
5.12 ACKNOWLEDGMENTS
5.13 COMPETING INTERESTS
6 DISCUSSION
6.1 COMPARISON OF THE RESULTS WITH THE LITERATURE
6.1.1 Taxonomy
6.1.2 Preprocessing and Part-of-Speech tagging
6.1.3 Interpretation Part 1
6.1.4 Interpretation Part 2
6.1.5 Conclusion
6.2 FROM METHOD TO MECHANISM: ENVISIONING ALICE AS A TOOL
6.2.1 Operational Workflow
6.2.2 Usability and Integration
6.3 DEVELOPMENT PROCESSES INTEGRATION
6.3.1 Product Requirements Specification Process
6.3.2 Process Models and Guidelines for Product Development
6.4 LIMITATIONS AND THREATS TO VALIDITY
6.4.1 Limitations of the Methodology
6.4.2 Threats to Validity
7 SUMMARY AND OUTLOOK
7.1 SUMMARY
7.2 OUTLOOK
8 PUBLICATION BIBLIOGRAPHY

List of Abbreviations

ACC     Accuracy
ADVB    Adverbial
AI      Artificial Intelligence
ALICE   Automated Logic for the Identification of Contradictions in Engineering
API     Application Programming Interface
DIKW    Data, Information, Knowledge, Wisdom
kNN     k-Nearest Neighbor
LLM     Large Language Model
LNC     Law of non-Contradiction
ML      Machine Learning
NB      Naïve Bayes
NLI     Natural Language Inferencing
NLP     Natural Language Processing
PoC     Proof of Concept
POS     Part-of-Speech
RE      Requirements Engineering
REC     Recall
        Subordinate-Clause / Sub-Clause
        Sensitivity
TPR     True Positive Rate

List of Figures

Figure 1-1: Dilemma of product development (Ehrlenspiel and Meerkamm 2017)
Figure 1-2: Contents and structure of this dissertation
Figure 2-1: DIKW pyramid (Ackoff 1989)
Figure 2-2: DIKW and Contradiction Detection
Figure 2-3: How to represent Ontologies (Sack and Alam 2020)
Figure 2-4: Working with LLMs
Figure 2-5: Data, Information, and Knowledge (Awad 2003)
Figure 3-1: Formulation and building blocks
Figure 3-2: Square of opposition
Figure 3-3: Contradictions
Figure 3-4: Classification of contradictions
Figure 3-5: Process overview
Figure 3-6: Building blocks for a Simplex Subaltern contradiction, consisting of two requirements
Figure 3-7: Process for Simplex Subaltern
Figure 3-8: Building blocks for an Alius Subaltern contradiction, consisting of two requirements
Figure 3-9: Process for Alius Subaltern
Figure 3-10: Building blocks for an Alius Contradictory contradiction, consisting of two requirements
Figure 3-11: Process for Alius Contradictory
Figure 3-12: Building blocks for an Alius Contrary contradiction, consisting of two requirements
Figure 3-13: Process for Alius Contrary
Figure 4-1: variable and action
Figure 4-2: code structure
Figure 4-3: confusion matrices for dataset 1 based on 313 requirements
Figure 4-4: confusion matrices for dataset 2 based on 300 requirements
Figure 4-5: condition/effect verbal expression detection
Figure 4-6: action/variable verbal expression detection
Figure 5-1: Contradictions relevant to RE
Figure 5-2: Nine types of contradictions for RE (Gärtner et al. 2022)
Figure 5-3: Modular decision tree for contradiction detection in RE
Figure 5-4: Showcase of Condition and Effect, as well as Variable and Action in a formal requirement
Figure 5-5: LLM prompt - pseudo code
Figure 5-6: GPT3 has enough knowledge to detect and explain the present contradiction
Figure 5-7: GPT3 is not able to properly detect the present contradictions. Firstly, the sentences are not conditions but mere sentences, and secondly, although the conditions can indeed be true at the same time, this would lead to contradicting effects.
Figure 5-8: Prompt for the first question
Figure 5-9: Prompt for the third question
Figure 5-10: Prompt for the fourth question
Figure 5-11: Condition Detection with GPT-3
Figure 5-12: Prompt for the seventh question
Figure 6-1: Procedure, based on Figure 2-2: DIKW and Contradiction Detection
Figure 6-2: Different semantic decompositions: (1) - Preum et al. (2017), (2) - dissertation at hand, (3) - Sarafraz (2011)
Figure 6-3: ALICE Workflow: Identifying conditions and contradictions in requirements engineering
Figure 6-4: ALICE’s structure (Gärtner and Göhlich 2023)

Figure 6-5: Product Requirements Specification Process based on (Bender 2020)
Figure 6-6: Process model according to VDI 2221 (Göhlich et al. 2021)
Figure 6-7: Tasks when developing requirements (own illustration based on Bender and Gericke (2021))
 
List of Tables 
Table 1-1: Impairment and Success factors of project execution (The Standish Group 2014)
Table 1-2: Refinements from the first paper to the second and third paper
Table 2-1: Most basic connectives (Sack and Alam 2020)
Table 2-2: How to model facts? (Sack and Alam 2020)
Table 3-1: Formalized contradictions
Table 3-2: Distribution
Table 5-1: extract from Dataset 1
Table 5-2: Results for Reference Dataset in the form of confusion matrices
Table 5-3: confusion matrices for the first dataset
Table 5-4: Results for Dataset 2 in the form of confusion matrices
Table 5-5: Different answers generated by GPT3, GPT3.5, GPT4 (22.03.2023) and LLaMA
Table 6-1: Possible cases of conflict (Preum et al. 2017)
Table 6-2: Possible cases of contradictions.

Introduction and Motivation 1

1 Introduction and Motivation

This chapter introduces the dissertation. It begins by exploring the research motivation and

provides an overview of the key concepts. It then explains how to navigate and understand

the dissertation's structure.

1.1 Motivation

Requirements are fundamental to the development process of most products. They serve as

a vital means of communication between stakeholders involved in developing solutions,

especially when they are being developed by different individuals or groups simultaneously or

successively. Errors in requirements can have grave consequences: a 1993 study identified

functional and interface requirements errors as the primary cause of safety-related software

errors in NASA's Voyager and Galileo spacecraft. Such errors can potentially result in

serious accidents (Lutz 1993).

Effective requirements management ensures that all stakeholders clearly understand the

desired product functionality, performance, and constraints. It helps align the development

efforts, minimizes misunderstandings, and reduces the risk of rework and project delays. This

is shown in Table 1-1, which presents the most common project impairment and success

factors, as identified by The Standish Group (2014). In requirements management, the

analysis reveals that incomplete requirements account for 13.1% of the cases, lack of user

involvement constitutes 12.4%, unrealistic expectations contribute to 9.9%, and changing

requirements and specifications represent 8.7%. Moreover, when examining the success

factors associated with requirements management, clear statements of requirements emerge

as a significant factor, accounting for 13.0% of the cases. Realistic expectations constitute

8.2% of the cases, and clear vision and objectives represent 2.9% of the successful outcomes.

Table 1-1: Impairment and Success factors of project execution (The Standish Group 2014)

| Project Impaired Factors | % of Responses | Project Success Factors | % of Responses |
|---|---|---|---|
| 1. Incomplete Requirements | 13.1% | 1. User Involvement | 15.9% |
| 2. Lack of User Involvement | 12.4% | 2. Executive Management Support | 13.9% |
| 3. Lack of Resources | 10.6% | 3. Clear Statement of Requirements | 13.0% |
| 4. Unrealistic Expectations | 9.9% | 4. Proper Planning | 9.6% |
| 5. Lack of Executive Support | 9.3% | 5. Realistic Expectations | 8.2% |
| 6. Changing Requirements & Specifications | 8.7% | 6. Smaller Project Milestones | 7.7% |
| 7. Lack of Planning | 8.1% | 7. Competent Staff | 7.2% |
| 8. Didn't Need It Any Longer | 7.5% | 8. Ownership | 5.3% |
| 9. Lack of IT Management | 6.2% | 9. Clear Vision & Objectives | 2.9% |
| 10. Technology Illiteracy | 4.3% | 10. Hard-Working, Focused Staff | 2.4% |
| Other | 9.9% | Other | 13.9% |

Legend:

Highlight: Factor directly or indirectly related to requirements management

Typically, requirements are documented in a specification sheet to capture all the requirements

for a development project. The primary objective of this document is to define a

comprehensive, unambiguous, and coherent target system without any contradictions.

However, the complexity and diversity of modern systems, such as mechatronic products,

necessitate distributed development across various systems and levels of aggregation of

product development. Consequently, the requirements for each system and level are typically

managed in separate documents, making it challenging to establish correlations between

them. Given these factors, it is unsurprising that contradictions and other inconsistencies are

often found in these documents (Dick 2017; Bender and Gericke 2021). This can significantly

affect project costs and timelines. Discovering and rectifying these issues late

in the development process often requires extensive rework, modifications, and additional

resources, leading to delays and increased expenses, as seen in Figure 1-1.

Additionally, the quality of the final product may be compromised, as conflicting requirements

can result in a system that fails to meet user needs effectively or lacks the desired functionality.

This can lead to customer dissatisfaction, potential reputational damage, and even financial

losses (Ehrlenspiel and Meerkamm 2017; Standish Group 1995; Giffin et al. 2009). Therefore,

it becomes clear that an automated method to detect such errors would be highly beneficial.

Figure 1-1: Dilemma of product development (Ehrlenspiel and Meerkamm 2017)

To this day, in most cases, requirements are expressed in natural language (Luisa et al. 2004),

which is why Natural Language Processing (NLP) plays a crucial role in handling large sets of

requirements documents. NLP is a subfield of artificial intelligence (AI) concerned with the

interaction between computers and humans using natural language. The development of NLP

can be traced back to the 1950s, when the idea of using computers to analyze and understand

language was first proposed by Chomsky (1957). However, it was not until the 1980s that significant

progress was made when researchers developed various new techniques and algorithms for

analyzing and generating natural language. A significant breakthrough during this time was the

development of statistical machine translation systems, which used large amounts of data to

improve the quality of translations (Foote 2019). In the mid-2000s, researchers began to focus

on developing more sophisticated natural language understanding systems. These systems

were designed to extract meaning and context from text rather than simply analyzing its surface

features. A notable example of this was the development of sentiment analysis systems, which

could analyze the emotional content of a text (Mäntylä et al. 2018).

One of the most significant breakthroughs in recent years has been the emergence of Large

Language Models (LLMs), notably ChatGPT. Enabled by large datasets and powerful computing

resources, they have revolutionized the field of NLP. LLMs are machine

learning models trained on massive amounts of text data using deep neural networks. They

can generate human-like responses to natural language queries and translate text from one

language to another. The emergence of LLMs has led to significant advances in several NLP

applications, such as speech recognition, sentiment analysis, and machine translation

(Bowman 2023).

In summary, NLP methods make it possible to extract and parse textual information from

requirements documents, which helps to gain insights, detect conflicts, and improve

the accuracy and efficiency of requirement analysis processes.

1.2 Concept and How-to-Read

Three key concepts from the study of language and information processing serve as a common

thread for this thesis: syntax, semantics, and taxonomy. Syntax refers

to the rules governing the structure of sentences and phrases in a language. It involves

understanding how words and phrases are arranged to form meaningful expressions. In

English, "[…] the main device for showing the relationship among words is word order"

(Britannica, The Editors of Encyclopaedia 2023). Conversely, semantics refers to the meaning

of words, phrases, and sentences (Duden 2023a). It involves understanding how words and

phrases convey meaning and how they relate to each other in a sentence. Another essential

concept is taxonomy, which refers to the segmentation and classification of linguistic units

(Duden 2023b). It involves creating a categorization system to group similar things based on

their properties.

Together, these concepts form the foundation of language and information processing. They

enable us to communicate effectively and efficiently, understand and organize information, and

create systems to reason and learn independently. The emergence of large language models

and other advanced technologies has dramatically expanded language and information

processing potential.

Other terms important in this context are thesauri and ontologies. A thesaurus helps to find

synonyms or antonyms. Ontologies define concepts and relationships in a domain (Biagetti

2020). Initially, the application of thesauri and ontologies was considered for the third paper.

However, following the outcomes detailed in section 2.2, Wisdom: Ontologies or LLMs, the

decision was made to employ LLMs instead.

This cumulative dissertation presents a comprehensive approach to detecting contradicting

requirements. It comprises three papers, prefaced by an introduction and encapsulated by a

discussion, as depicted in Figure 1-2. Section 1 introduces the topic, detailing the motivation

and conceptual framework, and explains how to navigate this dissertation. Section 2 delves into

various methodologies for representing knowledge.

The first publication in Section 3 establishes the various contradictions that may arise in

requirements documents, serving as the basis for subsequent publications. The second

publication in Section 4 introduces a crucial step toward automated contradiction detection,

offering a rule-based method for identifying conditionals and pseudo-grammatical elements

within requirements. The third publication in Section 5 incorporates LLMs to detect

contradictions between requirements by combining the strengths of both symbolic AI and

LLMs. While symbolic AI is adept at detecting conditions and their effects within requirements,

LLMs excel at understanding the nuances of natural language and identifying subtle

contradictions that may not be immediately apparent. Together, these two AIs form a powerful

tool to detect contradictions in requirements automatically. Importantly, this approach is

grounded in the theoretical foundation established in the first publication, which enables

systematically identifying and addressing the different types of contradictions with varying

levels of criticality.

Figure 1-2: Contents and structure of this dissertation

Paper 1: Syntax and semantics of requirements and taxonomy of contradicting requirements.

It establishes the theoretical framework for the dissertation and is presented in a

modular format (Gärtner et al. 2022).

Title: Fundamental research on detecting contradictions in requirements: Taxonomy

and semi-automated approach

Paper 2: Adjustment of the first paper’s taxonomy and automated syntax, semantics, and

taxonomy detection – called constituent analysis. It introduces symbolic AI, mainly

based on grammatical rules, and is the foundation of the automation presented in the

third paper. The results form the basis of the formal logic used in the third paper

(Gärtner et al. 2023).

Title: Automated Condition Detection in Requirements Engineering

Paper 3: Extension of the symbolic AI from the second paper and integration of LLMs, resulting

in a successful, automated detection of contradicting requirements (Gärtner and

Göhlich 2024a).

Title: Automated Requirement Contradiction Detection through Formal Logic and

LLMs

Section 6 discusses the development prospects for a software tool named ALICE, an acronym

for Automated Logic for the Identification of Contradictions in Engineering. This exploration is

twofold: first, the feasibility of developing such a tool, and second, its potential applications in

the development process. Though not a component of this dissertation, the research

encompasses a proof-of-concept (PoC) for ALICE and a fourth paper published within the

scope of this work. The PoC is an executable Python script following the methodology this

research delineated. The additional paper focuses on practically implementing contradiction

detection within the development process. Both are discussed in this section.

To rapidly gain a comprehensive understanding of the dissertation, it is acceptable to focus on

the third paper, which offers a condensed overview of the theoretical definitions and processes

of the first and second papers in the form of a decision tree. However, to fully comprehend the

development of the decision tree and for possible future adaptations, it is recommended to read

the papers in their numbered order.

It should be noted that the terms (taxonomy) used in the first paper were modified for the

second and third papers based on practical findings. In order to prevent any possible confusion

when reading this dissertation, these modifications will also be outlined here (see Table 1-2),

in addition to being discussed in the second paper, where the definitions and rationale for the

modifications can be found.

Table 1-2: Refinements from the first paper to the second and third paper

| Paper 1 | Paper 2 and Paper 3 |
|---|---|
| Cause | Condition |
| Effect | Effect |
| Variable | Variable |
| Condition | Action |

Theoretical Embedding 6

2 Theoretical Embedding

Section 2 presents a theoretical embedding of the underlying methods used in the papers.

Section 2.1 explores how knowledge is created, as requirements documents are essentially a

specific form of knowledge management that can be analyzed using various methods. These

findings lead to section 2.2, where two prominent approaches for this task are compared.

Finally, section 2.3 examines the theoretical basis of the formal syntax used in the papers.

2.1 Knowledge: From Kant to DIKW and NLP

The ancient Greek philosopher Aristotle, whose law of non-contradiction (MP: 1005b)

is the basis for the first paper, contributed significantly to the development of epistemology

(the theory of knowledge, German: Erkenntnistheorie). He believed that knowledge is acquired through

experience and observation (Hauser 2019). Immanuel Kant built upon the works of Aristotle to

develop his own epistemological framework in 1787. He argued that knowledge is a product

of both sensory experience and innate concepts, which he called 'categories.' He believed the

mind actively structures sensory data to produce knowledge (Kant 2011).

The DIKW framework, developed by Ackoff (1989), is a modern approach to knowledge

management that emphasizes the relationships between Data, Information, Knowledge, and

Wisdom. Although the works of Aristotle and Kant did not explicitly influence the framework, their

ideas on epistemology can be seen as its precursors, particularly Kant's notion of knowledge

consisting of hierarchical layers of sensory experience and innate concepts.

The DIKW framework represents a pyramid, where data forms the base, information builds

upon the data, knowledge builds upon information, and wisdom is at the top of the pyramid

(Rowley 2007), as seen in Figure 2-1:

Figure 2-1: DIKW pyramid (Ackoff 1989)

In the context of Natural Language Processing, the DIKW pyramid has become an essential framework

for understanding the processing of textual data. NLP techniques extract meaning and insights

from unstructured data such as text, speech, and images. Unlike structured data, which can

be organized in a tabular format with clearly defined fields and relationships, unstructured data

does not have a predetermined organization or standardized format. Text, speech, and images

often contain free-form expressions and visual content that do not follow a strict structure.

DIKW provides a structure for understanding the different levels of information that can be

derived from such data.

At the pyramid's base, NLP algorithms extract information from raw data, identifying individual

sentences and primary constituents such as subjects, verbs, or objects. 'The difference

between data and information is functional, not structural.' (Ackoff 1989).

Knowledge, the next level in the DIKW hierarchy, is derived from information by making

connections and drawing inferences (conclusions) between different pieces of information. In

NLP, this involves using techniques such as semantic analysis or machine learning to identify

patterns and relationships between different pieces of information, such as the condition

expressed in a sentence.

Finally, at the top of the pyramid is wisdom, which represents the highest level of understanding

and insight that can be derived from data. Wisdom is not just about having access to

information or knowledge but about having the ability to apply that information and knowledge

in a meaningful way, which, in our case, would correspond to the realization that two

statements contradict each other. In NLP, wisdom can be achieved by developing intelligent

systems that can reason.

The development process and milestones for finding contradictions in requirements can be

mapped to the DIKW pyramid, as depicted in Figure 2-2.

Figure 2-2: DIKW and Contradiction Detection
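The mapping in Figure 2-2 can be illustrated with a minimal Python sketch. It is not the implementation used in the papers: the function names (`to_information`, `to_knowledge`, `contradicts`), the regular expressions, and the naive negation check are assumptions made purely for illustration, and the two example requirements are invented.

```python
import re
from typing import Dict, List, Optional

def to_information(data: str) -> List[str]:
    """Data -> Information: split a raw string into sentences."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", data) if s.strip()]

def to_knowledge(sentence: str) -> Optional[Dict[str, str]]:
    """Information -> Knowledge: extract condition and effect from 'If <condition>, <effect>.'"""
    match = re.match(r"if\s+(.+?),\s*(.+?)[.!?]?$", sentence, re.IGNORECASE)
    if match is None:
        return None
    return {"condition": match.group(1).lower(), "effect": match.group(2).lower()}

def _without_negation(effect: str) -> str:
    """Hypothetical helper: drop the word 'not' so negated and plain effects can be compared."""
    return effect.replace(" not ", " ")

def contradicts(a: Dict[str, str], b: Dict[str, str]) -> bool:
    """Knowledge -> Wisdom (toy version): same condition, one effect negates the other."""
    if a["condition"] != b["condition"]:
        return False
    return a["effect"] != b["effect"] and _without_negation(a["effect"]) == _without_negation(b["effect"])

raw = "If the door is open, the light shall be on. If the door is open, the light shall not be on."
sentences = to_information(raw)
knowledge = [k for k in map(to_knowledge, sentences) if k is not None]
print(contradicts(knowledge[0], knowledge[1]))
```

The sketch only recognizes a literal negation under an identical condition. Contradictions that depend on meaning rather than wording are exactly the cases for which the top layer of the pyramid needs more than such rules.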

2.2 Wisdom: Ontologies or LLMs?

As mentioned earlier, two methods will be compared for their ability to detect contradictions.

Ontologies, also referred to as knowledge graphs, are a proven method to handle knowledge

in natural language processing. However, in recent years, LLMs have gained significant

attention in this field. In the following section, both approaches will be discussed to understand

why, in this dissertation, LLMs were ultimately chosen to detect contradictions in requirements

documents.

2.2.1 Ontologies

Ontologies are vital to knowledge representation in computer science and artificial intelligence.

They are a formal, explicit specification of a shared conceptualization. In other words, an

ontology defines a set of concepts and their relationships to each other in a structured,

organized manner. It is used to represent knowledge in machine-readable form so that computers can

understand and reason about the relationships between concepts (Gruber 1993; Uschold and

Gruninger 1996).

A benefit of using ontologies is that they enable automated reasoning, which can lead to the

acquisition of wisdom. For example, an ontology could represent the relationships between

different functions and their effects on different components. Automated reasoning tools could

then be used to identify potential conflicts based on the known effects of the functions. Figure

2-3 shows an example from KIT (Sack and Alam 2020) of a simple ontology on a scientific

article about the effects of air pollution.

[Figure 2-2, levels from top to bottom: Wisdom (contradiction detection between sentences); Knowledge (condition/effect, action/variable); Information (sentences, grammar); Data (strings, characters)]

Figure 2-3: How to represent Ontologies (Sack and Alam 2020)

Ontologies serve as a foundational concept in knowledge representation across diverse

domains. Their applications extend to various fields, from healthcare to finance, manufacturing,

and natural language processing. For instance, in the healthcare domain, ontologies aid in

standardizing medical terminology, enabling more accurate diagnoses and treatment

recommendations (Dimitrieski et al. 2016). In finance, ontologies are utilized for risk

assessment and fraud detection by representing complex financial relationships (Kingston et

al. 2004). In manufacturing, supply chain management optimization benefits from ontologies through

the structured representation of product and process information (Scheuermann and Leukel

2014). In natural language processing, ontologies facilitate the extraction of semantic meaning

from text, enhancing information retrieval and understanding (Estival et al. 2004).

One of the fundamental advantages of ontologies is their ability to enable automated

reasoning. Reasoning engines utilize ontologies to infer new knowledge and detect

inconsistencies. This process enhances decision-making by providing logical and context-

aware insights. For example, in a healthcare ontology, automated reasoning can help identify

potential drug interactions based on known interactions between drug components and patient

medical histories (Naz et al. 2020). Similarly, reasoning engines can optimize production

processes in industrial applications by identifying conflicting constraints in heterogeneous

databases (Ram and Park 2004; Nguyen 2007), i.e., different specification documents.
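As a rough illustration of how a reasoner can flag conflicts over an ontology, the following Python sketch stores a few subject-predicate-object triples and applies a single hand-written rule. The facts, the predicate names, and the function `find_conflicts` are invented for this example. Real reasoning engines work on formal ontology languages and far richer rule sets.

```python
# Toy knowledge base of (subject, predicate, object) triples; all facts are invented.
triples = {
    ("heater", "increases", "cabin_temperature"),
    ("air_conditioning", "decreases", "cabin_temperature"),
    ("seat_heating", "increases", "seat_temperature"),
}

# Pairs of predicates that are treated as opposite effects (an assumption of this sketch).
OPPOSITES = {("increases", "decreases"), ("decreases", "increases")}

def find_conflicts(kb):
    """Return (subject_a, subject_b, object) for subjects with opposite effects on one object."""
    conflicts = set()
    for s1, p1, o1 in kb:
        for s2, p2, o2 in kb:
            # s1 < s2 keeps each conflicting pair only once.
            if o1 == o2 and (p1, p2) in OPPOSITES and s1 < s2:
                conflicts.add((s1, s2, o1))
    return sorted(conflicts)

print(find_conflicts(triples))
```

The rule only finds conflicts between terms that were explicitly related beforehand, which is why the effort of building and maintaining these relations matters so much in practice.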

Ontology construction typically follows one of three approaches: manual construction,

cooperative construction (involving human intervention), and (semi-)automatic construction

(Al-Aswadi et al. 2020). The latter is also known as ontology learning and refers to the process

of creating an ontology in an automatic or semi-automatic way with limited human effort (Gillani

Andleeb 2015). For instance, text mining algorithms can extract domain-specific concepts and

their relationships from large textual corpora.

As ontologies grow in size and complexity, scalability and performance become critical

considerations. This becomes especially important when combining or interoperating different

ontologies. Strategies like mapping distributed ontologies and optimization techniques address

these challenges (Maedche et al. 2002).

However, using ontologies is challenging (Simperl and Tempich 2006; Ontology learning and

population 2008), notably because they are complex to build and maintain for large amounts of

data. According to Wong et al. (2012) and Al-Aswadi et al. (2020), achieving fully automatic

ontology construction remains a formidable challenge and is unlikely to be attainable at all.

2.2.2 Large Language Models in Knowledge Representation

Large Language Models (LLMs) have emerged as powerful tools for knowledge representation

and understanding. They can generate human-like

language and perform various tasks such as translation and text summarization while

displaying creativity in their output (Jun 2023). However, as pointed out by several authors,

including Zodhya (2023) and Saenko (2023), the training process consumes substantial

energy, ranging from 1000 to 1300 MWh. To put this in perspective, Germany had a per

capita electricity consumption of approximately 7 MWh, with a total electricity consumption of

547 TWh in 2022 (Pawlik 2022).
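The comparison can be made explicit with a short calculation. The Python sketch below uses only the figures quoted above. The function names are chosen for illustration, and the conversion assumes 1 TWh = 1,000,000 MWh. Under these figures, the training energy corresponds to the annual electricity consumption of roughly 143 to 186 people.

```python
TRAINING_MWH = (1000, 1300)   # training energy range quoted in the text
PER_CAPITA_MWH = 7            # approximate annual per capita consumption in Germany
TOTAL_TWH = 547               # total German electricity consumption in 2022

def person_years(mwh):
    """Number of people whose annual consumption equals the given energy."""
    return mwh / PER_CAPITA_MWH

def national_share(mwh):
    """Fraction of the national annual consumption (1 TWh = 1,000,000 MWh)."""
    return mwh / (TOTAL_TWH * 1_000_000)

for mwh in TRAINING_MWH:
    print(f"{mwh} MWh: about {person_years(mwh):.0f} person-years, "
          f"{national_share(mwh):.6%} of the national total")
```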

Sequential in nature, textual data relies heavily on the order of words and their contextual

relationships within a sentence to convey complete meaning. Traditional unsupervised

machine learning models overlook this inherent word order and context and grapple with the

constraints of fixed input sizes that are often relatively small. This is the rationale for

using deep learning approaches when dealing with textual data (Acheampong et

al. 2021). LLMs attain their capabilities by fitting billions of parameters to vast amounts of

data during their training phase. By repeating prediction tasks during the initial

training, such as successively predicting each next word in the text given the previous words,

and learning from its mistakes, the model learns to recognize similar textual contexts. In this way,

general knowledge of a language and the world is accumulated. This knowledge can then be

deployed for tasks of interest, such as question answering or text classification (Manning

2022). Looking at this, one could argue that LLMs may effectively exhibit inherent ontologies

of world knowledge.
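The idea of learning by predicting the next word can be shown with a deliberately simple stand-in. The Python sketch below counts which word follows which in a tiny invented corpus (a bigram model). It is not how an LLM works internally, since LLMs learn neural network parameters rather than counts, but the prediction task is the same. The corpus and the function name `predict_next` are assumptions of this example.

```python
from collections import Counter, defaultdict

# Tiny invented corpus of requirement-like sentences.
corpus = "the system shall open the valve . the system shall close the door .".split()

# Count how often each word follows each preceding word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent successor of `word` in the corpus, or None if the word is unseen."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

print(predict_next("system"))
print(predict_next("the"))
```

Even this toy model picks up that "system" is usually followed by "shall" in requirement-like text. Scaling the same prediction task to vast corpora and billions of parameters is what lets LLMs accumulate far broader regularities.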

This makes them valuable for chatbots, virtual assistants, and sentiment analysis tasks. They

enable computers to comprehend and respond to natural language queries, bridging the gap

between humans and machines (Kasneci et al. 2023; Acheampong et al. 2021). Furthermore,

LLMs can extract structured information from unstructured text, facilitating the conversion of textual

data into structured knowledge. This is particularly useful for data mining, content

summarization, and knowledge base population.

Beyond their technical applications, LLMs excel in non-technical use cases as well. For

example, they are powerful translation systems, enabling seamless communication across

multiple languages. They can automatically translate text from one language to another while

preserving context and meaning. Another interesting aspect is their capability to enhance

search engines by enabling semantic search, which goes beyond keyword matching to

understand the user's intent and context. This results in more relevant search results and

improved user experiences. Finally, LLMs can generate human-like text, making them valuable

for content-generation tasks such as automated article writing, creative writing assistance, and

code generation (Acheampong et al. 2021).

LLMs come in various architectures and sizes, each designed for specific tasks and scalability.

Some prominent types include Transformer Models, Recurrent Models, and Convolutional

Models. Since 2018, transformers such as Google’s BERT, introduced in 2018 (Devlin et al.),

OpenAI’s GPT3, introduced in 2020 (openai), and Meta’s LLaMA, introduced in 2023 (Touvron

et al.), have in many cases replaced convolutional and recurrent neural networks, until then the most popular

types of deep learning models for NLP applications (Merritt 2022; Manning 2022).

Transformer Models have gained widespread popularity for their ability to handle various tasks

related to understanding natural language. A transformer model is a neural network that learns

context and thus meaning by tracking relationships in sequential data like the words in this

sentence. Transformers use attention mechanisms to capture context and relationships between words

in a text, are pre-trained on vast amounts of text data, and can generate coherent and natural

language (Merritt 2022; Manning 2022).
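The core of an attention mechanism can be sketched in a few lines of plain Python. The example below implements scaled dot-product attention for small lists of vectors. It is a simplified illustration under stated assumptions: the token vectors are made up, and the learned projection matrices, multiple heads, and positional information of a real transformer are omitted.

```python
import math

def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each output is a weighted mix of the value vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by the square root of the dimension.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# Three made-up two-dimensional token vectors, used as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in attention(tokens, tokens, tokens):
    print([round(x, 3) for x in row])
```

Each output vector blends information from all tokens, weighted by how similar they are to the current one. This is the sense in which the model tracks relationships between the words of a sequence.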

Such models are designed to process natural language text, which is both the input and output

of the model, see Figure 2-4. The input text is tokenized, dividing it into individual words or

sub-words, and then fed into the model as a sequence. The model processes the input

sequence and generates an output sequence of words or sub-words that form a coherent text.

The output can be further processed or used for downstream NLP tasks such as language

translation or text summarization.

Figure 2-4: Working with LLMs
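The tokenization step described above can be sketched as follows. The example uses a simple word-level split as a stand-in for the sub-word tokenizers of actual LLMs. The regular expression, the example sentence, and the function names are assumptions for illustration only.

```python
import re

def tokenize(text):
    """Split text into lowercase word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def encode(tokens, vocab):
    """Map tokens to integer ids, adding unseen tokens to the vocabulary."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]

def decode(ids, vocab):
    """Map integer ids back to tokens."""
    inverse = {i: tok for tok, i in vocab.items()}
    return [inverse[i] for i in ids]

vocab = {}
tokens = tokenize("The valve shall open if the pressure exceeds 5 bar.")
ids = encode(tokens, vocab)
print(tokens)
print(ids)
print(decode(ids, vocab) == tokens)
```

The model itself only ever sees the sequence of ids. The text it generates is obtained by decoding the ids it outputs back into tokens.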

Despite their remarkable capabilities, LLMs have their fair share of challenges and limitations.

Firstly, they can inherit biases from their training data, potentially resulting in biased or unfair

outputs. For example, if a model is trained on data biased towards certain groups of people, it

may produce unfair or discriminatory results towards those groups. Secondly, the training and

operation of LLMs demand significant computational resources, rendering them inaccessible

to many researchers and organizations. Using pre-trained models and cloud technology,

combined with cooperative usage schemes in partnership with

institutions and companies, can serve as a starting point to address this challenge. Lastly, the models raise ethical

concerns, particularly about privacy, misinformation, and the potential generation of harmful

content (Kasneci et al. 2023; Merritt 2022).

The field of LLMs continues to evolve rapidly, with several noteworthy trends on the horizon.

Firstly, multimodal models capable of handling both text and other modalities like images and

audio are gaining traction, opening up new possibilities for knowledge representation and

understanding. Furthermore, researchers are actively exploring fine-tuning and transfer

learning techniques to adapt pre-trained models for specific tasks, reducing the need for

extensive training data. Lastly, there is a growing emphasis on ethical AI within the LLM

domain, with concerted efforts to address bias, ensure fairness, and enhance transparency in

their systems.

Large Language Models have revolutionized knowledge representation and natural language

understanding. They show potential in detecting contradictions in text thanks to their ability to

capture complex linguistic patterns and relationships between words and phrases, and to their

inherent ontology of common knowledge. However, they have yet to be trained on requirements

specifications, which are generally not publicly available. As shown in the third paper, they

therefore cannot, to this day, detect contradicting requirements on their own and cannot be solely

relied upon when working in this domain. The second and third papers present a purpose-built symbolic

AI that heavily preprocesses the data so that an LLM can then

analyze it.

2.2.3 Conclusion

While solutions already exist or can be programmed via symbolic AI for the first three layers of

the DIKW pyramid – as shown in the second paper – the main challenge lies in the top layer.

As stated by Awad (2003): 'Wisdom is the highest level of abstraction, with vision, foresight,

and the ability to see beyond the horizon.' In the preceding sections, two methods were

proposed to tackle this challenge, which could theoretically solve the issue. However, providing

a relation for every term in an ontology to any other term in that ontology is virtually impossible

since this would require an immense amount of manual work for each project. Large Language

Models possess this wisdom intrinsically and do not require individual training per project, as

demonstrated in the third paper. Therefore, the decision was made to use LLMs. The DIKW

pyramid adapted by Awad (see Figure 2-5) emphasizes that non-algorithmic solutions are

required for the upper levels of the pyramid, which excludes ontologies and suggests the use

of large statistical models.


Figure 2-5: Data, Information, and Knowledge (Awad 2003)

2.3 Propositional Logic

Propositional logic, or sentential logic, is a branch of mathematical logic that deals with

propositions or statements that are either true or false. The first paper uses this notation as its underlying syntax without formally introducing it. The syntax uses symbols to

represent propositions and connectives to combine them into more complex statements. The

most basic connectives are shown in Table 2-1:

Table 2-1: Most basic connectives (Sack and Alam 2020)

| Connective | Name | Intentional meaning |
|---|---|---|
| ¬ | negation | not |
| ∧ | conjunction | and |
| ∨ | disjunction | or |
| ⇒ | implication | if – then |
| ⇔ | equivalence | if, and only if, then |

'In propositional logic, the world consists simply of facts and nothing else (statements of assertions).' (Sack and Alam 2020). For example, let 𝑝 be the statement “It is raining” and 𝑞 be the statement “I am indoors”. Then the negation of 𝑝 is ¬𝑝, which means “It is not raining”. The conjunction of 𝑝 and 𝑞 is 𝑝 ∧ 𝑞, which means “It is raining and I am indoors”. The disjunction of 𝑝 and 𝑞 is 𝑝 ∨ 𝑞, which means “It is raining or I am indoors, or both”.

Table 2-2 shows simple and composed assertions, which do not need to be true in the real world.

Table 2-2: How to model facts? (Sack and Alam 2020)

| Simple assertion | Modeling |
|---|---|
| The moon is made of green cheese. | 𝑔 |
| It rains. | 𝑟 |
| The street is getting wet. | 𝑛 |

| Composed assertion | Modeling |
|---|---|
| If it rains, then the street will get wet. | 𝑟 ⇒ 𝑛 |
| If it rains and the street does not get wet, then the moon is made of green cheese. | (𝑟 ∧ ¬𝑛) ⇒ 𝑔 |
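The truth-functional reading of these assertions can be illustrated with a short truth-table sketch. It is only an illustration: the helper names `implies` and `truth_table` are assumptions introduced here, and the propositions 𝑟, 𝑛, and 𝑔 are treated as independent Boolean variables.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


def truth_table(formula, names):
    """Return one (assignment, value) pair per truth assignment of `names`."""
    rows = []
    for values in product([True, False], repeat=len(names)):
        env = dict(zip(names, values))
        rows.append((env, formula(**env)))
    return rows


# r: "It rains", n: "The street is getting wet",
# g: "The moon is made of green cheese"
def rain_wets_street(r: bool, n: bool) -> bool:
    return implies(r, n)                 # r ⇒ n


def cheese_rule(r: bool, n: bool, g: bool) -> bool:
    return implies(r and not n, g)       # (r ∧ ¬n) ⇒ g


for env, value in truth_table(cheese_rule, ["r", "n", "g"]):
    print(env, "->", value)
```

The composed assertion is false in exactly one row (it rains, the street stays dry, and the moon is not made of green cheese), which shows that an assertion can be modeled and evaluated without being true in the real world.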


Propositional logic can be used to analyze and reason about complex systems or situations by

breaking them down into smaller, formal, and more manageable objects and relations. It forms

the basis for more advanced forms of logic and reasoning, such as predicate logic and modal

logic.


3 Fundamental research on detecting contradictions in

requirements: Taxonomy and semi-automated approach

This article is an open access article distributed under the terms and conditions of the Creative

Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). The

content of this chapter was published:

• Authors: Alexander Elenga Gärtner, Tu-Anh Fay, Dietmar Göhlich. Fundamental Research

on Detecting Contradictions in Requirements: Taxonomy and Semi-Automated Approach.

• Publisher: Appl. Sci. 2022, 12(15), 7628 (This article belongs to the Special Issue

Requirements Engineering: Practice and Research)

• DOI: https://doi.org/10.3390/app12157628

• Published: 28 July 2022

• Author Contributions: Conceptualization, A.E.G., D.G. and T.-A.F.; methodology, A.E.G.;

validation, A.E.G.; formal analysis, A.E.G.; investigation, A.E.G.; resources, A.E.G. and

D.G.; data curation, A.E.G.; writing—original draft preparation, A.E.G.; writing—review and

editing, D.G. and T.-A.F.; supervision, D.G.

3.1 Abstract

Requirements documents can contain several thousand individual requirements. They must

be error-free to avoid unnecessary complications and costs in the later product development

stages. An important part of this is to identify contradictions between two requirements. The

first step is therefore to define what contradictions are and in what form they can occur in

requirement documents. In this paper, the scientific theories regarding contradictions are discussed with regard to their usefulness for the topic. In doing so, the Aristotelian Logic

proved to provide the best basis for an application in the Requirements Engineering context.

Based on this theory, we have created specific subtypes of contradictions to match them to

the requirements engineering field. The identification of these subtypes is done by a

formalization of the requirement sentences and a subsequent analysis by means of simple

questions. To validate the method, industrial requirement documents were searched for

contradictions. For each detected type of contradiction, we present an example of the detection

process. Thereby, we show that the method is easy to apply and may also be used by non-

specialists. Thus, our method provides a taxonomy as a basis for further research on

automated contradiction detection as well as on automated quality analysis of requirements

documents.

3.2 Introduction

Complete and error-free requirements specifications are crucial for effective product

development. One aspect of this is to ensure that the documents are contradiction-free. On

the way toward this, contradictions must first be defined in the Requirements Engineering (RE)

context to recognize and classify them. Subsequently, the quality of the requirements

specification can be determined and, depending on the class of the contradiction, a solution

can be pursued.

3.2.1 Problem

Requirements form the basis for project planning, risk management, acceptance testing, and

many other fields (Dick 2017). Requirements specifications that describe an entire system are

often written in an interdisciplinary manner. The partial results must be merged according to

their logical and temporal dependencies, to form the overall solution. This process involves a

risk of error, especially for complex systems, regarding the consistency of the partial solutions

within the overall solution (Bender and Gericke 2021). Therefore, it is not surprising that errors,

e.g., in the form of contradictions, are often found in these documents.


Empirical research on requirements quality focuses on improvement techniques, with very few

primary studies addressing evidence-based definitions and evaluations of quality attributes

(Montgomery et al. 2022b).

3.2.2 Contribution

In this paper, a proposal is made on what contradictions are in the RE context, how they can

be classified, and how they can be determined. The classification of distinctive categories

allows for a more consistent assessment of the quality of requirements documents, as different

types of contradictions have different criticality levels. Also, depending on the type of

contradiction, different approaches are needed to solve them. Finally, the proposed

standardized solution provides a partially automated method, which is the basis for a potential

fully automated contradiction detection.

Within our validation, examples from real requirements documents are used.

3.3 Fundamentals

The market study from Luisa et al. concluded that in most cases (95%) requirements

documents were expressed in Natural Language (Luisa et al. 2004), which is inherently

ambiguous (Babcock 2007) and must therefore be interpreted to a certain degree. This

interpretation can often be an undetected source of errors.

In this paper, we focus on contradictions between pairs of text-based requirements in tabular

form. Below, we will specify how these requirements should be phrased and how contradictions

are generally defined.

3.3.1 Formulation and building blocks

Ideally, requirements should be based on a specific scheme (Dick 2017): Requirement

Expression = Boilerplate + Placeholder values. A boilerplate for a typical non-causal

requirement shows the following form: The <stakeholder type> must be able to <capability>.

Another example for a boilerplate could look like this: If <operational condition (cause)>, the

<system> shall <function> not less than <quantity> <object>, e.g. If the fuel tank is empty, the

Flexray shall sustain communication not less than 1 h. Simplified, sentences are built with

<Cause> + <Effect> which in turn consist of variables and conditions, as shown in Figure 3-1:

Figure 3-1: Formulation and building blocks
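The scheme "Requirement Expression = Boilerplate + Placeholder values" can be sketched in a few lines of code. This is a minimal illustration, not part of the original method: the placeholder names and the helper `instantiate` are assumptions, and the boilerplate is shortened to match the Flexray example sentence.

```python
from string import Template

# Hypothetical boilerplate modeled on the example sentence; the placeholder
# names are illustrative and do not form a fixed vocabulary.
CAUSAL_BOILERPLATE = Template(
    "If $condition, the $system shall $function not less than $quantity."
)


def instantiate(boilerplate: Template, **placeholders: str) -> str:
    """Fill a boilerplate; raises KeyError if a placeholder value is missing."""
    return boilerplate.substitute(**placeholders)


requirement = instantiate(
    CAUSAL_BOILERPLATE,
    condition="the fuel tank is empty",   # cause
    system="Flexray",
    function="sustain communication",     # effect
    quantity="1 h",
)
print(requirement)
```

Keeping cause and effect as separate placeholder values is what later allows the two parts of a requirement to be compared independently.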

3.3.2 Contradictions

In this paper, we differentiate between the term contradiction – which occurs when a statement

is in opposition either with itself or an established fact – and the term contradictory, which we

will explain below. In the literature on Requirements Engineering, many definitions of

contradictions can be found, see Section 3.4 Related Work. To obtain the most generally valid and

scientifically accepted definition, we base our theory on the logical philosophy of Aristotle. The

foundation of his logic – also known as term logic, traditional logic, or formal logic – developed

in his work Metaphysics is the law of non-contradiction (LNC) (Horn 2018). There, he argues

that it is impossible that the same thing belongs and does not belong simultaneously in an

identical way to the same object (Aristoteles 1986). “The doctrine of the square of opposition

[as seen in Figure 3-2; note by the author] originated with Aristotle in the fourth century BC

and has occurred in logic texts ever since. Although severely criticized in recent decades, it is

still regularly referred to” (Parsons 2021) and will hence serve as a basis for our purposes.


Figure 3-2: Square of opposition

The relations can be described as follows:

• “Every S is P” and “Some S is not P” are contradictories.

• “No S is P” and “Some S is P” are contradictories.

• “Every S is P” and “No S is P” are contraries.

• “Some S is P” and “Some S is not P” are subcontraries.

• “Some S is P” is a subaltern of “Every S is P”.

• “Some S is not P” is a subaltern of “No S is P”.

Therefore, we have four main oppositions: contradictories, contraries, subcontraries, and

subalterns.

1. Contradictory opposites, e.g., “he is sick” / “he is not sick”, are mutually exhaustive and

mutually inconsistent. This means that one statement must be true and the other false or vice versa. They cannot both be true or both be false at the same time.

2. Contrary opposites, e.g., “it is black” / “it is white”, are also mutually inconsistent, but

not exhaustive. While they cannot both be true, they can both be false.

3. Subcontraries, e.g., “you can – if you want to – call in sick” / “you can – if you want to – not call in sick”, are mutually consistent. While they can both be true at the same time, they cannot both be false at the same time.

4. The statement “some people are sick” is the subaltern of “everybody is sick”, while the

latter is the superaltern of the former. If the superaltern is true, the subaltern must also

be true and if the subaltern is false, the superaltern must also be false.
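The relations listed above can be checked by brute force on a small finite domain. The following sketch is an illustration under one explicit assumption: the subject term S is never empty (existential import), as in the traditional reading of the square. The function names are chosen here for illustration only.

```python
from itertools import chain, combinations


def subsets(universe):
    """All subsets of `universe`, as tuples."""
    items = list(universe)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


def square_holds(universe) -> bool:
    """Check the six relations for every non-empty S and every P."""
    for s in subsets(universe):
        if not s:  # assumption: existential import, S is non-empty
            continue
        for p in subsets(universe):
            every = all(e in p for e in s)          # "Every S is P"
            no = all(e not in p for e in s)         # "No S is P"
            some = any(e in p for e in s)           # "Some S is P"
            some_not = any(e not in p for e in s)   # "Some S is not P"
            checks = [
                every != some_not,      # contradictories: exactly one is true
                no != some,             # contradictories: exactly one is true
                not (every and no),     # contraries: never both true
                some or some_not,       # subcontraries: never both false
                (not every) or some,    # superaltern true => subaltern true
                (not no) or some_not,   # superaltern true => subaltern true
            ]
            if not all(checks):
                return False
    return True


print(square_holds({"a", "b", "c"}))
```

The check enumerates every combination of S and P over a three-element universe; without the non-empty assumption for S, the contrary and subaltern relations would no longer hold in general.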

By these definitions, the four central kinds of opposition – contradictory, contrariety,

subcontrariety, and subaltern – are mutually inconsistent.

In addition to the contradictions considered so far, there are other types. Kesselring

differentiates between the Aristotelian LNC-contradictions, dialectic contradictions, and

antinomies (Kesselring 2013).

Dialectic contradictions are comparable to antagonisms or so-called “conflicts of goals”. For

example:

• The vehicle must have high performance

• The vehicle must have low consumption

They do not stand in a mathematical/logical conflict but are incompatible in practice.

Antinomies denote conceptual or propositional structures in which the truth value oscillates. A

famous example is: Plato says, "Socrates speaks the truth," and Socrates says, "Plato lies."

They are often confused with self-contradictions (Kesselring 2013).


In this paper, we will tackle the LNC conflicts, except for the subcontraries. As they can be

valid simultaneously, subcontraries are not contradictions that need to be resolved for the RE

work.

3.4 Related Work

In this section, we assembled related works in terms of classification of conflicts, detection of

antagonisms, natural language processing for detecting conflicts, and finally ontology-based

approach for detecting conflicts. In general, we saw a lack of real validation in these topics, as

it is difficult to find non-academic institutions that are willing to share their requirements-

documents for scientific analyses (Landhäußer and Körner 2017).

3.4.1 Classification of conflicts

Marneffe et al. suggest a classification of conflicts into antonymy, negation, and numeric mismatches (Marneffe, Rafferty, Manning 2008). Negations and numeric mismatches do not fit the LNC classification of Aristotle and can only be partially combined with it.

Lamsweerde et al. classify conflicts into nine categories (Lamsweerde, Darimont, Letier 1998):

1. Process-Level Deviation: Conflict between a process-level rule and a specific process

state.

2. Instance-Level Deviation: Inconsistency between a product-level requirement and a

specific state of the running system.

3. Terminology Clash: Usage of different terms for the same event

4. Designation Clash: Usage of the same term for different events

5. Structure Clash: Different explanations for a single real-world concept

6. Conflict: Two assertions are directly logically inconsistent

7. Divergence: Two assertions are indirectly (through a boundary condition) logically inconsistent

8. Competition: Particular case of the divergence

9. Obstruction: Another particular case of the divergence

This does not represent a classification with a consistent structure, since the system level is taken as the classification criterion in some categories and the context in others. Also, categories 7, 8, and 9 cannot be clearly differentiated.

Marneffe et al. propose a looser classification than ours. "Pairs such as 'Sally sold a boat to

John' and 'John sold a boat to Sally' are tagged as contradictory" (Marneffe, Rafferty, Manning

2008). In the context of requirements engineering though, this should not be interpreted as a

contradiction. This becomes clear in the following example: "Control unit 1 sends a signal to

control unit 2. Control unit 2 sends a signal to control unit 1." It becomes more complicated

when it says, "Control unit 1 sends the signal X to control unit 2. The control unit 2 sends the

signal X to the control unit 1.” This would indeed be a contradiction, but it would be classified

as a dialectic contradiction: theoretically, it is possible, but it wouldn't make any practical sense.

Guo et al. propose a classification into three basic conflict types - inconsistencies, inclusions,

and interlocks - which in turn can be divided into seven subcategories (Guo et al. 2021).

Inconsistencies are defined as contradictions between requirements that cannot both be

fulfilled at the same time. Compared to LNC, this could correspond to contradictories or

contraries as well as dialectical contradictions. Inclusions can correspond to both

contradictories and contraries. Interlocks can be compared with subalterns. This represents a

promising approach and to a large extent can be combined with LNC. In Section 3.5.2

Contradictions – Subcategories, parts of this classification are taken up, placed in the logical

context, and further refined.


3.4.2 Natural Language Processing for detecting conflicts

According to Zhao et al., most of the studies (67.08%) in the Natural Language Processing

domain combined with RE are “solution proposals, assessed by a laboratory experiment or an

example application, while only a small percentage (7%) are assessed in industrial settings”

(Zhao et al. 2021), so they rarely have a practical validation.

However, to the best of our knowledge, no Machine-Learning oriented studies tried to classify

or find contradictions. Many papers deal with classifying requirements, for example in

functional and non-functional requirements (Kurtanovic and Maalej 2017) or in security-related

requirements (Jindal et al. 2016).

3.4.3 Ontologies for detecting conflicts

Guarino et al. describe computational ontologies as “means to formally model the structure

of a system, i.e., the relevant entities and relations that emerge from its observation.” (Guarino

et al. 2009). It is required to conduct a mapping of statements to concepts and relationships.

Inconsistencies and opposing elements can be recognized this way (Sandhu and Sikka 2015).

This shows that ontology-based methods are not easy to apply. A certain amount of

preparatory work is needed, including system knowledge. The resulting advantage is that not

only LNC contradictions, but also dialectic contradictions can be detected.

In this paper, however, we want to focus on LNC contradictions and lay the foundation for automatically finding contradictions in the future, without requiring system knowledge or preparatory work. Ontologies require both.

3.5 Method for detecting Contradictions

The findings from Section 3.3 Fundamentals can be summarized as shown in Figure 3-3,

while dialectical contradictions, antinomies, and subcontraries – as already explained – will not

be considered:

Figure 3-3: Contradictions

In the following, we present a formal method by which contradictions can be identified and

classified. To have a meaningful application in the RE context, we first must subdivide the LNC

contradictions. Since many requirements are not just simple statements, we have added the

principle of cause and effect to LNC, as this is not specifically represented in this theory. With

this, our categories are still based on Aristotle but adapted to the requirements context. In Section 3.6, they will be validated with examples from the automotive sector.

3.5.1 Nomenclature

First, we must define a nomenclature, to be able to refer to it in the following sections.

Capital letters as 𝐴, 𝐵, 𝐶, and 𝐷:

• Are events that represent, for example, conditions

• Are always unequal

• Can occur simultaneously

• Do not depend on each other

Lowercase letters 𝑥 and 𝑦:

• Are variables


Lowercase letters 𝑐 and 𝑘:

• Are parameters

• Are unequal to each other

• Can occur in parallel: 𝑐 can be equal to 1 and at the same time 𝑘 equal to 2.

Operators

• $\overset{!}{=}$; $\overset{!}{<}$; $\overset{!}{>}$: must be equal, must be smaller, must be bigger

• ⇒: implies; if... then. E.g., $A \Rightarrow x \overset{!}{=} 1$ translates to “If 𝐴 is true, then 𝑥 must be 1”.

• ¬: not. E.g., The statement ¬𝐴 is true if and only if 𝐴 is false.

• ∧; ∨: and; or. E.g., The statement 𝐴 ∧ 𝐵 is true if 𝐴 and 𝐵 are both true; otherwise, it is

false. Another example is: The statement 𝐴 ∨ 𝐵 is true if 𝐴 or 𝐵 (or both) are true; if

both are false, the statement is false.
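One way to hold this nomenclature in code is a small pair of record types. The sketch below is an assumption made for illustration, not the data model used in the papers: the class names `Effect` and `Requirement` are hypothetical, and the "must be" operators are written in plain text with a trailing "!" (for example `=!`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Effect:
    variable: str   # e.g. "x"
    operator: str   # one of "=", "<", ">", read as "must be ..."
    value: str      # a parameter such as "k" or "c", or a literal like "1"


@dataclass(frozen=True)
class Requirement:
    effect: Effect
    cause: Optional[str] = None   # e.g. "A" or "A ∧ B"; None if non-causal

    def formalized(self) -> str:
        """Render the requirement in a plain-text form of the notation."""
        body = f"{self.effect.variable} {self.effect.operator}! {self.effect.value}"
        return body if self.cause is None else f"{self.cause} ⇒ {body}"


# "If A is true, then x must be 1"
print(Requirement(Effect("x", "=", "1"), cause="A").formalized())
# Non-causal requirement: "x must be smaller than c"
print(Requirement(Effect("x", "<", "c")).formalized())
```

Separating the optional cause from the effect mirrors the cause/effect building blocks and keeps non-causal requirements representable with the same type.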

3.5.2 Contradictions – Subcategories

The suggested categories are shown in Figure 3-4:

Figure 3-4: Classification of contradictions

The terms Simplex, Idem, and Alius are described below. It should be noted that requirements

can be formulated as “condition + conclusion” as well as inverted as “conclusion + condition”.

Simplex (lat. = simple) refers to contradicting requirements without conditions (non-causal):

• The car must be red: $x \overset{!}{=} k$.

• The car must be blue: $x \overset{!}{=} c$.

Idem (lat. = same) refers to contradicting causal requirements with the same conditions or pairs where only one requirement has a condition:

• If the customer wishes, the car must be red: $A \Rightarrow x \overset{!}{=} k$.

• If the customer wishes, the car must be blue: $A \Rightarrow x \overset{!}{=} c$.

Alius (lat. = different) refers to contradicting causal requirements with different conditions (causal):

• The car must be red if the customer wishes it to be: $A \Rightarrow x \overset{!}{=} k$.

• The car must be blue if the car has four doors: $B \Rightarrow x \overset{!}{=} c$.


The last example is a contradiction because the conditions of both requirements can be fulfilled

at the same time (they are independent of each other) and the conclusions would then

contradict each other. Also, this is an example where the condition and conclusion have been

inverted in order.

If one requirement is non-causal and the other is causal, the contradiction as a whole is said

to be causal.

"Contradictory" refers only to the effect, not to the requirement as a whole. If the effects are

contradictory but the requirements as a whole would be contrary, we still refer to them as

contradictory here.

The following Table 3-1 lists all types of contradiction in their formalized form. The column

“Examples for multiple conditions” shows examples of formalized requirements with multiple conditions. It is

not an exhaustive list of all possible multiple conditions:

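Table 3-1: Formalized contradictions

| Contradictions | Subcategory | Simple | Examples for multiple conditions |
|---|---|---|---|
| Contradictory | Simplex | $x \overset{!}{=} k$; $x \overset{!}{=} \neg k$ | |
| Contradictory | Idem | $A \Rightarrow x \overset{!}{=} k$; $A \Rightarrow x \overset{!}{=} \neg k$ | $A \wedge B \Rightarrow x \overset{!}{=} k$; $A \wedge B \Rightarrow x \overset{!}{=} \neg k$ |
| Contradictory | Alius | $A \Rightarrow x \overset{!}{=} k$; $B \Rightarrow x \overset{!}{=} \neg k$ | $A \wedge B \Rightarrow x \overset{!}{=} k$; $A \wedge C \Rightarrow x \overset{!}{=} \neg k$ |
| Contrary | Simplex | $x \overset{!}{=} c$; $x \overset{!}{=} k$ | |
| Contrary | Idem | $A \Rightarrow x \overset{!}{=} c$; $A \Rightarrow x \overset{!}{=} k$ | $A \wedge B \Rightarrow x \overset{!}{=} c$; $A \vee B \Rightarrow x \overset{!}{=} k$ |
| Contrary | Alius | $A \Rightarrow x \overset{!}{=} c$; $B \Rightarrow x \overset{!}{=} k$ | $A \wedge B \Rightarrow x \overset{!}{=} c$; $C \wedge D \Rightarrow x \overset{!}{=} k$ |
| Subaltern | Simplex | $x \overset{!}{<} c + k$; $x \overset{!}{<} c$ | |
| Subaltern | Idem | $A \Rightarrow x \overset{!}{<} c + k$; $A \Rightarrow x \overset{!}{<} c$ | $A \wedge B \Rightarrow x \overset{!}{<} c + k$; $A \wedge B \Rightarrow x \overset{!}{<} c$ |
| Subaltern | Alius | $A \Rightarrow x \overset{!}{<} c + k$; $B \Rightarrow x \overset{!}{<} c$ | $A \wedge B \Rightarrow x \overset{!}{<} c + k$; $C \wedge D \Rightarrow x \overset{!}{<} c$ |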

If a condition is composed of statements joined by “or”, it can be split into separate sentences as an intermediate step. This will be shown in Section 3.6.2.4 Alius Contrary. Each partial cause can then be considered separately with the effect. From $A \vee B \Rightarrow x \overset{!}{=} c$ follows $A \Rightarrow x \overset{!}{=} c$ and $B \Rightarrow x \overset{!}{=} c$.

This facilitates the comparison of requirements that consist of compound conditions.
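The splitting rule rests on the propositional equivalence between (𝐴 ∨ 𝐵) ⇒ 𝐸 and (𝐴 ⇒ 𝐸) ∧ (𝐵 ⇒ 𝐸). The sketch below first confirms this equivalence by enumerating all truth assignments and then applies the split to a formalized requirement. The function `split_or_condition` is a hypothetical helper, and it assumes a flat disjunction without parentheses.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication."""
    return (not a) or b


# (A ∨ B) ⇒ E is equivalent to (A ⇒ E) ∧ (B ⇒ E) for all truth values.
equivalent = all(
    implies(a or b, e) == (implies(a, e) and implies(b, e))
    for a, b, e in product([True, False], repeat=3)
)
print(equivalent)


def split_or_condition(cause: str, effect: str) -> list[tuple[str, str]]:
    """Split 'A ∨ B ⇒ effect' into one (cause, effect) pair per disjunct.

    Assumption: `cause` is a flat disjunction without nested parentheses.
    """
    disjuncts = [part.strip() for part in cause.split("∨")]
    return [(d, effect) for d in disjuncts if d]


for cause, effect in split_or_condition("A ∨ B", "x =! c"):
    print(f"{cause} ⇒ {effect}")
```

After the split, each partial requirement has a single cause and can be compared pairwise with other requirements.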


3.5.3 Process

The following Figure 3-5 shows how the types of contradiction can be recognized based on

simple, but specific questions.

Figure 3-5: Process overview

The first three questions refer to the effects of the requirements to be compared. The following

three questions refer to the causes, if any. The questions are elaborated on below. For a

contradiction to be identified, all questions must be answered as specified in the corresponding

column. The check mark stands for "yes" and the cross for "no". The circle stands for questions

that do not apply in that case. Condition 1 and condition 2 are the respective conditions of the

effects of requirement 1 and requirement 2. The same applies for cause 1 and cause 2.

Effect-related questions:

1. Are the variables from effect 1 and effect 2 the same or a subset of each other?

Two statements can contradict each other in the sense of LNC only if the variables, i.e. the

object in question, are the same or one is a part of the other, for example, table and table leg.

2. Does one effect include the other one?

If one effect includes the other, it could be a subaltern contradiction, for example, "... between 15m and 30m" and "... between 20m and 22m". The range of the second statement is included in the range of the first statement; therefore, the second statement is the superaltern of the first.

3. Are effect 1 and effect 2 mutually exhaustive and mutually inconsistent?

This question aims at finding contradictory opposites, for example, “the car is ready” / “the car

is not ready”. If one is true, the other must be false and vice versa.

Cause-related questions:

4. Is there a condition?

If there is a condition, any form of Simplex-contradiction can be excluded.

5. Can cause 1 occur at the same time as cause 2?

Two statements can only contradict each other if they can theoretically occur at the same time.

The statements “If it rains, …” and “If it does not rain, …” cannot occur at the same time and

are therefore not contradicting each other. If “it rains” in the first statement were to be replaced

with “it’s hot outside”, the two statements could theoretically contradict each other.


6. Are cause 1 and cause 2 the same?

This question simply aims at detecting Idem-contradictions, which must have the same cause,

for example, “If I am here, you are there” and “If I am here, you are here”.
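The six questions can be read as a small decision procedure. The sketch below is one possible reading of the question descriptions above, not the authors' implementation: the exact answer pattern per contradiction type is defined in Figure 3-5, so the mapping, the function name `classify`, and its parameter names are assumptions. In particular, falling back to "Contrary" assumes that the caller has already established that the two effects differ.

```python
from typing import Optional


def classify(
    same_variable: bool,                 # question 1
    one_includes_other: bool,            # question 2
    exhaustive_and_inconsistent: bool,   # question 3
    has_condition: bool,                 # question 4
    causes_can_coincide: bool = True,    # question 5 (causal pairs only)
    same_cause: bool = False,            # question 6 (causal pairs only)
) -> Optional[str]:
    """Return a contradiction type such as 'Alius Contrary', or None."""
    if not same_variable:
        return None  # different objects cannot contradict in the LNC sense

    if one_includes_other:
        opposition = "Subaltern"
    elif exhaustive_and_inconsistent:
        opposition = "Contradictory"
    else:
        opposition = "Contrary"  # assumption: the effects are known to differ

    if not has_condition:
        return f"Simplex {opposition}"
    if not causes_can_coincide:
        return None  # the causes exclude each other, so no contradiction
    return f"{'Idem' if same_cause else 'Alius'} {opposition}"


# Two unconditional bounds on the same variable
print(classify(True, True, False, False))
# Different, compatible causes with differing effects on the same variable
print(classify(True, False, False, True, True, False))
```

The answers are passed in as Booleans because, in the semi-automated approach, a person answers the questions; the function only combines the answers.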

3.6 Materials and Results

In this section, first, the underlying data set for the validation is explained. Afterward, the

contradiction types defined above are each validated using an example from the data set, where one was found.

The dataset was analyzed by hand. The document was read through to find all existing

contradictions. Not only contradictions, but also duplicates, repetitions, ambiguities and other

conflicts were found. For a complete and automated application over a large data set, see the Conclusion section.

3.6.1 Materials

The data set consists of several interrelated requirements documents. The originator is the

company IAV GmbH (Berlin, Germany), which was kind enough to make the documents

available. The goal was to create a complete requirements package for the development of E-

buses, which are in use today. The document consists of about 3500 functional and non-

functional requirements, from system to software level. The original language of the documents

is German; they were translated into English. For confidentiality reasons, signal names are

anonymized by using square brackets.

3.6.2 Results

Contradictory connections are counted as one contradiction. In other words: every

contradictory pair is counted as a single contradiction.

Of a total of 6500 objects, 3500 were requirements. Besides the other conflicts mentioned above, 49 (1.35%) LNC-contradictions were found. However, it should be noted that the contradictions were not evenly distributed across all levels. 46 of the 49 contradictions were found

at the software level, where they account for 2.53% of all requirements. The distribution of the

different contradiction types is displayed in Table 3-2.

Table 3-2: Distribution

Simplex Subaltern | Alius Subaltern | Alius Contradictory | Alius Contrary

These figures must be viewed with caution, as the analysis was done manually, and it is likely

that further inconsistencies were overlooked.

We did not find any contradictions of the following types: Simplex and Idem Contradictories, Idem Contraries, and Idem Subalterns. This will be reflected on in the Discussion section. In the

following, we explain the method using one example each from the requirements documents.

3.6.2.1 Simplex Subaltern

The two selected requirements are:

1. The safe state must be reached within 1000 ms.

2. The safe state must be reached within 800 ms.


The building blocks are shown in Figure 3-6:

Figure 3-6: Building blocks for a Simplex Subaltern contradiction, consisting of two

requirements

Its formalized form is:

$$x \overset{!}{<} k \tag{1}$$

$$x \overset{!}{<} c \tag{2}$$

$$\text{while } c < k \tag{3}$$

where “safe state” is 𝑥, “1000 ms” is 𝑘 and “800 ms” is 𝑐.

The questions presented in our methodology can then be answered as shown in Figure 3-7:

Figure 3-7: Process for Simplex Subaltern

The questions were answered as given in the column for Simplex Subaltern contradictions.

The last two cause-questions did not need to be answered, because the Simplex-

contradictions do not have causes.
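Why the two bounds form a subaltern pair can be illustrated numerically: every value that satisfies the tighter bound also satisfies the looser one, but not the other way round. The following sketch uses the two limits from the example; the helper name and the sampled reaction times are illustrative assumptions.

```python
def tighter_implies_looser(tight: float, loose: float, samples) -> bool:
    """True if every sample below `tight` is also below `loose`."""
    return all(x < loose for x in samples if x < tight)


k_ms, c_ms = 1000, 800          # "within 1000 ms" and "within 800 ms"
samples = range(0, 1201, 50)    # illustrative reaction times in ms

# Meeting the 800 ms bound always meets the 1000 ms bound.
print(tighter_implies_looser(c_ms, k_ms, samples))
# The reverse does not hold: 900 ms meets only the 1000 ms bound.
print(tighter_implies_looser(k_ms, c_ms, samples))
```

In practice this means that only the tighter requirement constrains the design, so the looser one is either redundant or signals an unresolved disagreement about the intended limit.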

3.6.2.2 Alius Subaltern

The two selected requirements are:

1. If the actual heater stage CbnHeatg_[…] > 0, the requested pump power

CbnHeatg_SpOfCooltPmp must be limited by the parameter

CbnHeatg_TrigForDutyCycOf[…].


2. If BattChrgnMngt_MsgVld[…] = false, the requested pump power

CbnHeatg_SpOfCooltPmp must be limited to 20%.

The parameter CbnHeatg_TrigForDutyCycOf[...] is initialized elsewhere with 80%. Therefore,

we have a similar case as above, only this time there are conditions. The building blocks are

shown in Figure 3-8:

Figure 3-8: Building blocks for an Alius Subaltern contradiction, consisting of two

requirements

This results in:

$$A \Rightarrow x \overset{!}{<} k \tag{4}$$

$$B \Rightarrow x \overset{!}{<} c \tag{5}$$

$$\text{while } c < k \tag{6}$$

where “If the actual heater stage CbnHeatg_[…] > 0” is 𝐴, “If BattChrgnMngt_MsgVld[…] = false” is 𝐵, “CbnHeatg_SpOfCooltPmp” is 𝑥, “CbnHeatg_TrigForDutyCycOf[…]” is 𝑘 and “20%” is 𝑐.

The questions presented in our methodology can then be answered as shown in Figure 3-9:

Figure 3-9: Process for Alius Subaltern
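For causal pairs such as this one, question 5 (whether both causes can occur at the same time) can be approximated by a brute-force satisfiability check. The sketch below is an illustration under a strong assumption: causes are modeled as Boolean functions over atomic propositions that are independent of each other. The atom names are hypothetical.

```python
from itertools import product


def can_coincide(cause_1, cause_2, atoms) -> bool:
    """True if some truth assignment of `atoms` makes both causes true."""
    for values in product([True, False], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if cause_1(env) and cause_2(env):
            return True
    return False


# Hypothetical atoms: independent facts about the environment
atoms = ["rain", "hot"]


def rains(env):
    return env["rain"]


def does_not_rain(env):
    return not env["rain"]


def is_hot(env):
    return env["hot"]


print(can_coincide(rains, does_not_rain, atoms))   # mutually exclusive causes
print(can_coincide(is_hot, does_not_rain, atoms))  # independent causes
```

In the example above, the heater-stage condition and the message-validity condition would be treated as two independent atoms, so both causes can hold at once and the differing limits can collide.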


3.6.2.3 Alius Contradictory

It gets more complicated when getting to the following contradictories:

1. Suitable potential equalization is required for all conductive covers or housings of all

HV components.

2. If additional external conductive covers or housings are fitted over covers or housings consisting of solid insulating materials, potential equalization is not required for these.

By considering the context, it becomes clear that the demonstrative “these” in the second sentence is a variable 𝑦. It refers to “covers or housings consisting of insulating materials” and not to “covers or housings” or “solid insulating materials”. However, the variable 𝑥 of the first sentence is “covers or housings”, which means that 𝑦 ∈ 𝑥.

Therefore, the building blocks are as shown in Figure 3-10:

Figure 3-10: Building blocks for an Alius Contradictory contradiction, consisting of two

requirements

The formalized form is:

$$x \overset{!}{=} k \tag{7}$$

$$A \vee B \Rightarrow y \overset{!}{\neq} k \tag{8}$$

$$\text{while } y \in x \tag{9}$$

where “conductive covers or housings of all HV components” is x, “potential equalization” is k,

“additional external conductive covers or housings are fitted over covers” is A and “housings

consisting of solid insulating materials” is B. “these” is y and is actually a subset of x. It denotes

“conductive covers or housings of all HV components with additional external conductive

covers or housings fitted over covers or housings consisting of solid insulating materials”.
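The role of the subset relation can be made concrete with sets. In the sketch below, the component names are invented for illustration and do not come from the requirements documents; the point is only that the objects in 𝑦 are the ones for which both requirements apply once the condition of the second requirement holds.

```python
# Hypothetical component names, invented for illustration only.
x = {"inverter housing", "battery cover", "charger housing"}  # all conductive covers or housings
y = {"battery cover"}  # those with an additional external conductive cover


def conflicting_objects(required_for: set, not_required_for: set) -> set:
    """Objects for which k is both required and not required."""
    return required_for & not_required_for


print(y <= x)                      # y is a subset of x
print(conflicting_objects(x, y))   # objects affected by both requirements
```

Because 𝑦 lies entirely within 𝑥, the first question of the process (same variable or a subset) is answered with "yes" even though the two sentences name their objects differently.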


After the formalized form has been determined, filling in the table works as usual, as shown in

Figure 3-11:

Figure 3-11: Process for Alius Contradictory

3.6.2.4 Alius Contrary

The two selected requirements are:

1. If the value of the signal ComVehFrnt_ChrgnCur[…] exceeds the value of 0 (A), the

signal Chrgn[…] must be set to TRUE.

2. If the parameter ChrgnCurChk_SubVal[…] is set to TRUE, the signal Chrgn[…]

corresponds to the parameterizable value ChrgnCurChk_SubValChrgn[…], otherwise,

the signal is forwarded unchanged.

To apply this method, the second condition (“otherwise”) of the second requirement should be moved to a separate requirement. The second requirement is thus split into two parts, each of which can be checked separately against other requirements for contradictions. Accordingly, our adapted requirements read as follows; we use 2.1 in the further analysis:

2.1 If the parameter ChrgnCurChk_SubVal[…] is set to TRUE, the signal Chrgn[…] corresponds to the parameterizable value ChrgnCurChk_SubValChrgn[…].

2.2 If the parameter ChrgnCurChk_SubVal[…] is not set to TRUE, the signal Chrgn[…] is forwarded unchanged.

Then, the building blocks are as shown in Figure 3-12:

Figure 3-12: Building blocks for an Alius Contrary contradiction, consisting of two

requirements

The formalized form results in:

$$A \Rightarrow x \overset{!}{=} c \tag{10}$$

$$B \Rightarrow x \overset{!}{=} k \tag{11}$$

where “the value of the signal ComVehFrnt_ChrgnCur[…] exceeds the value of 0 (A)” is A, “the parameter ChrgnCurChk_SubValForChrgn[…] is set to TRUE” is B, “Chrgn[123]” is x, “TRUE” is c, and “ChrgnCurChk_SubValChrgn[…]” is k.

The questions presented in our methodology can then be answered as shown in Figure 3-13:

Figure 3-13: Process for Alius Contrary

3.7 Discussion

It is important to note that we did not find any Idem-contradictions and only one Simplex-contradiction. Idem-contradictions are so conspicuous that the requirements engineer would probably notice them immediately, since they would have to formulate exactly the same cause twice with the same variables but different effects. The reason for the near absence of Simplex-contradictions is that the examined system is so complex that simple statements without conditions would not be sufficient to describe it precisely.

Besides the reasons mentioned, mistakes affecting internal validity could play a role in not finding certain contradiction types. The project documents contain about 3500 requirements, which results in $\sum_{k=0}^{n-1} k = 6\,123\,250$ theoretical pairwise combinations for $n = 3500$. We therefore cannot rule out the possibility that we missed Idem- or Simplex-contradictions.
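The number of theoretical combinations can be checked with a few lines of Python. The sketch assumes exactly n = 3500 requirements (the text states "about 3500") and that every unordered pair of requirements counts as one combination.

```python
from math import comb

n = 3500  # assumed exact value for "about 3500 requirements"

# Sum 0 + 1 + ... + (n - 1): requirement k can be paired with the k requirements before it.
pairs_by_sum = sum(range(n))

# Closed form of the same sum, and the binomial coefficient "n choose 2".
pairs_by_formula = n * (n - 1) // 2
pairs_by_comb = comb(n, 2)

print(pairs_by_sum, pairs_by_formula, pairs_by_comb)
```

All three expressions give the same value, so the sum in the text is simply the number of unordered requirement pairs.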

If the requirements are not formulated according to the guidelines, borderline cases can

certainly occur in which contradictions cannot be clearly assigned or even identified. This is

because language is often ambiguous and human interpretation is often needed. When it

comes to complex formulations, even common sense can reach its limits.

3.8 Conclusion

Especially in the early development phase, ambiguities are very common in requirements documents due to the use of natural language. In this paper, we examined contradictory requirements, which we defined using formal logic. In contrast to other papers, we did not classify contradictions according to our data set or our code, but according to a generally accepted, well-tested systematic model. We then created a classification tailored to RE, in which conditions and effects take a prominent role. Finally, we proposed a way to identify these contradictions using clear questions.

We have analyzed about 6500 objects, approximately 3500 of which were requirements. In

total, we were able to identify many different conflicts, 49 of which were LNC-related

contradictions that could be identified using our method. The majority of the detected

contradictions were of the type Alius Contrary. Furthermore, most of the contradictions were found

at the deeper system levels, namely those of the software requirements. This corresponds to

our expectations, since requirements on the higher levels are written less concretely and

describe the general functionality of the product. As a result, there is often no risk of

contradictions in the first place.

With our method, contradictions can be found in a semi-automated way. The classification into cause and effect, as well as into variable and condition, can be fully automated, for example by using Fischbach's parser (Fischbach et al. 2022). In this way, our method can be applied automatically up to the step "Building Blocks". In the subsequent formalization, the building blocks must be replaced by symbols and formulas. This step, however, is not automated. To the best of our knowledge, there are currently no methods available that allow for this. Therefore, further research is required, as mentioned in section 4.2 Future work. Once the formalization is done, answering the questions in Figure 3-5 is a simple, yet manual, task. This was shown with examples in section 5.2 Results.

When applying our method, a requirements reviewer no longer has to be familiar with requirements in general or with the topic of the document to recognize contradictions, as our method provides a simple recipe for detecting LNC-related contradictions.

Future work could entail automation, quality analysis and non-LNC-contradictions.

Requirements documents can become very extensive due to the necessary level of detail

(Göhlich and Fay 2021b). Therefore, an automated determination of contradictions would be

useful. The formalization of contradictions proposed in this paper provides strong implications

for automation, by serving as the basis for a fully automated contradiction-detection method.

The queries that would have to be made in such a code are already mathematically formulated

here.

We can also derive implications for an automated quality analysis. The classification into

different types of contradictions is an important step to quantify the quality of a requirements

document. The logical next step would be to assess the criticality of the contradiction. Based

on this, a meaningful key performance indicator could be determined. This would require

analyzing a large number of inconsistencies, to assess the impact on the product, as well as

any different resolution methods per type of contradiction. The greater the impact and the more

difficult the solution, the more critical the contradiction.

As we saw in section 1.4 Fundamentals, there are other types of contradictions besides LNC-contradictions that have not been discussed in this paper: dialectic contradictions and antinomies. In our opinion, dialectic contradictions cannot be detected by applying simple rules; instead, they require context and language comprehension. It might be possible to achieve results with a sufficiently large and clean data set and by using machine learning algorithms. Regarding antinomies, it should first be checked whether they occur at all in requirements documents. A solution for these contradictions would be similar to the one for dialectic contradictions.

3.9 Other

Author Contributions: Conceptualization, A.E.G., D.G. and T.-A.F.; methodology, A.E.G.; validation, A.E.G.; formal analysis, A.E.G.; investigation, A.E.G.; resources, A.E.G. and D.G.; data curation, A.E.G.; writing – original draft preparation, A.E.G.; writing – review and editing, D.G. and T.-A.F.; supervision, D.G. All authors have read and agreed to the published version of the manuscript.

Funding: We acknowledge support by the German Research Foundation and the Open Access Publication Fund of TU Berlin.

Data Availability Statement: “Not applicable”.

Conflicts of Interest: “The authors declare no conflict of interest.”

Automated Condition Detection in Requirements Engineering 29

4 Automated Condition Detection in Requirements Engineering

This article has been published in Proceedings of the Design Society

https://doi.org/10.1017/pds.2023.71. This version is free to view and download for private

research and study only. Not for re-distribution or re-use. © Alexander Elenga Gärtner. This is

an Open Access article, distributed under the terms of the Creative Commons Attribution-

NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/).

• Authors: Alexander Elenga Gärtner, Dietmar Göhlich, Tu-Anh Fay. Automated Condition

Detection in Requirements Engineering.

• Publisher: Cambridge University Press

• DOI: https://doi.org/10.1017/pds.2023.71

• Published: 19 June 2023

• Author Contributions: Conceptualization, A.E.G., D.G. and T.-A.F.; methodology, A.E.G.; validation, A.E.G.; formal analysis, A.E.G.; investigation, A.E.G.; resources, A.E.G.; data curation, A.E.G., D.G. and T.-A.F.; writing – original draft preparation, A.E.G.; writing – review and editing, D.G. and T.-A.F.; supervision, D.G. and T.-A.F.

4.1 Abstract

In product development, it is of great importance that a complete, unambiguous, and, as far as

possible, contradiction-free target system is defined. Requirements documents of complex

systems can contain several thousand individual requirements, derived in an interdisciplinary

manner and written in natural language by many different stakeholders. Hence, errors, in the

form of contradictions, cannot be completely avoided in these documents and today they must

be corrected manually with high effort.

This paper presents an important building block for automated contradiction detection and

quality analysis of requirements documents. We discuss the necessary identification of

conditions in requirements and the extraction of the verbal expressions associated with

condition and effect, respectively. We applied and analyzed natural language processing

methods based on grammatical versus machine learning models. The models have been

applied to 1,861 real-world requirements. Both approaches generate promising results, with

an accuracy partly over 98%. However, in structured specification texts, a grammatical model

is preferable due to lower effort in preprocessing and better usability.

4.2 Introduction

Requirements play a central role in the development process of virtually all products

(Loucopoulos 2005; Gericke and Blessing 2012) because they are usually a central basis of

communication when solutions are developed by several involved persons, areas or even

companies in parallel or successively (VDI-Guideline VDI 2221 Blatt 1:2019-11). It is a

widespread practice that requirements documents are formulated in a specification sheet,

which should contain the totality of the requirements within a development project (DIN DIN

69901-5:2009-01). In this context, it is of great importance that a complete, unambiguous and,

as far as possible, contradiction-free target system is defined (Bender and Gericke 2021).

However, specification sheets can contain several thousand individual requirements written in

natural language and are typically derived in an interdisciplinary manner. Additionally, the

variety of mechatronic products and the complexity of modern systems make distributed and

concurrent development at different aggregation levels of the product development process

indispensable. Typically, the requirements for each level are managed in different documents,

for the overall product in a product requirements document and for the subsystems in

component requirements documents (Göhlich et al. 2021). Therefore, it is not surprising that

errors, e.g., in the form of contradictions, are often found in these documents.

To the best of our knowledge, currently, neither a comprehensive automated contradiction

detection nor an automated quality analysis of industrial specification documents is available.

Our work aims at providing the necessary building blocks for an automated quality assurance

of specification documents. In a previous study (Gärtner et al. 2022), we presented different

types of contradictions that can occur in requirements specifications and developed a method

to classify contradictions. In this context, the question of whether the requirement includes a

condition or not plays a central role. In fact, conditions give the decisive hint as to whether statements

contradict each other. For example, the two statements x must be 4 and x must be 5 are

contradictory unless the conditions state otherwise, e.g. if y=3 and if y=4. Therefore, the

detection of conditions and extraction of their constituents is crucial for reaching the goal of an

automated contradiction detection. In this paper, we want to show how conditions can be

detected automatically and which verbal expressions form the condition. In detail, this paper

aims to:

• Identify all requirements in a specification that contain conditions.

• Identify the verbal expressions that form the condition.

• Evaluate which is the best-suited approach in this context: self-written rules or machine

learning.

• Create a basis to identify contradictions automatically. According to our previous study,

the conditions and effects must be compared regarding their verbal expressions to find

contradictions (Gärtner et al. 2022).
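The role of conditions in the example above (x must be 4 versus x must be 5) can be made concrete with a small sketch. It is a strongly simplified illustration and not the method of this paper: all names are hypothetical, a condition is reduced to "variable equals value", and two conditions on the same variable with different values are assumed to be mutually exclusive.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Statement:
    variable: str                                # e.g. "x"
    value: int                                   # demanded value of the variable
    condition: Optional[Tuple[str, int]] = None  # e.g. ("y", 3); None = unconditional


def conditions_can_coincide(a: Statement, b: Statement) -> bool:
    """Assumption: only 'same variable, different value' excludes each other."""
    if a.condition is None or b.condition is None:
        return True
    (var_a, val_a), (var_b, val_b) = a.condition, b.condition
    return var_a != var_b or val_a == val_b


def contradict(a: Statement, b: Statement) -> bool:
    """Different demanded values for the same variable under compatible conditions."""
    return (a.variable == b.variable
            and a.value != b.value
            and conditions_can_coincide(a, b))


print(contradict(Statement("x", 4), Statement("x", 5)))
print(contradict(Statement("x", 4, ("y", 3)), Statement("x", 5, ("y", 4))))
```

The first pair is reported as contradictory, the second is not, because the conditions y=3 and y=4 can never hold at the same time. This is why the detection of conditions has to precede any comparison of effects.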

For this purpose, we compare two natural language processing (NLP) techniques: the first one

is a model that consists mainly of grammatical rules that are directly embedded into the source

code. The second one uses a bag-of-words model for machine learning (ML). We, therefore,

elaborate on the terminology, define rules on how to detect conditionals and the related verbal

expressions, and finally present the results of our algorithms on 1,861 "real world"

requirements.

Our main dataset is in German; therefore, the terminology as well as the method are tailored to German grammar. As Mark Twain notes in The Awful German Language (1880), the German language is much more complicated than English. In this respect, we found that rules which work well for requirements written in German can be transferred easily into rules for requirements written in English, but not vice versa. We demonstrate this by translating a public English dataset gathered by Fischbach et al. (2021a) into German and analyzing it.

4.3 Terminology

Conditions are sub-statements of sentences that are "essential to the appearance or

occurrence of something else" (Merriam-Webster). In other words, the condition must be true

for the effect to happen. In German, the condition can be formulated either in a subordinate

clause (SC) or in an adverbial (ADVB) within the main clause (Duden 2006; Eisenberg 2016).

SC-clauses are, in German, usually marked by conjunctions such as wenn, falls, sofern (eng:

if, when, in case) (1), or by a special word position, the so-called verb-first position (2). The

second case is non-existent in English and is always translated by using one of the mentioned

conjunctions. In the examples below, the conditions are noted in italic:

1. Wenn drei Sekunden vergangen sind, dann muss das Licht ausgehen. Eng.: When

three seconds have passed, the light must go out.

2. Sollten drei Sekunden vergangen sein, dann muss das Licht ausgehen. Eng.: When

three seconds have passed, the light must go out.

Adverbial conditions occur within the main clause without the need for a subordinate clause. It

is important to note that adverbs are defined differently in English. In German, they express

the closer circumstances of an action, a process, or a condition (Duden 2006). They are usually

marked by bei, nach, während, im Falle, and more (Eng.: at, after, while, in case of, and more).

For example: In case of a message timeout, a message should be sent to the error manager.

Although there is a comma after timeout, the first part of the sentence is not a subclause.
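The markers described in this section can be turned into a deliberately naive keyword check. The sketch below is only an assumption-laden illustration: it uses nothing but the marker words listed above, handles the verb-first position only for the form sollte/sollten from example (2), and does no parsing or POS tagging, so prepositions such as bei or nach will also trigger it in non-conditional sentences. The third test sentence is our own German rendering of the timeout example.

```python
import re
from typing import Optional

SC_CONJUNCTIONS = {"wenn", "falls", "sofern"}        # subordinate-clause markers
VERB_FIRST_FORMS = {"sollte", "sollten"}             # assumption: only this verb is covered
ADVB_PATTERN = re.compile(r"\b(bei|nach|während|im falle)\b")  # adverbial markers


def condition_type(sentence: str) -> Optional[str]:
    """Return a coarse condition type based on marker words only."""
    text = sentence.lower()
    tokens = re.findall(r"\w+", text)
    if not tokens:
        return None
    if tokens[0] in VERB_FIRST_FORMS:
        return "SC (verb-first)"
    if SC_CONJUNCTIONS & set(tokens):
        return "SC (conjunction)"
    if ADVB_PATTERN.search(text):
        return "ADVB"
    return None


examples = [
    "Wenn drei Sekunden vergangen sind, dann muss das Licht ausgehen.",
    "Sollten drei Sekunden vergangen sein, dann muss das Licht ausgehen.",
    "Im Falle eines Nachrichten-Timeouts soll eine Nachricht an den Fehlermanager gesendet werden.",
    "Das Licht muss ausgehen.",
]
for sentence in examples:
    print(condition_type(sentence), "|", sentence)
```

A pure keyword check like this cannot decide whether a marker word is actually used conditionally, which is one reason why grammatical information is needed in addition to trigger words.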

Automated Condition Detection in Requirements Engineering 31

It is important to note that conditional statements are not statements of causality. The strike

of a match is a cause of the match lighting. The presence of oxygen is a condition for the match

lighting (Broadbent 2008). Confusion arises often, as both conditional and causal statements

can be introduced by If …, then: If I strike a match, then the match lights (causal statement)

and If oxygen is present, then the match can light (conditional statement). In requirements

engineering (RE), causal statements would be classified as information rather than as

requirements, as they describe why something happens and not how.

Another research topic of this paper is to identify the verbal expressions that belong to the

condition as well as the verbal expressions that belong to the subject and the verb of the

clause. These terms are also referred to as constituents (words or groups of words that function

as a unit). Unlike Fischbach's terminology (2021a) which we adopted in our previous paper,

(Gärtner et al. 2022), we now propose the following terms: variable and action. For example,

in the sentence If the threshold is reached, the controller must limit the speed decay, we must

first differentiate between the condition (If the threshold is reached) and the effect (the

controller must limit the speed decay), as shown in Figure 4-1. For the condition, the threshold

is the variable, and if is reached is the action. For the effect, the controller is the variable, and

must limit the speed decay is the action. In other words, the variable is the protagonist and the action is what shall happen.

Figure 4-1: variable and action
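As an illustration of this terminology, the decomposition of the example sentence could be stored in a small data structure. The class and field names below are hypothetical and chosen only for this sketch; the conjunction "if" is left out of the stored action for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clause:
    variable: str  # the "protagonist" of the clause
    action: str    # what shall happen


@dataclass(frozen=True)
class Requirement:
    effect: Clause
    condition: Optional[Clause] = None  # None for requirements without a condition

    @property
    def is_conditional(self) -> bool:
        return self.condition is not None


# "If the threshold is reached, the controller must limit the speed decay."
example = Requirement(
    condition=Clause(variable="the threshold", action="is reached"),
    effect=Clause(variable="the controller", action="must limit the speed decay"),
)
print(example.is_conditional, example.effect.variable)
```

Storing condition and effect in the same shape makes it straightforward to compare the verbal expressions of two requirements pairwise later on.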

4.4 Related Work

In the literature we found several approaches to automatically identify conditions and effects.

While rule-based methods were predominant in the past, machine-learning-based methods

have emerged with increasing computing power (Asghar 2016).

The rule-based approach from Khoo et al. (1998) extracts conditionals using linguistic clues

and pattern-matching, reaching an accuracy of 68 %. It was designed based on prose text from

the Wall Street Journal and was not specifically designed for RE, which has more linguistic

restrictions than a newspaper article. This approach, as well as an approach from Liu et al.

(2021), searches for specific trigger words. Although this is a good approach, especially for

English, in the analysis of our dataset for this paper we found that 17 % of the conditions were

not based on trigger words but on special grammatical forms, as explained in section 4.3.

Águeda and Olivas (2008) present an approach to automatically extract conditionals for search

engine optimization. However, the constraints are too narrow, as "the cause must precede the

effect" (Águeda and Olivas 2008). In RE, but also in general, the effect can precede the cause,

e.g., the engine must shut down, if the engine speed is too high.

In multiple papers, Frattini and Fischbach present different machine-learning-based

approaches for the automatic detection of causalities (Frattini et al. 2022a; Fischbach et al.

2021a; Fischbach et al. 2020) and the automatic detection of their constituents (Fischbach et

al. 2021b). Although their theory is explicitly designed for RE, they do not focus on conditionals,

but on causalities. However, they seem to have looked at both conditionals and causalities, which could be due to an incorrect definition of the terms, as explained in section 4.3 Terminology. In addition, they have classified sentences as causal that are neither causal nor conditional, e.g.: The fire is burning down the house. Although the fire is indeed the cause of

the house burning down and there is an implicit causal relationship, the sentence is

grammatically not causal. Furthermore, they define causality as a causal relation that requires

the effect to occur if - and only if - the cause has occurred. However, this definition is not useful

in the RE context, since effects can be triggered by multiple conditions: If the speed exceeds

the threshold, the motor is to be switched off in an emergency and if the torque exceeds the

threshold, the motor is to be switched off in an emergency. They investigated 14,983 sentences 
to track the extent and form in which causality occurs. Their approach achieved an accuracy 
of 82%.  
4.5  Study  
In this paper, we compare the performance of two approaches: a rule-based grammatical model versus two ML-based models. The former applies the rules mentioned in section 4.3 Terminology. The latter is implemented using a bag-of-words model for the document and two popular classification algorithms for NLP tasks: Naïve Bayes (NB) and k-Nearest Neighbor (kNN) (Bramer 2013). The bag-of-words algorithm is used to represent the document, "depicting [it] as a bag and each vocabulary in the texture as the items in the bag" (Ersoy 2021). The NB classifier uses probability to find the most likely of the possible classifications. kNN estimates the classification of an instance from the classifications of its neighbors, with k representing the number of closest instances to consider (Bramer 2013). For this study, we deemed it appropriate to limit the approach to these two simple and lightweight methods due to their demonstrated efficacy in section 4.6 Results. Hence, the advanced features of more complex methods such as GloVe and BERT, including improved contextual comprehension and semantic representation, were not considered for the current task and were not incorporated into the analysis.
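To make the combination of bag-of-words and Naïve Bayes more tangible, the following sketch implements both in plain Python. It is not the code used in this study: the training sentences are made up, the tokenization is a simple whitespace split, and a multinomial model with add-one smoothing is assumed.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as word counts, ignoring word order."""
    return Counter(text.lower().replace(",", " ").replace(".", " ").split())


def train_nb(texts, labels):
    """Collect class frequencies and per-class word counts."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in zip(texts, labels):
        word_counts[label].update(bag_of_words(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)  # class prior
        for word, count in bag_of_words(text).items():
            if word in vocab:  # words never seen in training are ignored
                p = (word_counts[label][word] + 1) / (n_words + len(vocab))
                score += count * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up toy data: True = conditional, False = non-conditional
train_texts = [
    "If the threshold is reached, the controller must limit the speed.",
    "When three seconds have passed, the light must go out.",
    "The controller must log all errors.",
    "The light must be red.",
]
train_labels = [True, True, False, False]

model = train_nb(train_texts, train_labels)
print(predict_nb(model, "If the speed is too high, the engine must shut down."))
```

Note that the class prior enters the score directly, so the label distribution of the training data influences every prediction.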
4.5.1  Method 
All 1,861 requirements of our datasets (see section 4.5.2 Data) were manually divided into conditionals and non-conditionals according to section 4.3. This classification was verified independently by three persons, all of whom have extensive experience in requirements engineering. The classification is necessary to train the ML models and to determine the accuracies of both the ML and the grammatical models.
Our method, as well as the software, which was developed in the form of Python code, is structured in four parts, as shown in Figure 4-2:
 
Figure 4-2: code structure 
1. Data Preprocessing (1): "Data directly taken from the source will likely have inconsistencies, errors or most importantly, it is not ready to be considered for a data mining process." (García et al. 2014). Therefore, data reduction techniques must be applied to remove irrelevant and noisy elements from the dataset, for example by replacing "<" with "is smaller than".
2. Data Preprocessing (2): "Parsing is the task of analyzing grammatical structures of an input sentence and deriving its parse tree." (Bojić and Bojović 2017). This task is sometimes considered to be part of the preprocessing (Gudivada 2018). For the grammatical model, we used parse trees, also called dependency trees, and Part-of-Speech (POS) tagging. A parse tree is a graphical representation of the dependencies between words. POS tagging categorizes words in correspondence with a particular part of speech, e.g., nouns, verbs, adverbs, conjunctions, etc. The parsing was outsourced to the dependency parser ParZu by Sennrich et al. (2013). For the ML models, tokenization was done via CountVectorizer, a method to convert text to numerical data, making a separate tokenizer redundant. While it does not employ lemmatization techniques, this is not critical for the specific task at hand, as semantic considerations do not play a crucial role in this classification problem.
3. Identifying conditional requirements: The grammatical model uses grammatical rules, trigger words, and POS tags to detect conditions, as explained in section 4.3. The ML models are trained using the labeled German dataset. An 80/20 split was chosen, i.e., 1,249 requirements for training and 313 requirements for testing. For kNN, hyperparameter tuning was done via grid search, a technique to determine the optimal hyperparameter values for a given model. The best results were achieved with leaf_size = 1, metric = minkowski, n_neighbors = 1, p = 2 and weights = uniform. The results are discussed in sections 4.6.1 and 4.6.2.
4. Identifying constituents: The final task is to identify the constituents, as explained in section 4.3. From the POS tags and the parse tree, the verbal expressions associated with the condition and the effect can be determined, as well as the variables and actions. We did not use ML for this task, as the dataset is too large to label all variables and actions manually. The results for the grammatical model are discussed in section 4.6.3.
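Steps 1 and 3 of the ML path can be sketched in plain Python as follows. This is not the code developed for this study: the sentences are made up and in English for readability, the replacement for ">" is an added assumption, and the classifier is a hand-written nearest-neighbor search that mirrors the reported settings n_neighbors = 1 and p = 2 (Euclidean distance) on word-count vectors.

```python
import math
import re
from collections import Counter

# "<" is the example from step 1; the ">" entry is an assumption for illustration.
REPLACEMENTS = {"<": " is smaller than ", ">": " is greater than "}


def preprocess(text):
    """Replace symbols by words and normalize case and whitespace."""
    for symbol, words in REPLACEMENTS.items():
        text = text.replace(symbol, words)
    return " ".join(text.lower().split())


def vectorize(text):
    """Bag-of-words vector: word -> count."""
    return Counter(re.findall(r"\w+", preprocess(text)))


def euclidean(a, b):
    """Minkowski distance with p = 2 between two count vectors."""
    return math.sqrt(sum((a[w] - b[w]) ** 2 for w in set(a) | set(b)))


def predict_1nn(train, text):
    """Return the label of the single closest training sentence."""
    query = vectorize(text)
    _, label = min(train, key=lambda item: euclidean(vectorize(item[0]), query))
    return label


# Made-up labeled sentences: True = conditional, False = non-conditional
train = [
    ("If the voltage < 10 V, the system must shut down.", True),
    ("When the door is open, the light must be on.", True),
    ("The system must log all errors.", False),
    ("The housing must be red.", False),
]

print(preprocess("x < 5"))
print(predict_1nn(train, "If the temperature < 5 C, the heater must start."))
```

With k = 1 the prediction depends only on the most similar training sentence, so the overall share of conditional and non-conditional sentences in the training data does not enter the decision.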
4.5.2  Data 
We used two different datasets. The first dataset originates from a recent project for electric buses. In this context, a complete requirements package was derived that describes a modular system intended to replace conventional bus powertrains with new, electric powertrains. The role of requirements in the development process of electric buses is described in Design of urban electric bus systems (Göhlich et al. 2018). We based our study on 1,561 functional and non-functional requirements. The specifications are written in German, and the examples discussed in this paper were translated into English.
The second dataset is public, originating from Fischbach (Fischbach et al. 2020). It is an accumulation of approximately 15,000 requirements written in English from many different sources available online. From this dataset, we used 300 randomly selected requirements, available online at Swarm-Engineer (2023). For our analysis, we translated them into German via DeepL. We used this dataset to show that correct results can also be obtained with non-automotive requirements and an originally English dataset.
4.6  Results 
The models were validated with two different datasets. Sections 4.6.1 and 4.6.2 show and discuss the results for the automatic detection of conditional requirements. Section 4.6.3 shows the results for identifying the verbal expressions assigned to the condition and effect and to the action and variable.
4.6.1  Dataset 1 
Out of the 1,561 analyzed requirements, 785 (50.3%) were labeled as conditional by us, according to section 4.3 Terminology. The labeling was verified by three persons to increase confidence in the correctness of the labels. For the ML algorithms, an 80/20 split was chosen, i.e., 1,249 requirements for training and 313 requirements as a test set.
The overall accuracies are: 
•  grammatical model: 99% 
•  NB model: 91% 
•  kNN model: 94% 
To measure the effectiveness of the models, we used confusion matrices, as shown in Figure 4-3. They are used for performance measurements for machine learning classification problems, where the output can be two or more classes (Narkhedem 2018). In these tables, the different combinations of predicted and reference values can be examined. The combinations are called true negative (reference: 0; predicted: 0), false negative (reference: 1; predicted: 0), false positive (reference: 0; predicted: 1), and true positive (reference: 1; predicted: 1).

Figure 4-3: confusion matrices for dataset 1 based on 313 requirements

The grammatical model (Figure 4-3(1)) has only a few false predictions compared to the other models. It misclassified only 3 requirements, consisting of 1 false positive and 2 false negatives. The Naïve Bayes model (Figure 4-3(2)) had the biggest difficulties with false positives, as 23 non-conditional requirements were misclassified as conditionals, whereas the k-Nearest Neighbor model (Figure 4-3(3)) had the biggest difficulties with false negatives, as 14 conditional requirements were misclassified as non-conditionals.

Although we have results for the grammatical model classifying all 1,561 requirements, for the

ML models we only have results on the test set consisting of 313 requirements. Comparing the

results between 1,561 requirements and 313 would not be fair, which is why we determined

the accuracy of the grammatical model also using the 313 requirements. The accuracies are

calculated as the number of all correct predictions divided by the total number of the dataset:

$$\mathit{accuracy} = \frac{TP + TN}{P + N} \tag{1}$$
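The accuracy formula and the four confusion-matrix cells can be computed from a list of reference labels and a list of predictions as sketched below. The two short label lists are made up for illustration. The last line applies the same formula to the grammatical model on dataset 1, using only the numbers stated above (3 misclassified out of 313 test requirements).

```python
def confusion_counts(reference, predicted):
    """Count the four combinations of reference and predicted labels."""
    pairs = list(zip(reference, predicted))
    return {
        "TP": sum(1 for r, p in pairs if r and p),
        "TN": sum(1 for r, p in pairs if not r and not p),
        "FP": sum(1 for r, p in pairs if not r and p),
        "FN": sum(1 for r, p in pairs if r and not p),
    }


def accuracy(counts):
    """(TP + TN) / (P + N), where P + N is the size of the dataset."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


# Made-up labels: True = conditional, False = non-conditional
reference = [True, True, True, False, False, False]
predicted = [True, True, False, False, False, True]

counts = confusion_counts(reference, predicted)
print(counts, accuracy(counts))

# Grammatical model, dataset 1: 313 test requirements, 3 misclassified
grammatical_accuracy = (313 - 3) / 313
print(round(grammatical_accuracy, 3))
```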

4.6.2 Dataset 2

To show that the algorithm is not overfitted to the main German dataset, it was tested against

a public English dataset. 300 out of approximately 15,000 sentences were randomly selected

and then labeled so that they could now be used as a second test set, as explained before.

The same models trained on the previous dataset were used and no new ML training was

applied since the behavior of the models was to be tested with unknown data from different

fields than automotive. The overall accuracies are:

• grammatical model: 92%

• NB model: 67%

• kNN model: 91%

In contrast to the ML models, the reasons for the poorer accuracy of the grammatical model

(99% with dataset 1 compared to 92% with dataset 2) can be clearly identified: some

requirements were formulated in an inconsistent and complex way. Particularly adverbial

trigger words were used much more liberally compared to the first dataset, causing the model

to incorrectly identify ADVB-conditions.

The confusion matrices for dataset 2 are shown in Figure 4-4. As expected, the accuracies of all models are lower than for dataset 1. On the one hand, this is because we built our theory using dataset 1. On the other hand, the requirements in the second dataset mostly do not follow standard RE documentation guidelines, as described for example in Pohl and Rupp (2021). This complicates the analysis, as patterns do not exist or cannot be identified. Nevertheless, the grammatical model and the kNN model were able to achieve solid results. Although the accuracy of the grammatical model dropped from 99% to 92% and the accuracy of the kNN model dropped from 94.1% to 91%, the results are still good. Thus, we were able to show that the models can handle English datasets from other disciplines, although many requirements were formulated differently than in our first, automotive dataset. The NB model, however, misclassified many requirements, especially non-conditional ones. Here, only 66% were correctly classified, while 34% were misclassified. This suggests that the algorithm was overfitted to the original dataset.

Figure 4-4: confusion matrices for dataset 2 based on 300 requirements

This leads to the following insight: some ML approaches, e.g. Naïve Bayes, are not well suited

for our kind of problem, as the data that one wants to use in practice must correspond to the

shape of the data of the training set. Especially important in this respect is the distribution of

the labels True and False. In the German training dataset, there was a distribution of about 1

True to 1 False. Naturally, the German test set had the same distribution, which is why the

accuracy is high, see Figure 4-3(2). The English test set though had a distribution of

approximately 1 True to 10 False. The algorithm, however, expects a distribution of 1/1 and

includes this in its classification. This can be seen in the result of Figure 4-4(2). More

conditionals were recognized than there actually are: namely 114 (95 + 19) True predictions

instead of 22 (3 + 19) True references.
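
The counts quoted above can be turned into standard rates. The following is a minimal sketch, assuming that all 300 requirements named in the caption of Figure 4-4 appear in the NB confusion matrix, so the number of true negatives is derived as the remainder rather than read from the figure. The function name `rates` is illustrative and not part of the original implementation.

```python
# Counts for the NB model on dataset 2, as quoted in the text for Figure 4-4(2).
TOTAL = 300  # requirements in dataset 2 (figure caption)
TP = 19      # conditional, predicted conditional
FP = 95      # non-conditional, predicted conditional
FN = 3       # conditional, predicted non-conditional
# Assumption: every requirement falls into exactly one cell, so TN is the remainder.
TN = TOTAL - TP - FP - FN


def rates(tp, fp, fn, tn):
    """Return standard binary-classification rates as a dict."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


metrics = rates(TP, FP, FN, TN)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this assumption, the specificity comes out at about 66%, which agrees with the share of correctly classified non-conditional requirements reported above, while the low precision reflects the 114 True predictions against only 22 True references.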

The kNN approach looks at the classification of its nearest neighbor, which is why this ML

algorithm is not as biased toward the original label distribution. The grammatical model also

ignores the original label distribution, as it considers a fixed rule set.
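
The effect of the label distribution on a Bayesian classifier can be illustrated with Bayes' rule in odds form. The sketch below is not the NB model used in this study; the likelihood ratio of 3.0 and the function name `posterior_true` are hypothetical values chosen only to show how the same evidence leads to different decisions under a 1:1 and a 1:10 prior.

```python
def posterior_true(likelihood_ratio, prior_true):
    """Posterior P(True | evidence) from Bayes' rule in odds form.

    likelihood_ratio = P(evidence | True) / P(evidence | False)
    prior_true       = P(True) assumed by the classifier
    """
    prior_odds = prior_true / (1.0 - prior_true)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


LR = 3.0  # hypothetical evidence, three times more likely for conditionals

balanced = posterior_true(LR, 0.5)    # prior learned from a 1:1 training set
skewed = posterior_true(LR, 1 / 11)   # prior matching a 1:10 test set

print(f"1:1 prior  -> P(True) = {balanced:.3f}")
print(f"1:10 prior -> P(True) = {skewed:.3f}")
```

With a decision threshold of 0.5, the same requirement is labeled True under the balanced prior and False under the skewed one. A classifier that carries the 1:1 prior into a 1:10 dataset therefore tends to predict too many conditionals, as described above.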

4.6.3 Detection of constituents

Another research point of this paper is to identify the verbal expressions that are linked to the

condition and the effect, as well as the verbal expressions that are linked to the variable and

action, as explained in section 4.3. In this case, we did not use ML as a comparison. First, it is

very time-consuming to label a training set accordingly. Second, and much more important, if

the basis (condition-recognition) is correct, see Figure 4-2 steps 1-3, we can accurately match

all verbal expressions to the condition, using a fixed rule set. This problem is not complex

enough to expect a better solution with ML. In the previous problem of condition recognition,

for example, we could not be sure in advance that the grammatical model would be better than ML, because

many variables are involved and ML could have found a better way, which it ultimately did not.

The results for three exemplary sentences of the first dataset can be seen in Figure 4-5: correct

verbal expression assignments of an SC-condition and an ADVB-condition as well as an

unsuccessful assignment. Figure 4-5(1) shows the correctly determined SC and the resulting

assignment of the verbal expressions that correspond to the condition. Figure 4-5(2) shows

that while no SC was detected, the ADVB was correctly determined and the verbal expressions

that correspond to the condition were correctly assigned. Figure 4-5(3) shows a requirement

that could not be processed by our algorithm because the parser had difficulty interpreting the

input: for unknown reasons, it detected a main clause with the verb 'must' and a subordinate

clause with the alleged verb 'transfer'. As a result, the output was a completely wrong

assignment. The correct interpretation would be that 'must transfer' is a compound main-clause

verb and that there is therefore no subordinate clause that could act as a condition.

Figure 4-5: condition/effect verbal expression detection

Figure 4-6 shows the results for variable and action detection. The first two examples show

correct verbal expression assignments, while the third one shows a failed assignment. The

correctness of these assignments depends directly on the correct assignment of the

verbal expressions for condition and effect (see Figure 4-5), so it fails for the same

reasons.

Figure 4-6: action/variable verbal expression detection

4.6.4 Discussion of Validity

Descriptive validity (Maxwell 1992) deals with the risk of not having remained objective when

conducting a study. Although we defined conditions based on fixed criteria, there may have

been inconsistencies when labeling the dataset. We have tried to reduce this risk by having

three persons review the labeling of the dataset. In addition, the grammatical model has an

advantage here, because a different understanding of a condition can immediately be

implemented, whereas for ML everything would have to be re-labeled and re-trained.

Generalizability (Maxwell 1992) is given when the results can be applied to other situations

that are outside the present research; limited generalizability is a further threat to the study's validity. We

addressed this by testing and validating our method with a non-automotive dataset. That is

why we believe that other industries write requirements in a similar way to the automotive

industry. This is also supported by Göhlich et al. (2021) who found that processes to manage

requirements and specifications do not differ significantly with regard to the industrial context.

However, further testing should be conducted in the future to verify its applicability in different

industries.

4.7 Summary, Conclusion and Outlook

In this paper, we have provided a building block for how to make requirements engineering

(RE) and requirements management intelligent using automated methods. Specifically, we

proposed how to automatically detect conditionals and described how they occur in RE. In

section 4.3, we elaborated on the terminology and on how to detect conditional sentences.

Furthermore, we laid the foundation to identify the verbal expressions respectively associated

with the variables and actions. In section 4.5, we presented the results on 1,861 requirements

while comparing a grammatical model with two machine learning (ML) models.

We found that in structured texts, such as usually found in specifications, grammatical models

are well suited for identifying conditionals and their constituents. Grammatical models achieve

results that are better than, or at least similar to, those of ML approaches. Some ML algorithms, e.g. Naïve Bayes,

are not well suited for our kind of problem. For example, for this algorithm, the data that one

wants to use in practice must correspond to the shape of the data of the training set.

Furthermore, every dataset used for ML methods must first be labeled, which can be very time-

consuming and prone to human errors. Therefore, we conclude that the grammatical model is

preferable as the rules can be tracked and easily adjusted if needed, for example, by changing

trigger words. It is important to note that this is not the case for every natural language

processing problem. These findings do not apply to unstructured text, such as in newspapers

or books. In RE, however, there seem to be certain explicit or implicit rules - depending on the

industry - according to which sentences are formulated, which massively reduces the

complexity of the problem and makes machine learning redundant.

Complete and error-free requirements specifications are crucial for a good product design.

Some specifications contain several thousand individual requirements. Automation would

therefore considerably improve the manageability of such large

datasets. Moving in this direction, this paper makes it possible to identify contradictions

between two requirements automatically. The approach is that if two requirements have the

same condition and, in the effect, the same variables but different actions, this indicates

a contradiction (Gärtner et al. 2022). This research contributes a building block for this

approach by identifying these corresponding verbal expressions. Such a method could, for

example, help developers to identify critical requirements already during the design process or

even in later stages like the review phase. This will be elaborated further in a future paper.
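
The contradiction rule stated above (same condition, same variable, different action) can be sketched in a few lines. This is a minimal illustration under strong assumptions: the constituents are taken as already extracted, they are compared by exact string equality, and the class `Requirement`, the function `candidate_contradictions`, and the three example requirements are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Requirement:
    rid: str        # requirement identifier
    condition: str  # verbal expression of the condition ("" if unconditional)
    variable: str   # variable the effect refers to
    action: str     # action applied to the variable


def candidate_contradictions(requirements):
    """Return id pairs with equal condition and variable but different action."""
    pairs = []
    for a, b in combinations(requirements, 2):
        same_context = a.condition == b.condition and a.variable == b.variable
        if same_context and a.action != b.action:
            pairs.append((a.rid, b.rid))
    return pairs


# Hypothetical, already decomposed requirements.
reqs = [
    Requirement("R1", "door is open", "interior light", "switch on"),
    Requirement("R2", "door is open", "interior light", "switch off"),
    Requirement("R3", "engine is running", "interior light", "switch off"),
]

print(candidate_contradictions(reqs))
```

Only the pair R1 and R2 is flagged, because R3 has a different condition. Exact string matching is a simplification; paraphrased conditions or synonymous actions would need a more tolerant comparison.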

4.8 Author Contributions

Conceptualization, A.E.G., D.G. and T.-A.F.; methodology, A.E.G.; validation, A.E.G.; formal

analysis, A.E.G.; investigation, A.E.G.; resources, A.E.G.; data curation, A.E.G., D.G. and

T.-A.F.; writing – original draft preparation, A.E.G.; writing – review and editing, D.G. and T.-A.F.;

supervision, D.G. and T.-A.F. All authors have read and agreed to the published version of the

manuscript.

4.9 Acknowledgments

We thank IAV GmbH for providing us with the requirements specification on electric buses.

Automated Requirement Contradiction Detection through formal logic and LLMs 38

5 Automated Requirement Contradiction Detection through formal logic and LLMs

This article is licensed under a Creative Commons Attribution 4.0 International License, which

permits use, sharing, adaptation, distribution and reproduction in any medium or format, as

long as you give appropriate credit to the original author(s) and the source, provide a link to

the Creative Commons licence, and indicate if changes were made

(http://creativecommons.org/licenses/by/4.0/). The content of this chapter was published:

• Authors: Alexander Elenga Gärtner and Dietmar Göhlich

• Publisher: Springer – Automated Software Engineering

• DOI: https://doi.org/10.1007/s10515-024-00452-x

• Published: 06 June 2024

• Conceptualization, A.E.G., and D.G.; methodology, A.E.G.; implementation, A.E.G.;

validation, A.E.G.; resources, A.E.G.; data curation, A.E.G.; writing – original draft

preparation, A.E.G.; writing – review and editing, D.G.; supervision, D.G. All authors have

read and agreed to the published version of the manuscript.

5.1 Abstract

This paper introduces ALICE (Automated Logic for Identifying Contradictions in Engineering),

a novel automated contradiction detection system tailored for formal requirements expressed

in controlled natural language. By integrating formal logic with advanced large language

models (LLMs), ALICE represents a significant leap forward in identifying and classifying

contradictions within requirements documents. Our methodology, grounded on an expanded

taxonomy of contradictions, employs a decision tree model addressing seven critical questions

to ascertain the presence and type of contradictions. A pivotal achievement of our research is

demonstrated through a comparative study, where ALICE's performance markedly surpasses

that of an LLM-only approach by detecting 60% of all contradictions. ALICE achieves a higher

accuracy and recall rate, showcasing its efficacy in processing real-world, complex

requirement datasets. Furthermore, the successful application of ALICE to real-world datasets

validates its practical applicability and scalability.

This work not only advances the automated detection of contradictions in formal requirements

but also sets a precedent for the application of AI in enhancing reasoning systems within

product development. We advocate for ALICE's scalability and adaptability, presenting it as a

cornerstone for future endeavors in model customization and dataset labeling, thereby

contributing a substantial foundation to requirements engineering.

5.2 Introduction

Requirements are crucial in product development as they serve as a basis for communication

among stakeholders, teams, and companies (Loucopoulos 2005; Gericke and Blessing 2012).

Specification sheets are commonly used to capture project requirements, aiming for a

complete, unambiguous, and contradiction-free system definition (DIN DIN 69901-5:2009-01;

Bender and Gericke 2021). However, these sheets often contain thousands of requirements

in natural language, derived from interdisciplinary collaboration, typically organized and

managed in separate documents for different levels of the product (VDI-Guideline VDI 2221

Blatt 1:2019-11). The complexity of modern systems and the need for distributed and

concurrent development across various levels further complicate the process, which is why the

requirements are typically organized and managed in multiple documents (Göhlich and Fay

2021b). As a result, errors, including contradictions, are frequently found in these documents.

Currently, there is a lack of comprehensive automated tools for detecting contradictions and

performing quality analysis on industrial specification documents.

Product development is a complex and dynamic process that requires proper management

and understanding of the requirements. Multiple studies have demonstrated the critical

importance of high-quality requirements in the success of development projects. In the realm

of requirements quality research, the attention is centered on individual characteristics of

requirements, such as completeness, complexity, ambiguity, and consistency (Montgomery et

al. 2022a; IEEE/ISO/IEC 29148-2018). This paper focuses on addressing inconsistencies in

the form of contradictions, which can potentially cause misunderstandings and product defects.

Natural Language Inferencing (NLI) is a commonly encountered problem in the field of natural

language processing (NLP), in which the objective is to determine the nature of the relationship

between a premise and a hypothesis, both of which are represented as sentences (Jang et al.

2020). In recent years, NLP has become more sophisticated, with machine learning models

capable of handling complex tasks such as question answering, text extraction, and sentence

generation. It was shown that extending NLI to manage sentence relationships could have

significant implications for scientific text analysis during product development (Ritter et al.

2008).

We emphasize formal requirements written in natural language that adhere to specific

formulation guidelines, particularly using single-sentence requirements that follow the principle

of atomicity. For example, employing standard sentence templates for requirements can

ensure adherence to these guidelines. Such templates are widely used across different

industries and sectors involved in hardware, software, or system development, especially for

lower system levels (Dick 2017; Sophist GmbH 2016; Wiegers and Beatty 2013; Robertson

and Robertson 2013). This approach facilitates consistency and minimizes errors arising from

linguistic nuances.

Various solutions have been proposed to tackle this challenge, including knowledge graphs

and thesauri. Knowledge graphs are formal representations of concepts, ideas, or objects

about a specific domain and all the relationships between those concepts. Thesauri group

together words with similar (synonyms) and opposite (antonyms) meanings (Ahmad et al.

2020). While these solutions can be helpful in identifying contradictions in requirements, they

have limitations in their practicality, as we will discuss further in Section 5.4.

Building upon our foundational work (Gärtner et al. 2022), which introduced a taxonomy for

detecting contradictions and proposed a semi-automated approach, we made further strides

in automation in 2023 (Gärtner et al.). Our contribution involved developing a method to identify

conditions and other sentence components within requirements. This advancement builds

upon our previous framework, emphasizing our ongoing commitment to enhancing the

automated detection of contradictions in requirements engineering.

Our current research introduces ALICE (Automated Logic for Identifying Contradictions in

Engineering). This system synergizes the capabilities of formal logic with large language

models (LLMs) to automatically detect and resolve contradictions between two requirements.

Formal logic is a branch of mathematics that deals with the precise rules and structures for

reasoning and inference. On the other hand, LLMs are artificial intelligence (AI) models that

use machine learning algorithms to learn patterns and relationships between tokens in large

natural language datasets. We will elaborate further on this in Section 5.3.2 and explain why

LLMs are needed to detect contradictions. Combining these two solutions offers a unique

approach to identifying and resolving contradictions automatically, leveraging the capabilities

of formal logic and the data-driven approach of LLMs to identify inferences in natural language.

Through this research, we aim to provide an understanding of the strengths and limitations of

this approach and its potential applications in product development. We evaluate the

effectiveness using real requirements specifications, demonstrating its potential and paving

the way for future optimization.

This paper significantly contributes to automated contradiction detection in requirements

expressed in controlled natural language (Schwitter 2010). Such requirements are particularly relevant in the

automotive domain and related fields, as sentence templates are standard for formulating

requirements (Dick 2017; Sophist GmbH 2016). However, it is essential to note that formal

requirements have different definitions. The requirements we consider are more natural and

intuitive than those discussed in the related work (see section 5.4.2).

To establish the groundwork for this research, we built upon our previous work (Gärtner et al.

2022), which proposed six essential questions for identifying contradictions. In this paper, we

implemented these questions and improved their effectiveness by modifying and expanding

them, resulting in a comprehensive set of seven categorized inquiries. These refined inquiries

serve a dual purpose: firstly, to determine the presence of contradictions in the requirements,

and secondly, to identify the specific types of contradictions that may be present.

We provide a detailed, step-by-step explanation of our implementation and discuss the

evaluation using real-world requirements.

5.3 Fundamentals

There are several ways to identify contradictions between sentences without human

intervention. In the following section, we clarify the term contradiction, discuss two prominent

methods – formal logic and machine learning – and explain why we chose a combination of

both.

5.3.1 Contradictions

Contradictions are statements that conflict with one another. However, the term is not

intuitively clear, so it must be defined explicitly.

Different definitions of contradictions can be found in the literature on requirements

engineering (RE). To get the most generically valid and scientifically accepted definition, we

build our theory on the logical philosophy of Aristotle (Aristoteles 1986), as shown in Gärtner

et al. (2022). The foundation of his logic – also known as term logic, traditional logic, or formal

logic – developed in his work Metaphysics is the law of non-contradiction (LNC) (Horn 2018).

Derived from this, Figure 5-1 shows different types of RE-relevant contradictions.

Figure 5-1: Contradictions relevant to RE

In line with the interpretation of Karlova-Bourbonus (2019), we consider contraries in addition

to contradictories. Although both fall under the term contradiction, they are not synonymous

and must be clearly distinguished:

• Contradictory opposites, such as 'he is sick' and 'he is not sick', are mutually exclusive and

exhaustive: exactly one of them must be true.

• Contrary opposites, such as 'it is black' and 'it is white', cannot be true at the same time but

can both be false, so they are not exhaustive.

• Subalterns describe an implication: if 'everybody is sick' is true, then 'some people are

sick' must also be true, which is a logical step-down.
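
The three relations listed above can be made concrete by treating each statement as a predicate over a small, finite set of possible situations. The following sketch is only an illustration of the definitions; the function `relation`, the chosen situation sets, and the extra label "compatible" are assumptions introduced here and are not part of the taxonomy.

```python
def relation(p, q, worlds):
    """Classify the opposition between predicates p and q over finite worlds."""
    both_true = any(p(w) and q(w) for w in worlds)
    both_false = any(not p(w) and not q(w) for w in worlds)
    p_implies_q = all(q(w) for w in worlds if p(w))

    if not both_true and not both_false:
        return "contradictory"  # exactly one holds in every world
    if not both_true:
        return "contrary"       # never both true, but both can be false
    if p_implies_q:
        return "subaltern"      # truth of p carries over to q
    return "compatible"


# 'he is sick' vs. 'he is not sick': the world is the sickness state.
print(relation(lambda sick: sick, lambda sick: not sick, [True, False]))

# 'it is black' vs. 'it is white': the world is the object's color.
colors = ["black", "white", "red"]
print(relation(lambda c: c == "black", lambda c: c == "white", colors))

# 'everybody is sick' vs. 'some people are sick': the world is the
# number of sick people in a group of three.
counts = [0, 1, 2, 3]
print(relation(lambda n: n == 3, lambda n: n >= 1, counts))
```

The third color, red, is what makes the black and white example a contrary rather than a contradictory pair, since it provides a situation in which both statements are false.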

In practice, identifying contradictions in natural language can be challenging. Based on

empirical evidence, it is apparent that only a small number of contradictions observed in real-

world situations exhibit explicit characteristics. 'Rather, contradictions make use of a variety of

natural language devices […]. The most sophisticated kind of contradictions, the so-called

implicit contradictions, can be found only when applying world knowledge and after conducting

a sequence of logical operations' (Karlova-Bourbonus 2019) such as 'the car must be as fast

as possible' and 'the car must be as fuel efficient as possible.' Those familiar with physical

rules know that the faster a car drives, the more fuel it consumes. In this study, however, we

focus solely on explicit (prototypical) (Marneffe, Rafferty, Manning 2008) contradictions, such

as 'The car is fast' and 'The car is slow,' without conducting deep meaning processing.

5.3.2 Theoretical Background for NLP

In the following section, we will explore formal logic as a subset of symbolic AI and Large

Language Models (LLMs) as a subset of machine learning (ML).

5.3.2.1 Formal Logic

Symbolic artificial intelligence represents intelligence using abstract symbolic representations

and logical rules, which are the core elements of formal logic. Unlike earlier approaches aimed

at imitating human thought processes (Newell and Simon 1956), the symbolic AI paradigm

sought to replicate human knowledge and understanding at a conceptual level.

Formal logic plays a crucial role in symbolic AI, as it encompasses a wide range of applications,

including handling simple mathematical tasks, performing string comparisons, and extending

to more complex symbolic reasoning and logical operations such as rule-based reasoning,

knowledge representation, expert systems, and automated theorem proving. Tasks such as

parsing sentences, planning responses, reasoning about meaning, and building expert

systems have relied on symbolic problem-solving techniques (Jurafsky and Martin 2019).

It is essential to acknowledge that symbolic AI, which relies on formal logic, has achieved

significant milestones. However, it faces limitations when dealing with complex, ambiguous,

and uncertain problems (Russell and Norvig 2016). It struggles with tasks that require fluid and

intuitive thinking, perceiving subtle distinctions, and generalizing knowledge across domains:

it would have difficulties determining whether something could be fast and big simultaneously

but not fast and slow.

While rule-based systems alone cannot match human intelligence, they provide key

capabilities such as logical reasoning, explanation, and abstraction integral to human cognition

(Johnson-Laird 2006; Miller 2019).

5.3.2.2 Machine Learning

Machine learning (ML) is an approach to AI that uses data-driven techniques to train models

that can learn and make predictions. Unlike symbolic AI, ML algorithms learn patterns and

relationships from large amounts of data, enabling the systems to adapt to new information

and make accurate predictions in real-world scenarios.

Classical ML models excel in specific tasks like image classification, speech recognition, or

natural language processing. However, they may struggle to identify contradictions across

diverse contexts, as their knowledge is confined to the data used during training (Goodfellow

et al. 2016; Kim 2018). Therefore, even these models would have difficulty determining the

relationships between something being fast and big or fast and slow simultaneously.

On the other hand, LLMs like GPT (OpenAI 2020) and LLaMA (Touvron et al. 2023) are

pre-trained on extensive datasets, allowing them to learn general language patterns and

relationships. This broad knowledge enables LLMs to better understand context and

semantics, which is crucial for identifying contradictions (Surana et al. 2022).

They can indeed identify that something could be fast and big simultaneously but not fast and

slow. This is why, in this work, we use LLMs to detect contradictions.

5.4 Related Work

Previous research in requirements management has proposed several methods for tackling

inconsistencies between requirements. The following sections discuss taxonomies, NLP-

based conflict detection, and ontology-based conflict detection approaches.

5.4.1 Classifying Conflicts

In our foundational work (Gärtner et al. 2022), we introduced a nuanced taxonomy for detecting

contradictions based on Aristotle's Law of Non-Contradiction, recognizing the need for a

systematic approach tailored to the intricacies of requirements engineering. This classification

emerged from a rigorous analysis of contradictions commonly observed in requirements

documents, ensuring a methodical and scientifically grounded approach rather than an

arbitrary categorization. By adopting this classification, we aimed to bridge the gap between

theoretical logic and practical application in software development, ensuring our methodology

is grounded in logical rigor and applicable to the demands of modern engineering projects.

The proposed method categorizes contradictions into three types: contradictories, contraries,

and subalterns. Each type is divided into three subcategories (Simplex, Idem, and Alius), as shown in

Figure 5-2. The decision to focus on these subtypes was driven by their prevalence in

real-world requirements and the unique challenges they pose for automated detection and resolution.

Simplex contradictions (from Latin ‘simple’) are characterized by direct opposition without

conditional statements. These contradictions are straightforward but crucial for establishing

our classification framework. For example, ‘The car must be red’ versus ‘The car must be blue’

showcases apparent, uncomplicated contradictions.

Idem contradictions (from Latin ‘same’) involve identical conditions leading to contradictory

outcomes, presenting challenges due to their conditional nature. An example is ‘If the customer

wishes, the car must be red’ and ‘If the customer wishes, the car must be blue,’ where the

same condition yields conflicting requirements.

Alius contradictions (from Latin ‘different’), distinguished by differing conditions that result in

incompatible conclusions, illustrate the complexity of engineering requirements. An instance

of this is ‘If the customer wishes, the car must be red’ versus ‘If the car has four doors, the car

must be blue,’ demonstrating how different conditions can lead to contradictory outcomes.

These categories – Simplex, Idem, and Alius – were chosen to address the varied and complex

nature of contradictions in real-world requirements. By providing a clear structure for identifying

and analyzing these contradictions, our approach enhances the capability of automated tools

to manage these challenges.
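
The distinction between Simplex, Idem, and Alius depends only on the conditions of the two requirements, which the following sketch illustrates. It assumes that a conflict between the effects has already been established and that conditions are compared as normalized strings. The function names are hypothetical, and the case in which only one requirement has a condition is left open because it is not covered by the definitions above.

```python
from typing import Optional


def _norm(condition: Optional[str]) -> str:
    """Lower-case a condition and collapse whitespace; None becomes ''."""
    return " ".join((condition or "").lower().split())


def condition_subtype(cond_a: Optional[str], cond_b: Optional[str]) -> Optional[str]:
    """Name the subtype of a conflicting requirement pair from its conditions."""
    a, b = _norm(cond_a), _norm(cond_b)
    if not a and not b:
        return "Simplex"  # neither requirement is conditional
    if a and b:
        return "Idem" if a == b else "Alius"
    return None  # exactly one condition: not defined in the text above


# Conditions taken from the red/blue car examples in the text.
print(condition_subtype(None, None))
print(condition_subtype("If the customer wishes", "If the customer wishes"))
print(condition_subtype("If the customer wishes", "If the car has four doors"))
```

Combining this subtype with the kind of opposition between the effects (contradictory, contrary, or subaltern) yields the nine types of the taxonomy.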

Figure 5-2: Nine types of contradictions for RE (Gärtner et al. 2022)

Different types of contradictions occur in varying numbers, have varying levels of project

impact, and require different detection approaches. As a result, our previous study defines

questions that can be used to identify these types of contradictions. This standardized solution

is the basis for the fully automated contradiction detection presented in Section 5.5.

Guo et al. (2021) suggest classifying conflicts into three fundamental types: inconsistencies,

inclusions, and interlocks, each of which can be further subdivided into seven subcategories.

Although this approach shows potential, the authors do not provide a clear method for

identifying these categories.

Wu et al. (2022) categorize contradictions into six types: negation, antonym, replacement,

switch, scope, and latent. In this classification, negation – identified through negative words

like 'no,' 'not,' 'never,' and 'neither - nor' – aligns with what we define as contradictory. Both

antonyms and replacements fit within our category contraries. The concept of scope, which

involves either narrowing or expanding the scope expressed in a sentence, corresponds to our

subaltern category. However, switch is not a contradiction in the context of RE, as exemplified by

the sentences 'Sally sold a boat to John' and 'John sold a boat to Sally'. This scenario does

not necessarily indicate a contradiction, as it is possible for Sally to sell a boat to John while

John simultaneously sells his boat to Sally. Lastly, latent contradictions, which are not yet

developed, fall outside this paper's focus as we concentrate on explicit contradictions.

5.4.2 Natural Language Processing for Detecting Contradictions

On a general note, Zhao et al. (2021) report that most studies (67.08%) in the domain of NLP

combined with RE consist of solution proposals evaluated through laboratory experiments or

example applications. In contrast, only a tiny percentage (7%) of these studies undergo

evaluation in industrial settings, resulting in insufficient practical validation.

Marneffe et al. (2008) suggest a definition for contradictions in NLP tasks and analyze the

task's performance. However, their focus is on implicit rather than explicit contradictions,

aligning closely with the switch-contradictions delineated by Wu et al. in the preceding section.

Li et al. (2017) state that traditional context-based word embedding learning algorithms are

ineffective for mapping contrasting words. To address this issue, they developed a neural

network to learn contradiction-specific word embeddings that can separate antonyms.

However, their emphasis is not on explicit contradictions, e.g.: 'Some people and vehicles are

on a crowded street' and 'Some people and vehicles are on an empty street' (Li et al. 2017).

While these sentences contrast with each other, they might still simultaneously be correct.

Other approaches with the same goal as Li et al. utilize WordNet and Thesaurus to obtain

additional antonyms and synonyms as semantic constraints (Chen et al. 2015; Liu et al. 2015).

[Figure 5-2 content: Law of non-contradiction for RE, branching into Contradictory, Contrary, and Subaltern, each with the subtypes Simplex, Idem, and Alius]

These methods might be helpful for fine-grained optimization, as we will mention in Section

5.7.

Ritter et al. (2008) present an approach for automatic contradiction detection. However, their

approach does not address explicit contradictions in the context of RE, so their technique

cannot be directly applied to our case. Nevertheless, it may prove helpful for future

optimization.

Heitmeyer et al. (1996) introduce a technique called consistency checking for detecting errors

in requirements specifications using formal logic. Their paper analyzes formal requirements,

called SCR, and addresses issues like type errors, nondeterminism, missing cases, and

circular definitions.

However, the requirements are not written using everyday language. Instead, they are

expressed in a highly structured manner, where each requirement component has a

predetermined category that it must be placed in during the writing process. Our method aims

to handle requirements that are written more intuitively and naturally while still being formal.

Gervasi and Zowghi (2005) explore using formal logic and natural language parsing techniques

to identify inconsistencies in requirements. The authors propose a method that automatically

discovers and addresses these logical contradictions using theorem-proving and model-

checking techniques. By translating the requirement into a set of logic formulae, they can

detect contradictions such as α and its negation ¬α. Hunter and Nuseibeh (1998) suggest

utilizing formal logic, which records and tracks the information, enabling the tracking of

inconsistent information by propagating labels and their associated data. While they do not

only focus on negating assumptions like Gervasi and Zowghi, Hunter and Nuseibeh can only detect conflicts

if at least a part of the argument is negated, such as α and ¬α. Therefore, both approaches

cover an essential part of the issue addressed in this paper, namely

contradictories (see Figure 5-2). However, they do not consider contraries and subalterns,

such as 'the car must be red' and 'the car must be blue' or 'it must be under 20' and 'it must be

under 10'.
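
The gap described above can be illustrated with a deliberately crude stand-in for matching α against ¬α. The sketch below is not the technique of any of the cited authors; it only checks whether two sentences differ by a 'not' directly after 'must', and the function names are hypothetical. It shows that a purely negation-based check flags the contradictory pair but misses the contrary and subaltern examples from the text.

```python
def normalize(sentence: str) -> str:
    """Lower-case, drop a trailing period, and collapse whitespace."""
    return " ".join(sentence.lower().rstrip(".").split())


def strip_not(padded: str) -> str:
    """Remove a 'not' that directly follows 'must' in a space-padded sentence."""
    return padded.replace(" must not ", " must ")


def is_negation_pair(a: str, b: str) -> bool:
    """True if a and b differ only by a 'not' directly after 'must'."""
    pa, pb = f" {normalize(a)} ", f" {normalize(b)} "
    a_negated = " must not " in pa
    b_negated = " must not " in pb
    if a_negated == b_negated:
        return False  # both affirmative or both negated
    return strip_not(pa) == strip_not(pb)


pairs = [
    ("The car must be red", "The car must not be red"),  # contradictory
    ("The car must be red", "The car must be blue"),     # contrary
    ("It must be under 20", "It must be under 10"),      # subaltern
]
for first, second in pairs:
    print(first, "|", second, "->", is_negation_pair(first, second))
```

Detecting the second and third pair requires knowledge that 'red' and 'blue' exclude each other and that 'under 10' implies 'under 20', which a syntactic negation test does not provide.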

Other research papers that use NLP focus on categorizing requirements, such as

distinguishing between functional and non-functional requirements (Kurtanovic and Maalej

2017), addressing security-related requirements (Jindal et al. 2016) or finding contradictions

in prose texts (Karlova-Bourbonus 2019; Sepúlveda-Torres et al. 2021b).

5.5 Automation Method

The combination of formal logic and LLMs leverages their strengths to identify contradictions

between statements. Formal logic offers a formal framework for reasoning about relationships

between statements. At the same time, the LLMs' ability to access world knowledge can be

used to ask specific prompts, guiding ALICE to identify contradictions. The prompt engineering

process involves designing and fine-tuning prompts to elicit the desired behavior from the

model. This combination enables the model to perform logical inference and identify

contradictions scalably and flexibly. The resulting system can be used to evaluate the

consistency of arguments in natural language text. While we cannot provide a comprehensive

account of all the detailed procedures and optimizations used in the code, we will explain the

relevant steps in this context in the following sections.

5.5.1 Method Fundamentals

In this section, we will delve into the core components of our approach, including a decision

tree, formal logic, and GPT prompting.

5.5.1.1 Decision Tree

In our previous work, we implemented a classifier system to identify if and what type of

contradiction is present. In the present paper, we designed this system as a decision tree, as

shown in Figure 5-3. Here, seven questions must be answered to determine the specific type

of contradiction. The first four questions refer to the effects of the requirements, and the

Automated Requirement Contradiction Detection through formal logic and LLMs 45

following three questions refer to the condition, if any. One notable advantage of this method

is its modular nature, which makes it easy to adapt to future advances in the field: if more effective methods for answering individual questions are developed, they can be integrated without changing the rest of the tree. Based on these seven questions, we show how ALICE

leads to the desired goal and discuss the advantages of each approach (ALICE or LLMs-only)

in Section 5.6.

Figure 5-3: Modular decision tree for contradiction detection in RE

5.5.1.2 Formal Logic

To detect contradictions, we need the constituents, i.e., condition and effect, as well as variable

and action, as seen in Figure 5-4. This analysis is pivotal, establishing the core structure of our

method to address the seven questions proposed by our model effectively.

Take the requirement 'If the threshold is reached, the controller must limit the speed decay' as

an example. The segments 'If the threshold is reached' and 'the controller must limit the speed

decay' serve as the condition and the resultant effect, respectively. Within the condition, 'the

threshold' represents the variable, and 'is reached' the action. Similarly, within the effect, 'the

controller' is the variable and 'must limit the speed decay' the action. This is shown in Figure

5-4.
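The decomposition described above can be illustrated with a minimal sketch. It is not the grammatical model of Gärtner et al. (2023), which also relies on POS tags and grammatical rules; it only uses a small, hypothetical list of trigger words (`CONDITION_TRIGGERS`, `ACTION_STARTERS`) and assumes that condition and effect are separated by the first comma.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical trigger words chosen for illustration only.
CONDITION_TRIGGERS = ("if", "when", "while")
ACTION_STARTERS = ("must", "shall", "is", "are", "has", "have", "exceeds", "falls")


class Clause(NamedTuple):
    variable: str
    action: str


class Requirement(NamedTuple):
    condition: Optional[Clause]
    effect: Clause


def split_clause(text: str) -> Clause:
    """Split a clause into variable and action at the first action-starting word."""
    words = text.strip().rstrip(".").split()
    for i, word in enumerate(words):
        if i > 0 and word.lower() in ACTION_STARTERS:
            return Clause(" ".join(words[:i]), " ".join(words[i:]))
    return Clause(" ".join(words), "")


def parse_requirement(sentence: str) -> Requirement:
    """Split 'If <condition>, <effect>' into its constituents."""
    pattern = r"^\s*(?:%s)\s+(.+?),\s*(.+)$" % "|".join(CONDITION_TRIGGERS)
    match = re.match(pattern, sentence, flags=re.IGNORECASE)
    if match:
        return Requirement(split_clause(match.group(1)), split_clause(match.group(2)))
    # No trigger word found: the whole sentence is treated as the effect.
    return Requirement(None, split_clause(sentence))
```

Applied to the example requirement, this sketch yields 'the threshold' / 'is reached' for the condition and 'the controller' / 'must limit the speed decay' for the effect. Real specifications need the more robust grammatical model referenced in the text.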

In terms of implementation, this step involves utilizing symbolic AI using a grammatical model

that incorporates grammatical rules, trigger words, and POS tags to detect conditions, as

shown in (Gärtner et al. 2023).

It is important to note that conditional statements are not statements of causality. The presence

of oxygen, for example, is a condition for a match to light, whereas striking the match is the

cause. The lighting of the match is called the effect.

For the contradiction detection presented in this paper, we further extended this logic by

incorporating logical and mathematical rules, including string similarity comparisons and

mathematical operations.
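As an illustration of such rules, the following sketch compares two variables by string similarity and checks whether one numeric range includes another, which is the pattern behind subaltern pairs such as 'it must be under 20' and 'it must be under 10'. The function names, the similarity threshold of 0.9, and the recognized phrases are assumptions made for this example; open and closed interval bounds are not distinguished.

```python
import re
from difflib import SequenceMatcher
from typing import Optional, Tuple

Interval = Tuple[float, float]


def similar(a: str, b: str, threshold: float = 0.9) -> bool:
    """String similarity comparison; the threshold is an assumed value."""
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return ratio >= threshold


def to_interval(action: str) -> Optional[Interval]:
    """Map a numeric action to a (low, high) interval, or None if unsupported."""
    numbers = [float(n) for n in re.findall(r"-?\d+(?:\.\d+)?", action)]
    text = action.lower()
    if "between" in text and len(numbers) == 2:
        return (min(numbers), max(numbers))
    if ("less than" in text or "under" in text) and len(numbers) == 1:
        return (float("-inf"), numbers[0])
    if ("greater than" in text or "above" in text) and len(numbers) == 1:
        return (numbers[0], float("inf"))
    return None


def includes(outer: Interval, inner: Interval) -> bool:
    """True if the inner interval lies completely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]
```

For instance, `includes(to_interval("must be between 10 and 20"), to_interval("must be between 12 and 15"))` evaluates to `True`, which would indicate that one effect encompasses the other.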

Figure 5-4: Showcase of Condition and Effect, as well as Variable and Action in a formal requirement

5.5.1.3 GPT

To leverage the power of LLMs while addressing the unpredictability of their responses, we

adopted a frozen model approach using GPT3. Utilizing this version ensured no updates would

be applied to the model during our experiments, providing consistency in its behavior. The later

models, such as GPT3.5 and GPT4, are subject to constant updates by OpenAI (openai

2023a). For the sake of completeness, we have nevertheless tested and briefly evaluated the

newer models. In our implementation, we employed the following configuration settings: the

engine was set to text-davinci-003, the prompt was provided as the input, a temperature of 0

was used to make the responses as deterministic as possible, the maximum number of tokens was set to 5, top-p was set to 1 so that nucleus sampling did not truncate the token distribution, and both the frequency penalty and presence penalty were set to 0 so that neither influenced the generation

process. This configuration allowed us to explore the capabilities of the frozen GPT3 model

and examine its outputs in a controlled and predictable manner.
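The settings listed above can be collected in a single request payload, as in the following sketch. The values are taken from the text; the keyword names follow the legacy OpenAI completions interface as we understand it and should be treated as an assumption, and `build_completion_request` is a hypothetical helper. No request is sent.

```python
def build_completion_request(prompt: str) -> dict:
    """Collect the frozen-model settings described in the text."""
    return {
        "engine": "text-davinci-003",  # frozen GPT3 model
        "prompt": prompt,              # the question posed to the model
        "temperature": 0,              # most likely token is always chosen
        "max_tokens": 5,               # short answers such as 'yes' or 'no'
        "top_p": 1,                    # no nucleus truncation
        "frequency_penalty": 0,        # no penalty on repeated tokens
        "presence_penalty": 0,         # no penalty on already used tokens
    }
```

Keeping the payload in one place makes it easier to rerun all questions with identical settings.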

5.5.2 Preprocessing

In NLP tasks, preprocessing is critical in preparing unstructured textual data for analysis.

Preprocessing aims to convert raw text into a structured and meaningful format suitable for

further analysis. Our research builds on our prior work (Gärtner et al. 2023), making it part of

ALICE. It utilizes a fully automated preprocessing pipeline consisting of the following stages:

(1) data reduction to replace or eliminate special characters such as '<' with their corresponding

expressions; (2) text parsing to generate dependency trees and perform part-of-speech

tagging; and (3) identification of verbal expressions, including conditions, effects, variables,

and actions, collectively referred to as constituents, as explained in Section 5.5.1.2.
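Stage (1) can be sketched as a simple substitution step. The mapping below is hypothetical: the text only names '<' as an example, so the other symbols and all replacement expressions are assumptions for illustration.

```python
import re

# Hypothetical mapping; multi-character symbols come first so that
# '<=' is not split into '<' and '='.
SPECIAL_CHARACTERS = {
    "<=": "less than or equal to",
    ">=": "greater than or equal to",
    "<": "less than",
    ">": "greater than",
}


def reduce_special_characters(text: str) -> str:
    """Replace special characters with their corresponding expressions."""
    for symbol, expression in SPECIAL_CHARACTERS.items():
        text = text.replace(symbol, f" {expression} ")
    # Collapse the extra whitespace introduced by the replacements.
    return re.sub(r"\s+", " ", text).strip()
```

With this mapping, 'x must be < 10' becomes 'x must be less than 10', which the later parsing stages can handle as ordinary text.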

5.5.3 Questions

The following section summarizes the seven key questions that form the basis of the ALICE

methodology for detecting contradictions within requirement specifications. Each question

targets a specific aspect of potential contradictions, enabling a structured and thorough

analysis. Detailed examples and GPT-based prompts for each question are provided in the

appendix.

1. Variable Identity: Identifies whether the two requirements' variables are identical or

subsets.

2. Effect Inclusivity: Investigates whether one effect encompasses another, which is

essential for identifying subaltern contradictions.

3. Mutual Action Exclusivity: Identifies directly opposing actions, indicating

contradictions where the truth of one negates the other.

4. Mutual Action Inconsistency: Assesses if actions are inherently contradicting

without being direct opposites, which is essential for identifying contraries.

5. Condition Presence: Identifies conditional clauses that might affect the interpretation of requirements, helping eliminate Simplex contradictions.

6. Condition Equivalence: Compares conditions in two requirements to detect Idem contradictions, which require the same condition.

7. Condition Co-Occurrence: Evaluates the possibility of two conditions coinciding, helping eliminate Alius contradictions.
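How the answers to the seven questions could be combined into one of the nine contradiction types is sketched below. The exact branching of Figure 5-3 is not reproduced in this section, so the order of checks, the field names, and the `classify` function are assumptions derived from the question descriptions above, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answers:
    same_variable: bool            # Q1: variable identity
    effect_included: bool          # Q2: effect inclusivity
    actions_exclusive: bool        # Q3: mutual action exclusivity
    actions_inconsistent: bool     # Q4: mutual action inconsistency
    has_condition: bool            # Q5: condition presence
    same_condition: bool = False   # Q6: condition equivalence
    can_co_occur: bool = False     # Q7: condition co-occurrence


def classify(a: Answers) -> Optional[str]:
    """Return the contradiction type, or None if no contradiction is found."""
    # Questions 1 to 4 concern the effects.
    if not a.same_variable:
        return None
    if a.effect_included:
        relation = "subaltern"
    elif a.actions_exclusive:
        relation = "contradictory"
    elif a.actions_inconsistent:
        relation = "contrary"
    else:
        return None
    # Questions 5 to 7 concern the conditions, if any.
    if not a.has_condition:
        return f"Simplex {relation}"
    if a.same_condition:
        return f"Idem {relation}"
    if a.can_co_occur:
        return f"Alius {relation}"
    return None
```

Because each question is a separate field, the component that answers it (formal logic or an LLM prompt) can be exchanged without touching the classification logic.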

5.6 Validation and Results

We adopt a structured empirical validation protocol based on established research

methodologies to ensure a rigorous and systematic evaluation of ALICE. Our approach is

delineated by specific research questions (RQs) designed to assess the effectiveness and

applicability of ALICE in identifying and resolving contradictions in engineering requirements.

RQ1: How does ALICE compare to existing LLM approaches regarding accuracy and recall in

detecting contradictions within complex specification sheets?

RQ2: How does employing ALICE affect the efficiency and scalability of contradiction detection

processes in real-world datasets?

5.6.1 Data Analysis and Methodology

We used three datasets in our study. Dataset 1 was compiled specifically for this study and

served as an initial test suite for developing our method. This dataset, see Section 5.11,

includes all possible types of contradicting requirements pairs as well as non-contradicting

pairs. Datasets 2 and 3 were obtained from a recent real-world electric bus project: A

comprehensive set of requirements was created to outline a modular system intended to

replace traditional bus powertrains with modern electric ones. The significance of requirements

in developing electric buses can be found in the publication Design of urban electric bus

systems (Göhlich et al. 2018). Usually, real-world specification sheets cannot be shared with

the public due to confidentiality concerns. Nevertheless, we have taken steps to anonymize

210 requirement pairs and have provided access to them, as detailed in Section 9.

Dataset 2 contains 1,071 requirement pairs, which have been manually checked and labeled

for contradictions. Hence, it can be used to validate our method. Dataset 3 contains 3,916

pairs, which were not manually checked. On the one hand, it was used to show that the method

can handle large datasets, and on the other hand, it served to compare ALICE and GPT3.

Our datasets are inherently unbalanced, reflecting the real-world scenario where the vast

majority of requirements do not contradict each other. This imbalance is not a flaw but a feature

of using authentic datasets, where contradictions are naturally rare when comparing each

requirement against others. Our analysis methods are tailored to recognize and effectively

handle such skewness, aiming to accurately identify the relatively few, yet critically important,

contradicting combinations. This approach underscores the practical applicability of our

method in real-world settings, where the objective is not to balance data artificially but to mirror

and navigate the complexities of actual requirement specifications.

Additionally, rather than employing machine learning techniques to train on the unbalanced

dataset, our methodology leverages formal logic and pre-trained LLMs, reinforcing the

method's adaptability and efficacy in handling the intricacies of requirement specifications

without the need for dataset balancing. Traditional machine learning techniques often

necessitate a significant quantity of labeled data to effectively capture the underlying features

and patterns within a specific problem domain. Because no large dataset specifically tailored to contradicting requirements exists and the available datasets are inherently unbalanced, incorporating such ML techniques is infeasible.

We compared the results of ALICE to the results when using only LLMs. We conducted an

evaluation solely focused on detecting the presence of contradictions without classifying the

specific type of contradiction: While ALICE demonstrated the capability to identify specific

types of contradictions, LLMs are not equipped with this classification awareness.

Contradictions in specialized industrial domains hinge on subtle semantic details. However,

LLMs are trained on broad datasets, which limits their ability to specialize in such nuanced

tasks. Thus, we limited the evaluation to general contradiction detection to ensure

comparability.

Finally, our evaluation did not consider methods using only formal logic, see Section 5.4.2. As

mentioned, traditional context-based word embedding algorithms like Word2Vec or GloVe are

ineffective for mapping contrasting words (Li et al. 2017). For instance, they would both

struggle to detect that a car can drive and be red simultaneously, but it cannot drive and be in

the air simultaneously. The latter example represents an apparent contradiction that would

require real-world knowledge to identify.

5.6.2 Results for Dataset 1

We present the first dataset's results based on the method presented in Section 5.5. It

comprises 87 requirement pairs, of which 61 exhibit contradictions. The selection of the

contradictory pairs encompasses all nine types of contradictions and considers as many

variations in their formulations as possible.

An extract of the 87 pairs is shown in Table 5-1, with one example for each type of contradiction

and some examples that are not contradictions. Table 5-2 displays the confusion matrices as

tabular representations of ALICE and GPT3 performance. In these matrices, True Positives

and True Negatives indicate correct predictions for contradictions and non-contradictions,

respectively. False Positives and False Negatives represent errors where non-contradictions

were mistaken for contradictions and vice versa. From these figures, we derive key metrics

like accuracy, precision, and recall, which reflect the model’s performance in identifying true

contradictions among the dataset (Shultz et al. 2010).
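For reference, the three metrics follow directly from the four cells of a confusion matrix, as in the sketch below. The function name is hypothetical, and returning 0 for an undefined ratio (for example, precision when nothing was flagged) is an assumption of this sketch.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, and recall from confusion matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Share of all pairs that were classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of flagged pairs that are actual contradictions.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual contradictions that were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

With illustrative counts (not taken from the tables), a classifier that never flags a pair on 100 pairs containing 5 contradictions, `confusion_metrics(tp=0, fp=0, fn=5, tn=95)`, still reaches an accuracy of 0.95 while its recall is 0. This is why recall and precision are more informative than accuracy on unbalanced data.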

Table 5-1: Extract from Dataset 1

Expected Type | Requirement 1 | Requirement 2

Simplex subaltern | x must be less than 12. | x must be less than 10.

Simplex contrary | The playground must be fun for adults and children. | The playground must only be fun for children.

Simplex contradictory | x must be equal to three. | x must not be equal to three.

Idem subaltern | If it rains, x must be between 10 and 20. | If it rains, x must be between 12 and 15.

Idem contrary | If it rains, the car must stand. | If it rains, the car must drive.

Idem contradictory | If it rains, x must be equal to 3. | If it rains, x must be unequal to 3.

Alius subaltern | If it rains, x must be between 10 and 20. | If I sing, x must be between 12 and 15.

Alius contrary | If the value of the signal LapVeh_FueCur exceeds the value 0 (A), the signal Chrg must be set to TRUE. | If the parameter Chrg_SubVal is set to TRUE, the signal Chrg corresponds to the parameterizable value Chrg_SubValChrg.

Alius contradictory | If y is equal to 4, x must be equal to 3. | If z is equal to 4, x must be unequal to 3.

(no contradiction) | The application must be able to run on multiple platforms and operating systems. | The application must integrate easily with other existing software systems.

(no contradiction) | If the signal Pfx is equal to three, y must be equal to 5. | If the signal Pfx is equal to three, z must be equal to 5.

(no contradiction) | If the signal Pfx is equal to three, y must be equal to 5. | If the signal Pfx is equal to 4, y must be equal to 6.

…

Table 5-2: Results for the reference dataset (Dataset 1) in the form of confusion matrices (expected vs. calculated)

ALICE: […]

GPT3: […]

The LLM-only prompts were formulated as shown in Figure 5-5. The return value is then checked for the string 'yes'; if it is present, the answer is considered positive.
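The LLM-only baseline can be sketched in a few lines. The prompt wording follows the pseudo code of the LLM-only prompt; the function names are hypothetical, and the case-insensitive substring match is an assumption, since the text only states that the return value is searched for 'yes'.

```python
def build_prompt(req1: str, req2: str) -> str:
    """Fill the two requirements into the LLM-only prompt template."""
    return (
        "Do the following sentences contradict each other, yes or no:\n"
        f"1. {req1}\n"
        f"2. {req2}"
    )


def is_contradiction(response: str) -> bool:
    """Treat the answer as positive if it contains the string 'yes'."""
    return "yes" in response.lower()
```

A plain substring test is deliberately simple; it would also match words that merely contain 'yes', which is tolerable here because the response is limited to a few tokens.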

ALICE achieved an accuracy of 72%, a recall of 75%, and a precision score of 83%. On the

other hand, the LLM-only method achieved an accuracy of 47%, a recall of only 32%, and a

precision score of 80%. Attempts to guide the model’s behavior using n-prompting proved

unsuccessful. Several attempts were made, mainly using examples of contradictions and the

seven questions described in Section 5.5.3. N-prompting, or “chain of thought” prompting,

involves guiding a language model through a series of logical steps to improve its problem-

solving accuracy. This technique mimics human reasoning by breaking down complex

problems into simpler, articulated steps before concluding.

ALICE flags more pairs as contradictions (i.e., 9+45=54) than the LLM (i.e., 5+20=25). This is even clearer in the following datasets.

Figure 5-5: LLM prompt - pseudo code

5.6.3 Results for Dataset 2

In this section, we analyzed 1,071 combinations to validate our method. ALICE achieved an

accuracy of 99%, a recall of 60%, and a precision score of 94%. On the other hand, the LLM-

only method achieved an accuracy of 97%, a recall of 0%, and a precision score of 0%. Due

to unbalanced data, the accuracy alone may not be a highly relevant metric. The recall, which

measures the proportion of correctly identified positives, provides a better understanding of

the results. Accordingly, ALICE detected 60% of all contradictions, whereas the LLM method

did not detect any. For a more comprehensive analysis of the results, the confusion matrices

are shown in Table 5-3.

Table 5-3: Confusion matrices for Dataset 2 (expected vs. calculated)

ALICE: 1044 […]

GPT3: 1041 […]

An example of an accurately identified contradicting pair (true positive) by ALICE is as follows:

1. If the value of the signal B_LdnTrvOfSin3Load_fKlLdngr exceeds the value of 0, the

value of the segment output signal LdnProcTrvtv must be set to True.

Formal: $\text{if } \mathit{condition}_1 \Rightarrow \mathit{signal} \overset{!}{=} \mathit{True}$

2. If the Parameter Load_DomWrtTgtLdnProcTrvtvToSW is set to True, the signal

LdnProcTrvtv corresponds to the configurable value Load_DomWrtLdnProcTrvtv.

Formal: $\text{if } \mathit{condition}_2 \Rightarrow \mathit{signal} \overset{!}{=} \mathit{config.}$

Prompt (Figure 5-5): Do the following sentences contradict each other, yes or no:

1. {Req1}

2. {Req2}

This is a contradiction of the type Alius contrary since it does not exclude that $\mathit{condition}_1$ and $\mathit{condition}_2$ may occur at the same time, while the effects ($\mathit{signal} \overset{!}{=} \mathit{True}$ and $\mathit{signal} \overset{!}{=} \mathit{config.}$) contradict each other.

It is a potential contradiction because it is theoretically feasible that the conditions have been

defined in another place so that parallel occurrence is excluded. Automated detection of such

scenarios is not within the scope of this paper but is conceivable for the future. Nonetheless,

a manual verification was conducted on a selective basis, where the absence of corresponding

exclusions was confirmed for the prior example: The first requirement belongs to the Function

section, while the subsequent requirement belongs to the Manual Output section. It could be

assumed that the Manual Output supersedes all other requirements due to its later placement;

however, this assumption lacks clarity and unambiguity. Sequential processing of

requirements must be explicitly defined and is not inherently evident. On the contrary, in cases

where sequentially overwriting actions are present, they must be distinctly indicated.

Therefore, this is indeed a real contradiction.

An example of a requirement pair falsely identified as Alius contrary (false positive) by ALICE

is as follows:

1. If the value of the signal B_Mt[...]Mngt is above the value of Load_DebVwTgtInpSig == 1s

for longer than Load_ThdOfLdnTrvMtl == 0.8, the value of the segment output signal

LdnProcTrvtv must be set to True.

Formal: $\text{if } \mathit{condition}_1 > C \Rightarrow \mathit{signal} \overset{!}{=} \mathit{True}$

2. If none of the input signals is above the value of 0 and the value of the signal

B_Mt[...]Mngt falls below the value of Load_DebVwTgtInpSig == 5s for longer than

Load_ThdOfLdnTrvMtl == 0.8, the value of the segment output signal LdnProcTrvtv must

be set to False.

Formal: $(\text{if } \mathit{condition}_2) \wedge (\text{if } \mathit{condition}_3) \Rightarrow \mathit{signal} \overset{!}{=} \mathit{False}$

The issue, in this case, traces back to the response provided for question 7. ALICE’s approach

fails to distinguish between 'above x' (condition_1) and 'below x' (condition_3) as mutually

exclusive, leading it to conclude that both conditions can co-occur. This would result in contradicting effects, thereby rendering the requirements contradictory. It is important to note that, in the current configuration, question 7 is answered by GPT3. It is possible that more advanced LLMs could yield better results in the future.
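For purely numeric threshold conditions, mutual exclusivity could in principle also be checked by a rule instead of an LLM. The sketch below is such a swapped-in rule-based check and not part of ALICE as described here; the `ThresholdCondition` structure and the restriction to the comparators 'above' and 'below' are assumptions for illustration.

```python
from typing import NamedTuple


class ThresholdCondition(NamedTuple):
    signal: str
    comparator: str  # assumed to be either "above" or "below"
    value: float


def can_co_occur(a: ThresholdCondition, b: ThresholdCondition) -> bool:
    """Return False only if the two conditions provably exclude each other."""
    if a.signal != b.signal:
        # Different signals: nothing rules out a parallel occurrence.
        return True
    if a.comparator == b.comparator:
        # Both 'above' or both 'below': the ranges always overlap.
        return True
    above, below = (a, b) if a.comparator == "above" else (b, a)
    # 'above v1' and 'below v2' overlap only if v1 is smaller than v2.
    return above.value < below.value
```

Under these assumptions, 'above x' and 'below x' on the same signal are reported as mutually exclusive, which is the distinction that was missed in the false positive discussed above.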

An example of a falsely identified contradicting pair (false positive) when relying solely on

GPT3 (LLM-only) and not using ALICE is as follows:

1. If the CAN status bit B_NchPsst12333 has the value False, an error must be written to

the segment-internal error bus.

Formal: 𝑖𝑓 𝑐𝑜𝑛𝑑𝑖𝑡𝑖𝑜𝑛_1 ⇒ 𝑥 =

!𝑒𝑟𝑟𝑜𝑟

2. If the CAN status bit BVwLoad_sOfNch12333 has the value True, an error must be

written on the segment-internal error bus.

Formal: 𝑖𝑓 𝑐𝑜𝑛𝑑𝑖𝑡𝑖𝑜𝑛_2 ⇒ 𝑥 =

!𝑒𝑟𝑟𝑜𝑟

When asked to justify this answer, GPT explains, 'Since these are different CAN message

status bits, both one and the other cannot be fulfilled at the same time. Therefore, the

requirements contradict each other.' ALICE correctly identified this as not contradicting.

5.6.4 Application to large datasets

The third dataset consists of 3,916 combinations of requirements, which ALICE took approximately four hours to process. GPT-3 took 0.9 seconds on average per requirement pair, resulting in approximately one hour for the whole dataset. ALICE achieved a precision score of 86%, meaning that most of the pairs it flagged were actual contradictions. In contrast, GPT3's precision score was 0%. Our

focus during this analysis was not on optimizing for speed or code efficiency; thus, not all

combinations could be thoroughly analyzed within the scope of this research. We did not

employ parallel processing techniques, indicating a potential for future performance

enhancements. Nevertheless, the analysis successfully showcased the method's capability to

process large datasets and underlined the combined method's advantage over solely using

GPT-3, as shown in Table 5-4.
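The reported runtimes can be checked with a short calculation, assuming that the four hours refer to the complete ALICE run and that the 0.9 seconds apply to every pair:

```latex
\begin{align*}
t_{\text{GPT-3}} &\approx 3916 \times 0.9\,\text{s} \approx 3524\,\text{s} \approx 59\,\text{min},\\
\bar{t}_{\text{ALICE}} &\approx \frac{4 \times 3600\,\text{s}}{3916} \approx 3.7\,\text{s per pair}.
\end{align*}
```

Under these assumptions, ALICE needs roughly four times as long per pair as the single GPT-3 prompt, which is plausible given that it answers several questions per pair.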

Table 5-4: Results for Dataset 3 in the form of confusion matrices (expected vs. calculated)

ALICE: 3848 […]

GPT3: 3881 […]

5.6.5 LLM Comparison

We evaluated the datasets using not only GPT3 but also GPT-3.5 turbo and GPT-4 as well as

LLaMA by Meta. Our findings indicate that GPT-3.5 turbo had a limited performance, detecting

only one contradiction. On the other hand, GPT-4 detected five contradictions, marginally

underperforming when compared to GPT-3. LLaMA detected three contradictions. The

difference between GPT-3 and GPT-3.5 is particularly evident in question 4, 'Are the actions mutually inconsistent?', which is posed as two sub-questions. Question 1: Does 'must be set to true' mean the same as 'corresponds to the parameterisable value Ldn_C'? Question 2: Could the following statements potentially contradict each other? 'it must be set to true' and 'it must correspond to the parameterisable value Ldn_C.'

Table 5-5 depicts the divergent responses obtained from GPT-3, GPT-3.5, GPT-4, and LLaMA,

with GPT-3.5 yielding a less satisfactory answer than GPT-3 and GPT-4. While GPT-4's

response is akin to that of GPT-3, its computational time is notably higher, and thus, we opted

to employ GPT-3. LLaMA comprehends problems or inquiries accurately but frequently derives

erroneous conclusions, generating increasingly imaginative responses.

Table 5-5: Different answers generated by GPT3, GPT3.5, GPT4 (22.03.2023) and LLaMA

Answers to Question 1

GPT3 (text davinci): No, they do not mean the same thing.

ChatGPT (GPT3.5 Turbo): Yes, 'must be set to true' can mean the same as 'corresponds to the parameterisable value Ldn_C' if that specific value requires the parameter to be set to true.

ChatGPT (GPT4): No, 'must be set to true' means that the value must be set as true, while 'corresponds to the parameterisable value Ldn_C' means that it should match the given parameter value.

LLaMA: no. LdN_F is a switch in the SIM that sets the simulator to either use the local directory name (Ldn_F), or not (Ldn_T) […]

Answers to Question 2

GPT3 (text davinci): Yes, these statements could potentially contradict each other.

ChatGPT (GPT3.5 Turbo): No, the two statements do not potentially contradict each other.

ChatGPT (GPT4): Yes, these statements could potentially contradict each other if the parameterisable value Ldn_C is not set to true. If Ldn_C is set to true, then there is no contradiction.

LLaMA: no. We cannot have more than one such statement in any given model, but if both are present in a particular model they could easily refer to different parameters.

This outcome is likely not attributable to a lack of quality in the model but instead stems from

an alternative objective for which the model was intended. An illustrative example of LLaMA's

propensity to stray further from the target with increasing response length is provided in Table

5-5. The first answer was accurate; however, subsequent elaboration led to irrelevant content.

The initial response to the second question was inaccurate, but the ensuing explanation was

correct. Therefore, one could even argue that the second answer is an oxymoron, as it

contradicts itself. Notably, no definitive comparison between LLaMA and GPT can be drawn

from this study. It is plausible that the model's performance may vary depending on the nature

of the input stimuli. Future investigations utilizing different input conditions may yield more

informative insights into the comparative performance of the two models.

5.6.6 Criticality Assessment

ALICE allows conclusions to be drawn about the criticality of requirements and, to some extent, about the development costs. Contradictions of the types Idem (e.g., $A \Rightarrow x \overset{!}{=} c$ and $A \Rightarrow x \overset{!}{=} k$) and Simplex (e.g., $x \overset{!}{=} c$ and $x \overset{!}{=} k$) are always considered real contradictions because either there is no condition, or the conditions are identical. However, contradictions of the Alius type (e.g., $A \Rightarrow x \overset{!}{=} c$ and $B \Rightarrow x \overset{!}{=} k$) are potential contradictions. As noted in Section 5.6.3, they are only potential contradictions because it is theoretically possible for the conditions to have been defined in a manner that precludes their parallel occurrence. While this paper does not address the automated detection of such scenarios, it remains a possibility for future exploration.
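This distinction can be expressed as a simple lookup on the first word of the contradiction type. The function name and the labels 'real' and 'potential' are chosen for this sketch only.

```python
def criticality(contradiction_type: str) -> str:
    """Map a contradiction type such as 'Idem contrary' to its criticality."""
    family = contradiction_type.split()[0].lower()
    if family in ("simplex", "idem"):
        # No condition, or identical conditions: always a real contradiction.
        return "real"
    if family == "alius":
        # Different conditions that may still exclude each other elsewhere.
        return "potential"
    raise ValueError(f"Unknown contradiction type: {contradiction_type}")
```

A reviewer could use such a label to prioritize real contradictions and to check the potential ones manually.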

Furthermore, question 5, ‘Is there a condition?’ provides additional insight, as noted by Frattini

et al. (2022b), regarding the correlation between the existence of conditions, lead times, and

volatility of requirements. Fischbach's findings suggest that conditions with a strict semantic

structure can lead to a more understandable requirement, which can subsequently facilitate

the translation of the requirement into downstream artifacts, such as code or test cases.

5.6.7 Conclusion

After conducting a thorough evaluation, we now return to our initial research questions to

contextualize our findings within the broader objectives of our study.

RQ1: How does ALICE compare to existing LLM approaches regarding accuracy and recall in

detecting contradictions within complex specification sheets?

Our analysis reveals that ALICE demonstrates a notable improvement in both accuracy and

recall rates over LLM-only approaches. As evidenced by the results in Dataset 1, ALICE

achieved an accuracy of 72%, a recall of 75%, and a precision score of 83%, significantly

outperforming the LLM-only method, which registered an accuracy of 47%, a recall of 32%,

and a precision score of 80%. For Dataset 2, ALICE achieved an accuracy of 99%, a recall of

60%, and a precision score of 94%. On the other hand, the LLM-only method achieved an

accuracy of 97%, a recall of 0%, and a precision score of 0%. These findings underscore ALICE's

ability to detect contradictions accurately, highlighting the effectiveness of integrating formal

logic with LLMs.

RQ2: How does employing ALICE affect the efficiency and scalability of contradiction detection

processes in real-world datasets?

The approximately 4-hour-long evaluation of Dataset 3, comprising 3,916 requirement pairs,

demonstrates ALICE's proficiency in managing large datasets efficiently with a precision score

of 86%. Despite the inherently unbalanced nature of real-world datasets, ALICE's methodology

was adept at identifying the relatively few but critically important contradicting combinations.

These results have been critically evaluated and validated as actual contradictions by the

second author, who brings over two decades of experience at a major automotive OEM in

Germany. His last role involved responsibility for concept design and component integration of

various vehicles, underscoring his requirements engineering expertise (Knothe et al. 2006;

Göhlich 2008). This showcases ALICE's practical applicability and scalability, marking a

significant step in automating contradiction detection in engineering requirements.

5.7 Limits

Several limits to the method must be considered: on the one hand, limits of the LLM and, on the other, limits of formal logic.

5.7.1 Limits of LLMs for Contradiction Detection

LLMs lack adequate data to facilitate the identification of specific types of contradictions.

Therefore, we consider the limits of general contradiction detection with LLMs.

Detecting fundamental contradictions is a straightforward task for LLMs, and they can also explain the corresponding reasoning, as shown in Figure 5-6:

Figure 5-6: GPT3 has enough knowledge to detect and explain the present contradiction

Nevertheless, LLMs do not perform accurately in detecting contradictions according to the

definition in Section 5.3.1. It is worth noting that the model's response is not necessarily

incorrect but rather inappropriate in the RE context. It is plausible that a human, unaware of

the context, might respond similarly to GPT3, as shown in Figure 5-7.

Prompt (Figure 5-6): Are the following sentences contradictory?

1: The car shall stand.

2: When it rains, the car shall drive.

Response: Yes, these sentences are contradictory.

Follow-up prompt: Elaborate.

Response: Sentence 1 implies that the car should always stand, no matter what the weather is like. Sentence 2 implies that when the car shall drive when it rains, which contradicts the first sentence.

Figure 5-7: GPT3 is not able to properly detect the present contradictions. Firstly, the sentences are not

conditions but mere sentences, and secondly, although the conditions can indeed be true at the same time, this

would lead to contradicting effects.

Another example of LLM shortcomings was mentioned in Section 5.6.3 as the false positive example concerning the opposite words below and above, which were not correctly classified. One way to address such shortcomings is to combine the LLM with additional machine learning models, such as the one for detecting opposing words by Li et al. (2017).

Furthermore, an inconsistent usage of the words must and shall can lead to issues with LLMs. In some cases, the model did not recognize these words as synonyms and

labeled the effects differently. In general, however, one should refrain from using both words

in the same document when writing requirements (Sophist GmbH 2016).

Also, LLMs have trouble with strings that are too long. For example, a parameter name such

as OpfsOpfsCnvrGfgsddhFTiOutWRTEForHGFoUftToGhfS cannot be analyzed as it is a 'long

and complex string that doesn't provide any meaningful information to perform a character

comparison' according to GPT. The string is analyzed correctly after inserting an underscore into the signal name, such as OpfsOpfsCnvrGfgs_ddhFTiOutWRTEForHGFoUftToGhfS.

Finally, the exact wording matters when specifying requirements. For example, using the phrase 'it must be set equal to TRUE' in one requirement and 'it must not be manipulated' in another is not detected as a contradiction. However, replacing the second phrase with 'it must not be set' leads to a correct contradiction detection. Although this may not be immediately

apparent, employing precise and consistent language in technical writing is critical to prevent

errors and misinterpretations. Thus, adhering to formal language conventions when dealing

with NLP tasks in technical documents is highly recommended.

5.7.2 Limits of Formal Logic

Filler words can introduce errors in natural language processing tasks. For instance, consider

the sentence 'Amy and I both have to fight him,' where the term both is treated as the variable

instead of Amy and I. On the other hand, when comparing it to a second sentence, 'Neither

Amy nor I have fought against him,' Amy and I would appropriately be identified as the

variables. No contradiction would be observed in this case because the variables are

considered distinct. Eliminating the filler word both enables accurate identification of the

contradiction in the sentence. These findings suggest that filler words should be cautiously

treated in natural language processing applications to ensure accurate results.

Passive sentence constructions are prone to misinterpretation and should be avoided. For

instance, the statement 'There must be no manipulation of Com_Batt' is better expressed as

'The function must prevent manipulation of Com_Batt.' Using active sentence structures

enhances clarity and enables the algorithm to identify the variable performing the action. This

aligns with general rules for writing requirements that aim to reduce ambiguity (Sophist GmbH

2016).

Prompt: Are the following sentences contradictories?

1: If the actual heater stage „Cbn[...]“ is greater than 0, the requested pump

power „Cbn_Sp[...]“ must be limited by the parameter „Cbn_Tri[...]“.

2: If „Batt[...]“ is false, the requested pump power „Cbn_Sp[...]“ must be

limited to 20%

Response: No, these sentences are not contradictories. They are both conditions

that could be true at the same time.

Automated Requirement Contradiction Detection through formal logic and LLMs 56

Improper placement of commas in complex sentences can result in erroneous condition

detection, leading to inaccurate identification of variables and actions. As such, it is crucial to

ensure that commas are placed correctly in technical writing. Failure to do so may introduce

flawed interpretations and subsequent errors. Consequently, it is advisable to exercise caution

when using complex sentence structures and to verify that commas are placed accurately to

enhance clarity and precision.

The creation of neologisms should be minimized in technical writing. In cases where they

cannot be avoided, adding them to a dictionary is crucial to avoid being detected as out-of-

vocabulary terms. Therefore, it is advisable to employ established terminology whenever

possible.

5.7.3 Threats to Validity

In conducting this research, we have identified and addressed several potential threats to the

validity of our findings, guided by established frameworks in empirical research (Wohlin et al.

2012).

External Validity: The study does not provide sufficient grounds to make claims about the

generalizability of the results for non-formal requirements, as only formal requirements were

analyzed. Focusing on formal requirements was driven by their prevalence and criticality in the

domains studied. However, it should be noted that the basis of our investigation is generally

applicable grammatical rules. Also, the analyzed data sets follow rules automakers generally

apply when writing formal requirements (Ahmad et al. 2020; Dick 2017; Sophist GmbH 2016).

Further investigation and comparison with additional data sets are necessary to address

contextual factors such as the domain, development process, requirements engineering

techniques, and technologies used.

Internal Validity: We have minimized researcher and confirmation bias through a

standardized methodology based on validated grammatical rules, reducing subjective

interpretation. Future enhancements could benefit from practitioner insights and further

validation studies.

Construct Validity: In our study, we have taken careful steps to ensure that the operational

definitions used closely align with the theoretical constructs we aim to investigate, a critical

component for maintaining construct validity and ensuring our measurements reflect what we

intend to measure.

Conclusion Validity: We have not further detailed the statistical methods employed to affirm

the reliability of our conclusions. Robust statistical analysis is essential to verify that the

relationships observed in our study are statistically significant and not due to chance, thus

underpinning the integrity of our study's conclusions. The potential for overfitting the

methodology to the data is a legitimate concern, given the limited availability of a sizable

dataset upon which the approach was originally devised.

By acknowledging these limitations and areas for improvement, we underscore our

commitment to advancing automated contradiction detection in engineering requirements.

5.8 Conclusion and Outlook

This paper contributes ALICE, a fully automated contradiction detection method for requirements written in controlled natural language that combines formal logic and LLMs. This hybrid method involves prompt engineering, which entails designing and refining prompts to elicit the desired behavior from the model, such as identifying contradictions. The seven-step process is scalable and can be adapted to future advancements in the field, providing a modular way to evaluate the consistency of arguments.

The performance of ALICE and a purely LLM-based method for automated contradiction

detection was evaluated in this study. ALICE achieved a higher accuracy (99%), recall (60%),

and precision (94%) than the LLM-only method (97%, 0%, and 0%). These findings show that

ALICE is more effective than LLM approaches in identifying contradicting formal requirements.
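The gap between a high accuracy and a recall of zero can be made concrete with a small calculation. The sketch below uses the standard definitions of the three metrics. The confusion-matrix counts are hypothetical values chosen only for illustration; they are not the counts from this study.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Convention used in this sketch: undefined ratios are reported as 0.0.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical, imbalanced example: 3 contradicting pairs among 100 pairs,
# and a detector that never reports a contradiction.
hypothetical = classification_metrics(tp=0, fp=0, fn=3, tn=97)
print(hypothetical)
```

When contradicting pairs are rare, a detector that reports nothing still reaches a high accuracy, which is why recall and precision are reported alongside it.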


However, the limitations of both LLMs and formal logic in detecting contradictions in technical

documents were also highlighted. LLMs may produce false positives due to the use of opposite

words and lack adequate data to accurately detect specific types of contradictions. On the

other hand, formal logic may be misled by filler words and improper placement of commas,

and neologisms should be avoided to ensure accurate condition detection. To ensure precision

and reduce ambiguity, it is recommended to use active sentence structures and established

terminology in technical writing.

Combining different machine learning models, such as the detection of opposing words by Li

et al. (2017), could be a solution to address these shortcomings. Investigating tailored models

should be a priority for future research, as our method is predestined to be customized. On the

one hand, various approaches to answering the seven questions should be explored. On the

other hand, experimenting with different thresholds for string comparisons might prove

beneficial, especially when dealing with datasets that adhere to different formulation rules.

Also, we validated the method on requirements at lower, more detailed system levels. Although

we believe it would also work on higher levels, a thorough validation with actual requirements

should be conducted since the reference dataset included such requirements.

Our study focuses on detecting contradictions between pairs of requirements due to their

significant impact. We recognize the potential for contradictions among multiple requirements.

Addressing these would require advanced analysis methods. Future work could explore

extending our methodology to tackle these more complex scenarios, enhancing contradiction

detection in requirements engineering for complex system development.

Furthermore, using the presented method, it is now possible to create labeled datasets – which

do not yet exist, as explained in Section 5.6. This could enable the potential use of individual

ML models for contradiction detection in the future.

Alius contradictions are potential contradictions due to theoretically infeasible conditions, as

explained in Sections 5.6.3 and 5.6.6. An automated analysis of such conditions was not within

the scope of this paper; however, it is conceivable for the future.

A subsequent area of exploration involves the integration of ALICE into the product

development lifecycle. In another work (Gärtner and Göhlich 2024b), we have detailed a

comprehensive overview for embedding ALICE into product development processes, offering

developers and product owners a practical tool for identifying and resolving contradictions in

the development phase. This integration streamlines the requirements engineering process

and significantly reduces the risk of costly reworks and project delays associated with

unresolved contradictions. Future work will continue to refine these implementation strategies,

further elucidating how industry professionals can seamlessly adopt ALICE to bolster the

development of error-free, high-quality products.

In conclusion, advanced AI systems can perform sophisticated reasoning tasks by combining

formal logic and LLMs. ALICE provides a promising approach to detecting contradictions in

natural language text requirements specifications.


5.9 Appendix

Question 1

The first question determines whether the variables of effect 1 and effect 2 are identical or if

one set of variables is a subset of the other. This determines if two statements can exhibit an

LNC contradiction; see Section 5.3.1. A contradiction can only occur when the variables are

the same or when one set of variables is a part of the other, such as comparing:

1. table

2. table leg

First, we check if both variable 1 and variable 2 are considered as out-of-vocabulary labels,

identifiers, naming conventions, or newly coined terms. If they are, we verify if the characters

match exactly using formal logic. If there is a 100% match, the answer is 'yes, they are

identical.' If the match is not exact, the variables may still have the same meaning, like 'car'

and 'automobile.' In this case, we query GPT-3 to assess whether the variables are

synonymous, as shown in Figure 5-8.

Figure 5-8: Prompt for the first question

If the LLM confirms the variables are synonymous, the answer to the first question is saved as 'yes, they are identical.' Explicitly asking the model to answer with 'Yes or No' constrains the response so that it contains 'Yes' or 'No.'
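The two-stage check for the first question can be sketched as follows. This is a minimal illustration, not the original implementation: the out-of-vocabulary test is omitted, the exact character match is simply tried first for every pair, and `fake_llm` is a hypothetical stand-in for a real model call. The prompt text is taken from Figure 5-8.

```python
from typing import Callable

# Prompt wording as given in Figure 5-8.
PROMPT_Q1 = 'are the words "{}" similar to words "{}"? Yes or No?'


def parse_yes_no(response: str) -> bool:
    """Interpret a response such as '\\n\\nYes' as True, anything else as False."""
    return response.strip().lower().startswith("yes")


def variables_identical(variable1: str, variable2: str,
                        ask_llm: Callable[[str], str]) -> bool:
    """Stage 1: exact character match. Stage 2: ask the LLM about synonymy."""
    if variable1 == variable2:
        return True
    return parse_yes_no(ask_llm(PROMPT_Q1.format(variable1, variable2)))


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; it only knows one synonym pair."""
    is_pair = '"car"' in prompt and '"automobile"' in prompt
    return "\n\nYes" if is_pair else "\n\nNo"


print(variables_identical("Com_Batt", "Com_Batt", fake_llm))
print(variables_identical("car", "automobile", fake_llm))
```

Passing the model call in as a function keeps the deterministic part testable without network access.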

Question 2

As explained below, the second question was not processed in this implementation. It

examines the potential inclusivity between the actions, determining whether one condition

subsumes the other, with one effect serving as the superaltern and the other as the subaltern.

For example, in the sentence 'It must be [action],' the actions

1. between 15 m and 30 m

2. between 20 m and 22 m

demonstrate this relationship, as the range of the second action is wholly encompassed within

the range of the first action.

This step is only relevant when it is necessary to identify a precise contradiction type (i.e.,

subalterns). If identifying the general contradiction type suffices, this question can be omitted.

GPT-3 could not answer this question reliably because of its limited mathematical and logical abilities (openai 2023b; Friedman 08.2020). In our case, all contradictions belonging to the subaltern category are therefore automatically designated as contrary-type contradictions.
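For numeric ranges of the form shown above, the inclusion relation could in principle be checked with formal logic instead of an LLM. The following sketch is an assumption of how such a check might look; it is not part of the implementation described here, which did not process this question. The pattern and the function names are hypothetical and only cover phrases of the form 'between A unit and B unit'.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for phrases such as "between 15 m and 30 m".
RANGE_PATTERN = re.compile(
    r"between\s+(-?\d+(?:\.\d+)?)\s*(\w*)\s+and\s+(-?\d+(?:\.\d+)?)\s*(\w*)"
)


def parse_range(action: str) -> Optional[Tuple[float, float, str]]:
    """Return (low, high, unit), or None if no single-unit range is found."""
    match = RANGE_PATTERN.search(action.lower())
    if match is None:
        return None
    first, unit_first, second, unit_second = match.groups()
    if unit_first != unit_second:
        return None  # mixed units are out of scope for this sketch
    low, high = sorted((float(first), float(second)))
    return low, high, unit_first


def is_subaltern(action1: str, action2: str) -> bool:
    """True if one range is wholly contained in the other (same unit)."""
    range1, range2 = parse_range(action1), parse_range(action2)
    if range1 is None or range2 is None or range1[2] != range2[2]:
        return False
    (low1, high1, _), (low2, high2, _) = range1, range2
    return (low1 <= low2 and high2 <= high1) or (low2 <= low1 and high1 <= high2)


print(is_subaltern("between 15 m and 30 m", "between 20 m and 22 m"))
```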

Question 3

The third question identifies mutual exclusivity between two actions, detecting contradictory

opposing actions such as

1. the table must be blue

2. the table must not be blue

When one statement is true, the other is false, and vice versa, indicating mutually exclusive

actions. To answer this question, we explored a solution with formal logic and a solution with

GPT-3.

Formal logic approach: First, we identify negation terms like 'not,' 'neither,' 'not equal,' 'none,' 'never,' 'only,' 'nowhere,' or 'without,' which hint at a potential contradiction. Next, we compare the actions, disregarding negation words, e.g.:

Content of Figure 5-8:
Prompt: "are the words \"{}\" similar to words \"{}\"? Yes or No?".format(variable1, variable2)
Exemplary Response: '\n\nYes'


1. must be blue

2. must not be blue

and evaluate their similarity. As in question 1, this is done by comparing the strings. We

achieved the best results with a similarity factor of 90%.
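The formal logic variant can be sketched as follows. The text does not name the string similarity measure, so the use of `difflib.SequenceMatcher` is an assumption, as is the rule that exactly one of the two actions must contain a negation term. The term list follows the one given above; 'not equal' is covered by 'not' in this sketch.

```python
import re
from difflib import SequenceMatcher
from typing import Tuple

# Negation terms from the text; "not equal" is matched through "not".
NEGATION_TERMS = ("not", "neither", "none", "never", "only", "nowhere", "without")
_NEGATION_RE = re.compile(r"\b(?:" + "|".join(NEGATION_TERMS) + r")\b", re.IGNORECASE)


def strip_negations(action: str) -> Tuple[str, bool]:
    """Return the action without negation terms and whether any was found."""
    stripped, count = _NEGATION_RE.subn(" ", action)
    return " ".join(stripped.lower().split()), count > 0


def mutually_exclusive(action1: str, action2: str, threshold: float = 0.90) -> bool:
    """Assumed rule: one side is negated and the remaining wording is near-identical."""
    core1, negated1 = strip_negations(action1)
    core2, negated2 = strip_negations(action2)
    similarity = SequenceMatcher(None, core1, core2).ratio()
    return negated1 != negated2 and similarity >= threshold


print(mutually_exclusive("must be blue", "must not be blue"))
```

The default threshold of 0.90 mirrors the similarity factor reported above; other datasets may need a different value.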

GPT-3 approach: The GPT-3 prompt is shown in Figure 5-9. This solution yielded better results

than formal logic. Therefore, it was used for the evaluation in Section 5.6. If a similarity is

observed, the answer to the third question is saved as 'yes.'

Figure 5-9: Prompt for the third question

Question 4

The fourth question assesses mutual inconsistency between two actions, identifying contrary

(not contradictory, see Section 5.3.1) opposing actions like

1. the table is red

2. the table is blue

To answer this question, we evaluate whether the actions have the potential to contradict each

other using GPT-3. If the model output yields a 'yes' as response, the answer to the fourth

question is saved as 'yes', as shown in Figure 5-10.

Figure 5-10: Prompt for the fourth question

Question 5

The fifth question determines the existence of a condition, such as

1. If I am in France, …

to eliminate potential Simplex-contradictions. We apply our previous model (Gärtner et al.

2023) using NLP methods based on grammatical rules, as explained in Section 5.5.1.

Although LLMs might seem suitable for resolving this query, their reliability is questionable, as

shown in Figure 5-11. Obviously, the sentence does not contain a condition; in fact, it is a

simple statement. Consequently, we rely on the grammatical model for greater assurance and

100% replicability.

Figure 5-11: Condition Detection with GPT-3
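The replicability argument can be illustrated with a deterministic rule. The sketch below is a deliberately simplified keyword heuristic and not the grammatical model from Gärtner et al. (2023), which works on parsed sentences; the marker list is a hypothetical example. Unlike the LLM response shown in Figure 5-11, the same input always yields the same answer.

```python
import re

# Hypothetical list of words that commonly introduce a condition.
CONDITION_MARKERS = ("if", "when", "while", "unless", "as soon as", "as long as")
_MARKER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(marker) for marker in CONDITION_MARKERS) + r")\b",
    re.IGNORECASE,
)


def contains_condition(sentence: str) -> bool:
    """Return True if the sentence contains one of the condition markers."""
    return _MARKER_RE.search(sentence) is not None


print(contains_condition("The car must stand."))
print(contains_condition("If I am in France, the table must be blue."))
```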

Content of Figure 5-9:
Prompt: "Are the following sentences similar to each other?:\n- {}\n- {}?".format(action_tokens1, action_tokens2)
Exemplary Response: '\n\nYes, they'

Content of Figure 5-10:
Prompt: "could the following statements potentially contradict each other?\n- \"it must {}\"\n- \"it must {}\"\n\nAnswer 1: ".format(action1, action2)
Exemplary Response: 'Yes, the two statements could potentially contradict each other.'

Content of Figure 5-11:
Prompt: “Does the following sentence contain a condition: The car must stand.”
Response: 'Yes, the sentence "The car must stand" contains a condition. This condition is that the car must remain in a standing position.'


Question 6

The sixth question identifies if the first condition is equivalent to the second condition to detect

Idem-contradictions that require the same condition, such as

1. If I am in France, (the table must be blue)

2. If I am in France, (the table must be red)

This question is answered with formal logic using string comparisons. If the similarity exceeds a predefined threshold, the answer is 'yes.' We achieved the best results with a threshold of 95%.

Question 7

The seventh question determines if the first and second conditions can co-occur. The objective

of this question is to assess whether two statements have the potential to contradict each other

based on the theoretical possibility of both conditions happening at the same time. For

example, the statements

1. If I am in France, (the table must be blue)

2. If I am in Germany, (the table must be red)

cannot co-occur and, therefore, are not contradicting regarding RE. However, the statements

1. If I am in France, (the table must be blue)

2. If it is hot outside, (the table must be red)

could theoretically contradict each other as they can occur at the same time.

To answer this question, we use GPT-3, as shown in Figure 5-12. If the model output yields a

'yes' as response, the answer to the seventh question is saved as 'yes.'

Figure 5-12: Prompt for the seventh question

5.10 Author Contributions

Conceptualization, A.E.G., and D.G.; methodology, A.E.G.; implementation, A.E.G.; validation,

A.E.G.; resources, A.E.G.; data curation, A.E.G.; writing – original draft preparation, A.E.G.;

writing – review and editing, D.G.; supervision, D.G. All authors have read and agreed to the

published version of the manuscript.

5.11 Data Availability

Dataset 1 is available. Datasets 2 and 3 are partially available, as described in Section 5.6.

The data is available in the attachments.

5.12 Acknowledgments

We thank IAV GmbH for providing us with the requirements specifications for electric buses.

5.13 Competing interests

The authors declare no conflict of interest.

Content of Figure 5-12:
Prompt: "can the following states occur at the same time? Yes or no?:\n1. {} \n2. {}".format(condition1, condition2)
Exemplary Response: '\n\nYes'

Discussion 61

6 Discussion

This dissertation primarily centers on automating an aspect of requirements engineering (RE),

specifically detecting contradictions in requirements. To assess the validity of the findings, this

section compares them to the outcomes of similar studies. Following this, Section 6.2 delves

into the potential development of a software tool named ALICE, which stands for Automated

Logic for the Identification of Contradictions in Engineering. It is a hypothetical software tool

that integrates the insights and discoveries presented in this dissertation, referred to as the

"hybrid model" in the third paper. Two fundamental aspects are explored: firstly, the viability of

developing such a tool, and secondly, its potential utility in the development processes. Finally,

the discussion addresses the method's limitations and reflects on the methodology employed.

6.1 Comparison of the Results with the Literature

Contradiction extraction has been widely utilized in various domains, particularly in processing

news and rumors texts (Lendvai and Reichel; Sepúlveda-Torres et al. 2021a; Marques 2015).

However, directly applying these existing models to the domain of requirements engineering

presents challenges due to the generic type of negation commonly found in standard text. The

language used to articulate requirements in the RE field often exhibits a distinct characteristic

– abounding in mechanical semantics and symbols, such as signal names – thus introducing

an added layer of complexity. Furthermore, the sentence structures employed in requirements

documentation can be non-intuitive, further complicating the adaptation of existing models.

Consequently, adapting previous models directly to the RE domain is challenging.

Remarkably, the specific area of analyzing contradictions within requirements engineering has

received limited research attention despite the widespread use of contradiction extraction

techniques in other domains.

Before comparing the literature, it is essential to provide an overview of this dissertation's

research methodology and procedures (see Figure 6-1).

Figure 6-1: Procedure, based on Figure 2-2: DIKW and Contradiction Detection

1. A taxonomy is a systematic and hierarchical classification system that organizes and

categorizes various elements or concepts based on their inherent characteristics and

relationships. This dissertation's taxonomy serves as a structured framework for

classifying different types of contradictions. This hierarchical arrangement allows for a

more comprehensive and structured approach to detecting contradictions. In this section, the taxonomy is referred to as 'classification' so that it can be compared with existing literature.

2. Information represents data. It is often unprocessed and lacks context. Information

becomes meaningful when it is organized, structured, and contextualized. For example,

a list of numbers is Information. When transforming raw data into meaningful

Information, a preprocessing step is essential. Therefore, this stage is referred to as 'preprocessing' in the following sections so that it can be compared with existing literature.

3. Knowledge is derived from information through organizing, analyzing, and

understanding data. It involves the application of reasoning, experience, and expertise


to interpret Information. Since interpretation plays a pivotal role, acquiring Knowledge

will be denoted as 'Interpretation Part 1' in the following sections, which will be explored

and discussed.

4. Wisdom is the highest level of the DIKW pyramid and involves the application of

Knowledge in a way that demonstrates judgment, discernment, and ethical

considerations. It goes beyond knowing facts and includes the ability to make wise and

principled decisions. Wisdom is often built upon a foundation of Knowledge and

experience. As interpretation remains crucial in this context, acquiring Wisdom will be

referred to as 'Interpretation Part 2' in the following sections, which will be examined

and discussed further.

The main technical distinction between this dissertation and most existing methods is that the approach does not rely on training. Instead, it adopts a general methodology that is not tailored exclusively to a specific objective. The approach combines symbolic AI and large language models (LLMs). Symbolic AI draws upon principles of logic, grammar, mathematics, and similarity measurements. LLMs, by contrast, do undergo training, but this is a one-time process, and they are designed for general-purpose tasks rather than specifically for detecting contradictions. For this research, training a dedicated machine learning model is impractical because no labeled training data on conflicting requirements is available.

However, there are rule-based methods documented in the literature that do not require

additional training. While these methods are often tailored to specific datasets, there are

approaches comparable to the one presented in this dissertation, which are dataset-agnostic

but focused on a specific topic. It is most reasonable to consider the medical field as a detailed

point of comparison, as medical texts are also scientific and technical, unlike, for example,

newspaper articles.

The following three studies will be discussed and evaluated as their methodologies align with

this dissertation’s approach, presented in Figure 6-1:

• Preum et al. (2017) present Preclude, a semantic rule-based system that identifies

conflicts in wellness advice obtained from online health forums. The system utilizes a

polarity lexicon created from verbs in the training set and their synonyms from WordNet.

WordNet is a lexical database and semantic network that organizes words, including

nouns, verbs, adjectives, and adverbs, into sets of synonyms and interlinks them

through conceptual-semantic and lexical relations. This lexicon enables labeling

actions mentioned in the text as positive or negative, facilitating the detection of

conflicting advice.

• Tawfik and Spruit (2018) proposed an automated two-phase contradiction detection

model for medical literature. The model integrates semantic properties and a Learning-

to-Rank framework to identify key findings accurately. It incorporates negation,

antonyms, and similarity measures to detect contradictions.

• Sarafraz (2011) introduced a method for extracting conflicting molecular events using

lexical, syntactic, and semantic features. The study explores both rule-based and

machine-learning approaches.


6.1.1 Taxonomy

Preum et al. propose the categorization shown in Table 6-1:

Table 6-1: Possible cases of conflict (Preum et al. 2017)

| Cases | Advice 1 | Advice 2 |
|---|---|---|
| Opposite polarity (actions) | Eat citrus fruits and green leafy vegetables as they are rich in Vitamin | Be careful about green leafy vegetables if you are on Coumadin or ACE Inhibitors. |
| Opposite polarity (effects) | Pate made from meats may carry the listeria bacteria and cause listeriosis. Avoid eating it while pregnant. | Consume red meat at least two to three times a week to fight anemia. |
| Temporal | Do stretching exercises when you wake up. | Avoid stretching or similar exercises after the end of week 12 of your pregnancy. |
| Conditional | Alcohol may severely affect your baby’s development. Avoid alcohol if pregnant or trying to conceive. | Small amounts of alcohol increase the body's metabolic rate, causing more calories to be burned. |
| Sub-typical | Eat calcium-rich foods like milk, cheese and green vegetables. | Use skimmed milk instead of whole milk as dairy products often cause bloating and gas. |
| Quantitative | Limit your caffeine intake to less than 200 milligrams per day during pregnancy. | Up to 400 milligrams (mg) of caffeine a day appears to be safe for most healthy adults. |
| Cumulative effect | Run for at least 30 minutes a day. | Take Salmeterol 1 inhalation (50 mcg) twice daily. |

What Preum et al. call ‘Opposite polarity (effects)’ corresponds to what this dissertation calls ‘Contradictories.’ Their other cases, apart from ‘Conditional,’ correspond to what this dissertation calls ‘Contraries.’ The ‘Conditional’ case of Preum et al., however, seems to conflate the lexical definitions of conditions and causes. These are not the same, as explained in the second paper in Section 4.3 (Terminology):

‘The strike of a match is a cause of the match lighting. The presence of oxygen is a condition for the match lighting (Broadbent 2008). Confusion often arises, as both conditional and causal statements can be introduced by If …, then: If I strike a match, then the match lights (causal statement), and If oxygen is present, then the match can light (conditional statement).’

Another distinction is that in this dissertation, contradicting requirements with a condition have

the prefix ‘Alius’ as seen in Table 6-2. This opens the possibility of contradictory and

simultaneously conditional contradictions, i.e., ‘Alius Contradictory’. In Preum et al.’s

taxonomy, this is not clearly defined. For example, the following sentences would be opposite

polarities as well as conditional conflicts: ‘If you are in hospital, avoid eating meat’ and ‘If you

are under 30, consume red meat at least two to three times a week’.


Table 6-2: Possible cases of contradictions.

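| Contradictions | Requirement 1 | Requirement 2 |
|---|---|---|
| Simplex Contradictory | $x \overset{!}{=} k$ | $x \overset{!}{=} \neg k$ |
| Idem Contradictory | $A \Rightarrow x \overset{!}{=} k$ | $A \Rightarrow x \overset{!}{=} \neg k$ |
| Alius Contradictory | $A \Rightarrow x \overset{!}{=} k$ | $B \Rightarrow x \overset{!}{=} \neg k$ |
| Simplex Contrary | $x \overset{!}{=} c$ | $x \overset{!}{=} k$ |
| Idem Contrary | $A \Rightarrow x \overset{!}{=} c$ | $A \Rightarrow x \overset{!}{=} k$ |
| Alius Contrary | $A \Rightarrow x \overset{!}{=} c$ | $B \Rightarrow x \overset{!}{=} k$ |
| Simplex Subaltern | $x \overset{!}{<} c + k$ | $x \overset{!}{<} c$ |
| Idem Subaltern | $A \Rightarrow x \overset{!}{<} c + k$ | $A \Rightarrow x \overset{!}{<} c$ |
| Alius Subaltern | $A \Rightarrow x \overset{!}{<} c + k$ | $B \Rightarrow x \overset{!}{<} c$ |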

Legend: Contradictory opposites are mutually exhaustive and mutually inconsistent. Contrary opposites are also mutually inconsistent but not

exhaustive. The statement ‘some people are sick’ is the subaltern of ‘everybody is sick’ (superaltern). Simplex (lat. = simple): without conditions.

Idem (lat. = same): with same conditions. Alius (lat. = different): with different conditions.
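The prefixes and types of Table 6-2 can be read as the outcome of a small decision procedure over the answers to the questions from the appendix of the third paper. The sketch below is one possible reading and rests on assumptions: the order of the checks and the handling of conditions that cannot co-occur are inferred from the definitions above, and the class and field names are hypothetical. Subalterns are not distinguished, in line with their treatment as contraries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answers:
    same_variable: bool          # Question 1
    mutually_exclusive: bool     # Question 3
    mutually_inconsistent: bool  # Question 4
    has_condition: bool          # Question 5
    same_condition: bool         # Question 6
    can_co_occur: bool           # Question 7


def classify(answers: Answers) -> Optional[str]:
    """Map the answers to a label such as 'Idem Contradictory', or None."""
    if not answers.same_variable:
        return None
    if answers.mutually_exclusive:
        kind = "Contradictory"
    elif answers.mutually_inconsistent:
        kind = "Contrary"
    else:
        return None
    if not answers.has_condition:
        prefix = "Simplex"
    elif answers.same_condition:
        prefix = "Idem"
    elif answers.can_co_occur:
        prefix = "Alius"
    else:
        return None  # different conditions that cannot hold at the same time
    return f"{prefix} {kind}"


print(classify(Answers(True, True, False, True, False, True)))
```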

Sarafraz does not define an explicit taxonomy but assigns each contradiction to one of nine predefined medical topics. Tawfik and Spruit categorize relations as either entailment or contradiction.

6.1.2 Preprocessing and Part-of-Speech Tagging

Preprocessing refers to cleaning and transforming raw data to prepare it for further analysis. Preum et al., Sarafraz, and Tawfik and Spruit employed traditional techniques as a preliminary step in their research.

Following the preprocessing stage, they proceeded with feature extraction and part of speech

tagging. Feature extraction involves extracting meaningful and relevant information from the

preprocessed data to capture syntactic and semantic information. Part of speech tagging is a

specific technique within feature extraction that assigns a part of speech (e.g., noun, verb,

adjective) to each word in a text.

Similarly, this dissertation incorporates preprocessing and feature extraction methods,

indicating that the researchers followed similar steps to prepare their data.

6.1.3 Interpretation Part 1

Preum et al. categorize the sentences as ‘Action Clauses,’ ‘Conditional Clauses,’ ‘Temporal

Clauses,’ and ‘Cause / Effect Clauses.’ They can then be further divided into subcategories

called tokens, as depicted in Figure 6-2(1). Thus, a clause can be expressed as a tuple of

multiple semantic tokens. The detection process of Preum et al. also relies on dependency

relationships, similar to this dissertation’s approach.


Figure 6-2: Different semantic decompositions: (1) - Preum et al. (2017), (2) - dissertation at hand, (3) - Sarafraz

(2011)

In comparison, this dissertation contains similar, partly identical elements. For example, a sentence from Preum et al. could be decomposed using the semantic objects of Conditional Clauses, Effects, Actions, and Objects, which is similar to the concept of this dissertation. A typical semantic decomposition from this dissertation is illustrated in Figure 6-2(2).

Sarafraz defines an event representational model, which follows a different logic. Here, the different parts of the clause represent certain events, as seen in Figure 6-2(3). Sarafraz combines a machine learning model with rule-based methods to extract a set number of events.

Tawfik and Spruit do not break clauses down into constituents or other categories. Instead, they proceed directly to Interpretation Part 2, the contradiction detection.

6.1.4 Interpretation Part 2

The following critical aspects can be identified for the actual contradiction detection: negation

detection, antonym detection, and implicitly negated formulations. This dissertation will now be

compared to the literature based on these aspects:


1. Negation:

Tawfik and Sarafraz use a model called NegEx (Chapman et al. 2001). NegEx locates

trigger terms indicating a clinical condition is negated or possible and determines which

text falls within the scope of the trigger terms. This is similar to the method used by

Preum et al. and in this dissertation. Here, a predefined list of trigger words is also

employed.

Furthermore, Sarafraz employed ML classifiers (support vector machines) that classify

a sentence as ‘affirmative’ or ‘negative.’ This led to further improvement in the results

of the rule-based method.

2. Antonymy:

Due to the low number of described antonyms in lexicons, Tawfik decided to utilize

additional lexical sources such as WordNet. Preum et al. also use WordNet to detect

opposite words. Sarafraz, however, does not seem to consider antonyms.

In this dissertation, antonym detection was not performed explicitly. Because large language models (LLMs) are applied directly, antonyms are taken into account implicitly, since LLMs can draw such inferences. Furthermore, LLMs do not require maintenance and offer a one-size-fits-all solution, as seen in the next paragraph. The drawback of LLMs is the low certainty regarding the correctness of their results, whereas the relationships and associations in lexicons are transparent and verifiable.

3. Implicitly negated formulations:

Implicitly negated formulations were not considered in the literature. Implicit negations

are not to be confused with implicit contradictions. Instead, they refer to statements

such as ‘the signal x should be equal to 1’ and ‘the signal x should be equal to 2.’ The

contradiction arises not from negations or antonyms but from different, conflicting

instructions on handling the signal x.

6.1.5 Conclusion

The comparison of the literature’s methods and this dissertation’s approach reveals a notable

similarity in the basic approach, lending credibility to all the methods. This congruence implies

that the methodology adopted in this work aligns with established practices and demonstrates

its soundness in addressing the research objectives.

It is essential to evaluate the observed differences in the literature with a value-neutral

perspective, recognizing that they all present advantages and disadvantages when compared

to our method. These differences may originate from variations in data sources, domain-

specific considerations, or underlying assumptions. It is worth noting that the implementations

discussed in the literature are tailored to the medical field, limiting their direct applicability to

other domains. Therefore, while the findings and techniques presented in the reviewed

literature contribute to the broader understanding of contradiction detection, they must be

adapted and contextualized for relevance in non-medical contexts.

Examining these approaches serves as a valuable exercise, providing insights into alternative

perspectives and potential areas for improvement. It deepens our understanding of the

challenges and nuances in handling contradicting statements, offering potential avenues for

refining and advancing the current methodology.

Discussion 67

6.2 From Method to Mechanism: Envisioning ALICE as a Tool

At the beginning of the discussion, a potential software tool named ALICE was introduced,

incorporating the insights of the hybrid model developed in the third paper. This section looks

closer at the operational workflow of the proof-of-concept and its usability.

6.2.1 Operational Workflow

The swimlane diagram in Figure 6-3 visualizes the proof-of-concept workflow. The aim is to

identify conditions and contradictions within requirements.

Figure 6-3: ALICE Workflow: Identifying conditions and contradictions in requirements engineering

The process starts in the Main lane by selecting the dataset for analysis and configuring the

global parameters, class definitions, and any additional parameters the script will require.

Moving to the Preprocessing lane, data reduction is the first step: the data are cleaned by

replacing special characters with corresponding expressions or removing them, ensuring that

the parsers can effectively process the text. Project-specific preprocessing may follow, dealing

with unique terms or structures within the specific dataset.

The Text Parsing lane is where the cleaned text is submitted to parsers, such as Stanford's

Stanza parser, to generate dependency trees and perform part-of-speech tagging. These

parsers require well-structured sentences to interpret and tag the text accurately.

Next, the Action & Variable lane is where the structure of sentences is examined to identify

main verbs or roots. If Stanza fails to find the root, the workflow switches to Stanford’s CoreNLP

parser for an alternate parsing strategy.

Upon successfully identifying the root, the process seeks out sub-clauses and adverbial

clauses that indicate conditions within the sentences in the Condition Detection lane.

The second paper (Section 4) only aimed at finding conditions. The process may conclude at

this juncture unless contradiction detection is also desired. In the latter case, the identified

conditions are then examined in the Action/Variable lane to extract actions and variables

within both conditions and effects.

Finally, the Contradiction Detection lane involves evaluating the constituents – conditions,

effects, actions, and variables – to detect contradictions. This evaluation is the culmination of

the process, where the analysis is completed, and results indicating the presence or absence

of contradictions are produced.

This workflow is marked by its modularity, with individual stages operating independently and

synergistically. Moreover, an iterative parsing methodology is employed, wherein a series of

parsers are utilized in sequence. This redundancy is a strategic measure to bolster the

robustness of the text analysis, mitigating any single parser's limitations.
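
The sequential use of parsers can be pictured as a simple fallback chain. The sketch below is a hedged illustration, not ALICE's actual code: each parser is modelled as a hypothetical callable that returns the root word of a sentence or None, and the two stand-in parsers merely simulate a failing first parser and a successful second one. Real parsers such as Stanza or CoreNLP would have to be wrapped in callables of this shape.

```python
from typing import Callable, Iterable, Optional

# Assumption: a parser is any callable that returns the root word of a
# sentence, or None if it cannot determine one.
RootFinder = Callable[[str], Optional[str]]


def find_root(sentence: str, parsers: Iterable[RootFinder]) -> Optional[str]:
    """Try each parser in order and return the first root that is found."""
    for parser in parsers:
        try:
            root = parser(sentence)
        except Exception:
            # A crashing parser is treated like a parser that found nothing.
            root = None
        if root:
            return root
    return None


# Stand-in parsers for demonstration only (hypothetical).
def primary_parser(sentence: str) -> Optional[str]:
    return None  # simulates a parser that fails to find the root


def fallback_parser(sentence: str) -> Optional[str]:
    return "limit"  # simulates a parser that succeeds


root = find_root(
    "The controller must limit the speed decay.",
    [primary_parser, fallback_parser],
)
```

Because the chain only depends on the callable interface, further parsers can be appended without changing the surrounding stages.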

6.2.2 Usability and Integration

The following section attempts to answer three questions that should provide information about

the implementation of ALICE: (1) how the requirements for the algorithm are provided, (2)

computing times, and (3) integration possibilities in classic requirements management tools.

Question 1: How must the requirements be passed to ALICE?

Transferring requirements to ALICE takes three steps, as shown in Figure 6-4.

Figure 6-4: ALICE’s structure (Gärtner and Göhlich 2023)

Step 1: Preprocessing

Preprocessing uses established methods from Natural Language Processing (NLP). Images

and non-textual content should be deleted or replaced with appropriate textual descriptions.

Preprocessing also includes identifying and cleaning up special symbols and characters that

cannot be interpreted. Although most Large Language Models (LLMs) should recognize

symbols such as equal signs, it is recommended to replace these symbols with plain textual

phrases, for example, "=" with "is equal to." This minimizes possible losses of unknown

characters when transferring the requests and ensures compatibility with classic text parsers

(see next step), which usually handle only a basic character set, i.e., essentially the alphabet.
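
The symbol replacement described in this step could look like the following minimal sketch. Only the mapping from "=" to "is equal to" is taken from the text; the remaining entries of the table and the function name are hypothetical and would need to be adapted to the project at hand.

```python
import re

# Only "=" -> "is equal to" comes from the text; the other entries are
# illustrative assumptions.
SYMBOL_PHRASES = {
    ">=": "is greater than or equal to",
    "<=": "is less than or equal to",
    "=": "is equal to",
    ">": "is greater than",
    "<": "is less than",
}


def replace_symbols(text: str) -> str:
    """Replace comparison symbols with plain textual phrases."""
    # Longer symbols first, so that ">=" is not split into ">" and "=".
    symbols = sorted(SYMBOL_PHRASES, key=len, reverse=True)
    pattern = "|".join(re.escape(symbol) for symbol in symbols)
    replaced = re.sub(
        rf"\s*({pattern})\s*",
        lambda match: f" {SYMBOL_PHRASES[match.group(1)]} ",
        text,
    )
    # Collapse the whitespace introduced by the replacement.
    return re.sub(r"\s+", " ", replaced).strip()


cleaned = replace_symbols("The signal x = 1")
```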

Step 2: Constituent analysis

The second step is to analyze and prepare requirements in text form. This involves checking

whether the requirements only have an effect or consist of a condition and an effect. After this

identification, the action and the variable are identified for the effect and the condition (if any).

This is done, among other things, with the help of text parsers. The requirements were

extracted from a *.txt document in the second paper and analyzed line by line. In theory,

however, the analysis is independent of the origin of the requirements, whether from a text

document, an Excel spreadsheet, or other sources, as long as ALICE can recognize where

one requirement ends and the next begins.
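
A possible in-memory representation of the result of this step is sketched below. The class and field names are assumptions made for illustration; the text only states that each requirement has an effect, optionally a condition, and that an action and a variable are identified for each. The loader mirrors the line-by-line reading of a text document and works with any iterable of lines, including an open file.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Clause:
    """Variable and action of a condition or an effect (hypothetical names)."""
    variable: str
    action: str


@dataclass
class Requirement:
    text: str
    effect: Optional[Clause] = None     # filled in by the constituent analysis
    condition: Optional[Clause] = None  # stays None for unconditional requirements


def load_requirements(lines: Iterable[str]) -> List[Requirement]:
    """Treat every non-empty line as one requirement."""
    return [Requirement(text=line.strip()) for line in lines if line.strip()]


requirements = load_requirements([
    "If the threshold is reached, the controller must limit the speed decay.\n",
    "\n",
    "The function must prevent manipulation of Com_Batt.\n",
])
```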

Step 3: Contradiction detection

In the third step, the actual contradiction detection takes place, in which all requirements

are compared with respect to their properties, as determined in step 2. This is done according to

the decision tree presented in the third paper. In the validation of the paper, this procedure

was implemented using a combination of formal logic and LLMs. All requirements were stored

together with their properties in a list-like structure. All unique combinations of requirements

were then generated and analyzed individually according to the decision tree. The results of

these comparisons can be saved, for example, in an Excel spreadsheet. This procedure can

lead to many comparisons if there are many requirements, with each pair of requirements

being checked. For example, if there are 20 requirements, 20(20 − 1)/2 = 190 combinations

must be analyzed one by one.
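
The generation of all unique pairs can be sketched in a few lines of Python using the standard library. The function name and the placeholder requirement labels are assumptions; the example merely confirms the count for 20 requirements.

```python
from itertools import combinations
from math import comb


def requirement_pairs(requirements):
    """Return every unordered pair of requirements exactly once."""
    return list(combinations(requirements, 2))


# Placeholder labels R1 ... R20 stand in for real requirements.
labels = [f"R{i}" for i in range(1, 21)]
pairs = requirement_pairs(labels)
print(len(pairs))  # 190, i.e. 20 * 19 / 2
```

Since the number of pairs grows quadratically with the number of requirements, doubling the specification size roughly quadruples the number of comparisons.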

Question 2: What is the average computation time?

The algorithm is implemented in Python, focusing on the results' correctness rather than

maximizing computational speed. During the analysis, unprocessable characters may occur

(see question 1, step 1), causing the program to abort and increasing the overall computing time.

Also, the OpenAI servers may be responsible for a program abort since a pipeline to OpenAI

must be established, which may be overloaded depending on the general request volume.

Without aborts, the computing time varies depending on the structure of the requests. In the

case of a simple request without conditions, the analysis is performed quickly. It is particularly

accelerated if the decision tree's first question already hints that the requirement pair is not

conflicting.

A complete analysis of a pair of requirements can sometimes take more than 12 seconds,

especially if all seven questions have to be calculated. However, this case does not occur very

often in practice. Usually, the analysis time ranges between 3 and 5 seconds.

For larger data sets of, for example, 4000 combinations, corresponding to about 90

requirements, the analysis could theoretically take up to 13 hours. However, a more realistic

estimate is between 3 and 5 hours, as most pairs do not require further analysis beyond the first

question of the decision tree, as was the case for the specification sheets used in the third

paper.
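
The estimate above follows from multiplying the number of pairs by the time per pair. The short sketch below reproduces this arithmetic; the function name is an assumption, and the per-pair durations of 12 seconds (worst case) and 3 to 5 seconds (typical) are the figures quoted in the text.

```python
from math import comb


def estimated_hours(n_requirements: int, seconds_per_pair: float) -> float:
    """Sequential analysis time in hours for all unique requirement pairs."""
    return comb(n_requirements, 2) * seconds_per_pair / 3600


n_pairs = comb(90, 2)                  # number of pairs for 90 requirements
worst_case = estimated_hours(90, 12)   # every pair needs the full analysis
typical_low = estimated_hours(90, 3)
typical_high = estimated_hours(90, 5)
print(n_pairs, round(worst_case, 1), round(typical_low, 1), round(typical_high, 1))
```

For 90 requirements this yields 4005 pairs, about 13 hours in the worst case and roughly 3 to 5.5 hours at the typical per-pair durations.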

The research was conducted using conventional, single-threaded hardware configurations.

However, ALICE's calculations are well-suited for optimization, especially parallel processing.

Parallelization refers to breaking down a computational task into smaller, independent parts

that can be executed simultaneously on multiple processing units or CPU cores, thereby

speeding up the overall computation considerably.

As explained earlier in this section, each requirement pair is analyzed separately. The pairs

are individual tasks that do not need to be executed strictly one after the other. This suggests that

parallelization techniques hold substantial promise for reducing computational time.

The extent to which parallelization can speed up the entire process depends on several factors,

including the number of processing units available, the efficiency of the parallelization strategy,

and the specific computational tasks being performed. In ALICE's case, it would lead to

substantial speed improvements, potentially reducing analysis times from hours to minutes or

even seconds. However, the exact gains must be assessed through practical implementation

and testing.
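
Because the pairs are independent, distributing them over several workers is straightforward in principle. The following sketch uses a thread pool from the Python standard library, under the assumption that the per-pair analysis spends most of its time waiting for a remote service. The function check_pair is a hypothetical placeholder, not ALICE's decision tree, and the worker count is an arbitrary example value.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def check_pair(pair):
    """Placeholder for the per-pair analysis (hypothetical).

    Returns the pair together with a flag; here the flag is always False.
    """
    first, second = pair
    return first, second, False


def check_all_pairs(requirements, check=check_pair, max_workers=8):
    """Analyze all unique pairs concurrently; results keep the input order."""
    pairs = list(combinations(requirements, 2))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(check, pairs))


results = check_all_pairs(["R1", "R2", "R3", "R4"])
```

In practice, request limits of the remote service may cap the achievable speed-up, so the worker count would have to be tuned empirically.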

Question 3: How could an integration of ALICE into classic RM tools be achieved?

The answer to this question leads to two approaches:

The first approach is to develop a stand-alone application that provides interfaces to various

RM tools. This approach offers high flexibility, as ALICE would be a versatile solution for

projects with varying RM tools. For example, it could be used with DOORS and Codebeamer

without significantly customizing it.

However, there are some drawbacks to this approach. Developing interfaces to different RM

tools can be challenging, as each tool may have different APIs and data structures. Even if a

standard import/export format such as ReqIF is used, manual requirements transfer between

tools remains error-prone. Another disadvantage is certification. In highly regulated industries

such as automotive or aerospace, the certification of a tool such as ALICE is difficult:

companies usually only allow validated and thoroughly tested software products. Therefore,

independent, new market entrants might have difficulties due to high regulatory hurdles and

company guidelines.

The second approach is to develop ALICE as a plugin for existing RM tools. This option offers

the advantage of integrating it into certified RM tools, bypassing the certification challenge: if

a plugin is developed for an already certified tool, it typically does not require separate

certification.

Another advantage of this option is that integration with the RM tool allows for seamless work

with requirements, as users do not have to switch between different applications and manually

transfer requirements. Also, such a plugin can adopt the design and user interface of the RM

tool, which makes it more intuitive for users.

Overall, both approaches have pros and cons, and the choice between them depends on

several factors, including the specific requirements and challenges of the target project.

Careful consideration of these aspects is critical to choosing the best possible integration

strategy into RM tools.

6.3 Development Processes Integration

The following section reviews the product requirements specification process (PRS process)

and relevant standards and guidelines to discuss how ALICE could be integrated into

development processes.

6.3.1 Product Requirements Specification Process

In the industrial context, a well-established procedure for translating customer requirements

into technical product specifications has gained broad acceptance, known as product

requirements specification process (Göhlich and Fay 2021a); see Figure 6-5. In some projects,

development involves multiple parties, including clients and contractors. When clients place

orders, they typically provide requirements specifications to the contractor, which address the

project's general goal. The contractor then formulates its own specification, which describes

how the requirements of the client's specification are to be realized. Clients often compile their

requirements by drawing from various specifications from past projects (Bender and Gericke

2021). Consequently, requirements from different, partially incompatible projects may be

merged, resulting in contradictions. Therefore, it is essential to check for completeness and

consistency.

According to current practice, expertise with the product, the application area, and other

relevant framework conditions is required to work in this context. On the one hand, the client

could use ALICE to scan the initial specification before handing it over to the contractor to

improve the document's quality and reduce unnecessary confusion on the contractor's side.

On the other hand, the contractor could also use the software to get a quick overview of the

complexity of the specification, as more contradictions would hint at a more complex project.

Figure 6-5: Product Requirements Specification Process based on (Bender 2020)

However, it is important to note that fully identifying conflicting goals can be challenging at this

preliminary stage. In later product development phases described in the following section, the

development goals are elaborated in more detail, and contradictions typically become

transparent (Göhlich and Fay 2021b).

6.3.2 Process Models and Guidelines for Product Development

According to Bender and Gericke (2021), each development project is bound to three targets:

deadline targets, cost targets, and product-related targets. This dissertation only focuses on

product-related targets, which we will refer to exclusively in the following. However, it is

conceivable that future methods developed with the help of our findings will also be able to

evaluate cost- and schedule-related targets.

Various design process models exist to tackle these targets, aimed at supporting engineers in

planning, documenting, finding solutions, and decision-making within their work. Despite

variations in terminology and detail, these models share typical stages in the design process

(Gericke and Blessing 2012; Wynn and Clarkson 2018).

One crucial phase common to these models is the early development stage, where the design

task is clarified, creating a requirements list, sometimes referred to as requirements

specification (Eisenbart et al. 2011; Gericke and Blessing 2012). This list encompasses

essential functionality, influences, constraints, and dependencies derived from stakeholder

demands, market conditions, and other factors. While various methods and checklists aid in

identifying requirements, they often prioritize completeness over integrity, making it

challenging to assess consistency and conflicting design goals due to limited solution

principles or details (Göhlich et al. 2021).

Many design process models suggest that this task is complete once the initial requirements

list is formed, despite emphasizing the need for continuous revision and refinement (Göhlich

et al. 2021). Requirements are dynamic and evolve alongside understanding the design

problem, necessitating ongoing engagement with the requirements list throughout the design

process (Maher and Poon 1996; Gericke et al. 2013).

The diagram, as depicted in Figure 6-6, presents a structured yet adaptable approach to the

design process, informed by the revised VDI 2221 guideline (VDI 2221 Blatt 1:2019-11)

and integrated into the updated Pahl/Beitz method, as noted by Bender and Gericke (2021).

The process model is categorized into four phases: Task Clarification, Conceptual Design,

Embodiment Design, and Detail Design. Each phase serves as a fundamental milestone in the

design journey. However, while these phases remain constant, the intrinsic activities within

each phase demand a more dynamic approach. These activities are not set in stone; instead,

they necessitate individual assessment and potential adaptation based on the specific

circumstances and requirements of a given company or design team.

By allowing this kind of adaptability in the activities while maintaining the integrity of the

overarching phases, the model ensures that the essence of the design process remains intact.

Figure 6-6: Process model according to VDI 2221 (Göhlich et al. 2021)

In the Task Clarification phase, the starting point of the design process, the project's objectives

or problems are clearly outlined. However, the requirements are often not detailed or

thoroughly fleshed out during this early stage. They might be broad, general, or even just

described through use cases. This suggests that while there is a basic understanding of what

needs to be achieved, the specifics, nuances, or detailed criteria might not have been

determined yet, making an application of ALICE difficult.

The result of the phases of Conceptual Design and, to some extent, the Embodiment Design

can be summarized as the initial target system. It serves all stakeholders in all development

phases as a benchmark for assessing the project's success (Bender and Gericke 2021).

Consistency and conflicting design goals can be challenging to assess due to missing solution

principles or details at this stage (Göhlich et al. 2021). While a general contradiction tool would

be beneficial, ALICE primarily focuses on identifying contradictions within technical

specifications.

The Detailed Design phase elaborates on the product's technical specifications, materials,

components, and manufacturing processes (DIN ISO 9000:2015). Although our tool

cannot enhance drawings and schematics, it can uncover contradictions in the specifications,

potentially saving manual work. It may even detect contradictions in material and component

specifications, although this aspect was not examined in this dissertation.

This phase can be further deconstructed into distinct categories. This subdivision yields two

primary categories: Developing Requirements (also referred to as requirements engineering)

and Working with Requirements (also referred to as requirements management), each of

which can be further delineated into specific activities, as shown in Figure 6-7, according to

Bender and Gericke (2021).

Figure 6-7: Tasks when developing requirements (own illustration based on Bender and Gericke (2021))

This dissertation focuses on Requirements Engineering and its Elicitation, Specification,

Structuring, and Analysis categories. They represent the critical phases where ALICE is most

effectively applied to identify and resolve contradictions in requirements. While Requirements

Engineering forms the cornerstone for utilizing ALICE, the tool also significantly influences

Requirements Management. The integration of ALICE necessitates robust documentation to

ensure a documented state of requirements is available for analysis. Additionally, ALICE

profoundly impacts traceability by precisely tracking requirement origins and their relationships

throughout the project lifecycle, thus ensuring that any changes made due to contradiction

detection are accurately reflected and traceable. In terms of versioning, ALICE facilitates the

management of different versions of requirements documents, helping teams maintain a clear

record of how requirements evolve in response to identified contradictions. Lastly, ALICE

affects change management by enhancing the responsiveness to necessary adjustments in

requirements, enabling quicker and more informed decision-making. This ensures that all

modifications are thoroughly documented and integrated into the project workflow, aligning

with the best practices in Requirements Management and supporting continuous improvement

in project outcomes.

In the following sections, we will delve into the four constituent activities of Requirements

Engineering and their relevance to ALICE:

1. Elicitation: The tool could augment the requirement elicitation process by proactively

identifying contradictions and inconsistencies early in development. This proactive

detection aids in more precise and comprehensive requirement elicitation. However,

findings obtained during the elicitation are often documented informally and not in clean

requirements. It would, therefore, be necessary to check the elicitation results before

deciding on the application of the tool.

2. Specification: By identifying and resolving contradictions, the tool promises to improve

requirement specification, ensuring that requirements are unambiguous, concise, and

devoid of conflicts, thereby enhancing their quality. Nevertheless, specifying

requirements is often a dynamic process in which they are subject to constant

correction and editing. Since the specification continues to undergo regular changes,

the tool's implementation might be premature at this stage.

3. Structuring: Applying the tool in this context could facilitate more effective requirements

structuring. By clarifying conflicting facets and supporting their reorganization into a

coherent structure, the tool could assist in achieving improved requirement

organization. However, the tool's results would not directly lead to automated structural

insights. Therefore, it is not immediately applicable in this step. Further research might

be valuable.

4. Analysis: Among the various activities in requirement development, the Analysis phase

emerges as a natural and highly relevant domain for applying the software tool. This

alignment is especially evident as it involves a detailed analysis of requirements to

detect contradictions. The tool's analytical capabilities are specifically designed to

uncover contradictions within requirements. Examining the requirement set offers a

systematic means to pinpoint these issues in the development process. This proactive

identification serves as the foundation for resolving contradictions, ensuring that the

requirement set remains coherent, feasible, and aligned with project objectives.

Another important aspect during the Detailed Design phase is that complex development

projects are mainly carried out in interdisciplinary collaboration within cooperation networks.

Typically, the requirements for each level are managed in different documents for the overall

product in product specification books and the subsystems in component specification books

(Göhlich and Fay 2021b). The partial results must be combined according to their logical and

temporal dependencies to form the overall solution. 'For complex systems in particular, this

process involves a risk of error with regard to the consistency of the goals among themselves

and thus also for the consistency […] of the overall solution.' (Bender and Gericke 2021).

ALICE could facilitate this process by identifying potential conflicts across multiple

specifications and assisting engineers in recognizing cross-references spanning various

domains, thereby promoting effective

Changes to the requirements specification close to the implementation should be avoided, as

the complexity of the overall system and the interactions between its parts can result in

disproportionately high change costs and efforts in all areas of the company (Giffin et al. 2009;

Ehrlenspiel and Meerkamm 2017; Göhlich et al. 2021). Nevertheless, so-called Change

Requests (Lashin 2021) cannot always be avoided due to unexpected changes in objectives

or new technical findings. ALICE has not been validated for performing impact analyses to

identify potential issues arising from changes due to contradictions. However, it represents a

step towards that objective: Hein et al. (2021) use machine learning to categorize requirements

based on their change behavior and aim to provide insights into managing requirement

changes effectively. Classifying contradicting requirements could be another criterion for Hein

et al.'s ML algorithm.

After discussing when the method can be applied in the development process, it is equally

important to explore how it can be employed within specific steps. Three potential methods are

described below:

1. Real-time contradiction detection during requirement authoring: ALICE could conduct

real-time contradiction checks as requirements are authored. This functionality would

ensure that conflicting statements are promptly identified and resolved during the

requirement formulation process.

2. Periodic or triggered specification checks: ALICE could periodically perform

comprehensive checks on the entire specification, such as after each baseline or at

predefined project milestones. This approach would ensure a systematic review of

requirements to identify and address contradictions or inconsistencies at strategic

points in the development process.

3. Automated checks: ALICE's utility could extend to automated processes like nightly

checks. In this context, automated checks provide continuous monitoring of evolving

requirements. This proactive approach aids in the timely detection and correction of

contradictions, even when manual review processes are not actively engaged.

In summary, ALICE appears most suitable for application in the Detailed Design phase,

specifically during the Analysis step. The choice of the specific method should be determined

based on the particular project at hand.

6.4 Limitations and threats to validity

The following sections will address the constraints and potential sources of error that should

be taken into account when interpreting this dissertation's findings.

6.4.1 Limitations of the Methodology

Due to the intricate nature of natural language requirements, a trade-off had to be made

between effective contradiction detection and the broad applicability of the methodology to any

requirement. Consequently, the methodology is restricted to requirements written in a

controlled natural language. The specifics are elaborated upon below (Dick 2017; Sophist

GmbH 2016; Wiegers and Beatty 2013; Robertson and Robertson 2013):

Utilizing single-sentence requirements that adhere to the principle of atomicity is compulsory

for this methodology. These requirements cannot be further divided or decomposed into

smaller components. The principle of atomicity is widely recognized as a best practice in

requirements engineering. It ensures that requirements remain concise, unambiguous, and

focused on specific functionalities or characteristics of the system being developed.

The presence of filler words in natural language processing tasks can introduce errors. For

example, consider the sentence ‘Amy and I both have to fight him’ where the term ‘both’ is

treated as a variable. If the to-be-compared sentence is ‘Neither Amy nor I have fought against

him,’ the variables would be identified as ‘Amy’ and ‘I.’ Therefore, no contradiction would be

detected, as the variables of both requirements do not match.

According to standard guidelines, passive sentence constructions are prone to

misinterpretation and must be avoided to ensure high accuracy in this method (VDA 2007;

Sophist GmbH 2016). For instance, ‘There must be no manipulation of Com_Batt’ is better

expressed as ‘The function must prevent manipulation of Com_Batt.’ Using active sentence

structures enhances clarity and enables the AI to identify the variable performing the action.

Furthermore, incorrect placement of commas can lead to erroneous condition detection,

resulting in inaccurate identification of variables and actions. Therefore, ensuring proper

comma placement in technical writing is crucial to enhance clarity and precision. Failure to do

so may introduce flawed interpretations and subsequent errors. Complex sentence structures

must be avoided, and accurate comma placement should be verified.

All these considerations emphasize the importance of controlling natural language as much as

possible. For example, by utilizing standard sentence templates for requirements, it can be

ensured that the set limits of the methodology are not exceeded. Such templates are commonly

used in various fields and sectors where hardware, software, or system development occurs.

This approach helps maintain consistency, reduces the likelihood of errors introduced by

linguistic nuances, and facilitates accurate condition detection and variable/action extraction.
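
A template check of this kind could be run before the analysis to flag requirements that fall outside the controlled language. The sketch below is a hedged illustration only: the two regular expressions are hypothetical templates invented for this example and do not correspond to any specific template standard or to the templates used in the underlying papers.

```python
import re

# Hypothetical templates (assumptions for illustration):
#   "<If|When|While> <condition>, the <subject> <must|shall> <action>."
#   "The <subject> <must|shall> <action>."
CONDITIONAL = re.compile(
    r"^(If|When|While) [^,]+, the [\w ]+ (must|shall) [^,.]+\.$"
)
UNCONDITIONAL = re.compile(r"^The [\w ]+ (must|shall) [^,.]+\.$")


def follows_template(requirement: str) -> bool:
    """True if the requirement matches one of the two example templates."""
    text = requirement.strip()
    return bool(CONDITIONAL.match(text) or UNCONDITIONAL.match(text))


accepted = follows_template("The function must prevent manipulation of Com_Batt.")
rejected = follows_template("There must be no manipulation of Com_Batt.")
```

With these example templates, the active formulation from Section 6.4.1 is accepted, while the original formulation without an explicit acting subject is rejected and could be returned to the author for rewording.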

Finally, it is advisable to minimize the creation of neologisms in technical writing. If their use is

unavoidable, adding them to a dictionary to avoid being detected as out-of-vocabulary terms

becomes essential to facilitate accurate analysis, as explained in the third paper.

6.4.2 Threats to Validity

Further investigation and comparison with additional data sets are necessary to account for

contextual factors such as the domain, development process, requirements engineering

techniques, and technologies used. These additional studies would help ensure a

comprehensive understanding and broader applicability of the methodology.

Given the limited availability of a sizable and labeled dataset on which the approach was

initially devised, there is a concern regarding the potential for overfitting the methodology to

the data. Considering this limitation and striving for a more diverse and representative dataset

in future studies is essential.

7 Summary and Outlook

The following section will concisely recap the key findings and offer insights into future

directions and potential developments.

7.1 Summary

In conclusion, this research focused on automating requirements engineering (RE) by

detecting contradicting requirements.

The first paper proposes a formal logic-based method to identify and classify contradictions in

requirements documents. In contrast to other papers found in the literature, contradictions

were not classified according to a specific, available data set but rather according to a generally

accepted, well-tested systematic model. Subsequently, a classification tailored to requirements

engineering was developed to identify specific types of contradictions using straightforward

questions.

This method identified 49 contradictions out of around 3,500 requirements. According to the

Aristotelian Law of non-contradiction (LNC), three main contradiction types were defined:

Contradictories, Contraries, and Subalterns, each of which was subdivided into Simplex,

Idem, and Alius. A set of questions, referred to as building blocks, needed to be answered to

detect the specific category. Most of the identified contradictions were Alius Contraries and

were found at the detailed design requirements level. This aligns with expectations, as higher-

level requirements are often less concrete and describe the general functionality of the product,

resulting in fewer opportunities for contradictions to occur, as explained in Section 6.3. It is

important to note that such higher-level requirements were not considered in this research.

The method developed in the first paper serves as a classifier, demonstrating its potential to

support manual contradiction detection and hint at future automated identification of

contradictions. The method requires the requirements to be classified into condition, effect,

variable, and action. In the following formalization, these constituents should be replaced by

symbols and formulas. Once the formalization is completed, answering the questions allows

for identifying specific contradictions.

By applying this method, requirements reviewers can recognize contradictions without

requiring extensive familiarity with requirements or the document’s subject matter. Apart from

the implication for automated contradiction detection, implications can also be derived for

automated quality analysis. The classification of different contradiction types could play a vital

role in quantifying the quality of requirements documents. Assessing the criticality of

contradictions based on this classification could facilitate the determination of meaningful key

performance indicators.

The second paper focuses on automatically detecting conditional sentences in requirements,

providing the solution for one of the building blocks of the method from the first paper. It is

important to note that conditional statements do not imply causality. For instance, the presence

of wheels on a car is a condition for it to move, while exerting torque on them is the cause.

The resulting movement is referred to as the effect.

The study found that grammatical models incorporating grammatical rules, trigger words, and

Part-Of-Speech tags are more suitable for identifying conditions than machine learning

methods. Another aspect addressed in this paper is the identification of verbal expressions

associated with the condition and the effect: variable and action. These are called constituents,

which can be individual words or groups of words that function as a single unit. To illustrate,

consider the sentence, ‘If the threshold is reached, the controller must limit the speed decay.’

Here, the condition is: ‘If the threshold is reached.’ The effect is: ‘the controller must limit the

speed decay.’ In the condition, the variable is ‘the threshold,’ and the action is ‘is reached.’ In

the effect, the variable is ‘the controller,’ and the action is ‘must limit the speed decay.’ In other

words, the variable represents the central entity involved, while the action denotes what is

expected to occur.
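The split into condition and effect can be illustrated with a small trigger-word rule. This is only a sketch under simplifying assumptions: the trigger list is hypothetical, the condition is assumed to precede the effect and to end at the first comma, and no Part-Of-Speech tagging is performed, so the variable and action of each clause are filled in by hand from the example above rather than extracted.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical trigger words; the thesis's actual list is not reproduced here.
TRIGGERS = ("if", "when", "while", "as soon as", "unless")
_PATTERN = re.compile(
    r"^\s*(?:" + "|".join(re.escape(t) for t in TRIGGERS) + r")\b[^,]*,",
    flags=re.IGNORECASE,
)


@dataclass
class Clause:
    variable: str  # central entity involved
    action: str    # what is expected to occur


@dataclass
class Requirement:
    condition: Optional[str]  # None if no leading trigger word is found
    effect: str


def split_conditional(sentence: str) -> Requirement:
    """Split 'Trigger <condition>, <effect>' at the first comma."""
    match = _PATTERN.match(sentence)
    if match is None:
        return Requirement(condition=None, effect=sentence.strip())
    condition = match.group(0).rstrip(",").strip()
    effect = sentence[match.end():].strip()
    return Requirement(condition=condition, effect=effect)


req = split_conditional(
    "If the threshold is reached, the controller must limit the speed decay."
)
print(req.condition)
print(req.effect)

# Constituents of the example sentence, labelled by hand as in the text.
condition_parts = Clause(variable="the threshold", action="is reached")
effect_parts = Clause(variable="the controller", action="must limit the speed decay")
```

A sentence without a leading trigger word is returned with `condition=None`, which corresponds to an unconditional requirement in this simplified view.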

A sample of 1,861 requirements was evaluated, and the performance of a grammatical model

was compared against that of two machine learning models. The findings suggest that the

grammatical model is more effective for identifying conditional requirements and their

constituents. Moreover, labeling datasets for machine learning training can be laborious and

susceptible to human errors.

The third paper completes the proposed method using formal logic and large language models

(LLMs) for fully automated contradiction detection. It is presented as a decision tree based on

the building blocks developed in the first paper:

1. Question 1: This question determines whether the variables of effect one and effect

two are identical or if one set of variables is a subset of the other, establishing the

potential for an LNC contradiction. Formal logic was used to answer this question.

2. Question 2: This question examines the inclusivity between actions, determining if one

action subsumes the other. However, this question was not processed in the

implementation due to limitations in the mathematical abilities of the current language

models.

3. Question 3: This question identifies mutual exclusivity between two actions, detecting

contradictory opposing actions. Formal logic and GPT-3 were used to assess similarity

and determine if the actions were mutually exclusive.

4. Question 4: The fourth question assesses mutual inconsistency between two actions,

identifying contrary opposing actions. GPT-3 was utilized to determine if the actions

have the potential to contradict each other.

5. Question 5: This question determines the existence of a condition, aiming to eliminate

potential Simplex-contradictions. The model of the second paper was employed for this

condition detection.

6. Question 6: The sixth question identifies if the first condition is equivalent to the second

one to detect Idem-contradictions requiring the same condition. Formal logic and string

comparison were utilized to assess similarity.

7. Question 7: The seventh question determines if the first and second conditions can

co-occur. GPT-3 was used to assess whether both conditions could theoretically

occur at the same time.
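The seven building blocks and the technique used for each can be summarized as a small lookup structure. The sketch below only restates the list above in code form; the `Question` class, the field names, and the helper functions are assumptions made for illustration, and the order in which the decision tree actually visits the questions is not modelled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Question:
    number: int
    purpose: str
    techniques: Tuple[str, ...]  # empty tuple: not processed in the implementation


QUESTIONS = (
    Question(1, "effect variables identical or subset", ("formal logic",)),
    Question(2, "inclusivity between actions", ()),
    Question(3, "mutual exclusivity of actions", ("formal logic", "GPT-3")),
    Question(4, "mutual inconsistency of actions", ("GPT-3",)),
    Question(5, "existence of a condition", ("condition detection model",)),
    Question(6, "equivalence of conditions", ("formal logic", "string comparison")),
    Question(7, "co-occurrence of conditions", ("GPT-3",)),
)


def implemented(questions=QUESTIONS):
    """Numbers of the questions that were processed in the implementation."""
    return [q.number for q in questions if q.techniques]


def needs_llm(questions=QUESTIONS):
    """Numbers of the questions that rely on the language model."""
    return [q.number for q in questions if "GPT-3" in q.techniques]


print(implemented())
print(needs_llm())
```

Separating the questions from the techniques that answer them reflects the modular character of the method: a single building block could be swapped for a different resolver without touching the others.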

This method is named ALICE (Automated Logic for the Identification of Contradictions in

Engineering). The study compared the performance of two methods: the hybrid method and a

purely LLM-based method. Results showed that ALICE outperformed the LLM-only method in

terms of accuracy and recall. ALICE achieved an accuracy of 99% and a recall of 60%, while

the LLM-only method showed an accuracy of 97% and a recall of 0%. This demonstrates

that relying solely on LLMs is not feasible for detecting contradictions in requirements

engineering.
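A high accuracy combined with a recall of 0% is typical for strongly imbalanced data, where contradictions are rare among all requirement pairs. The following sketch illustrates the effect with hypothetical counts chosen only for this example; they are not the confusion matrix of the study.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all pairs that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp: int, fn: int) -> float:
    """Share of actual contradictions that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical example: 100 requirement pairs, 3 of which are contradictory.
# A classifier that never reports a contradiction yields:
tp, fp, tn, fn = 0, 0, 97, 3

print(f"accuracy = {accuracy(tp, fp, tn, fn):.2f}")
print(f"recall   = {recall(tp, fn):.2f}")
```

In this constructed case, the classifier is correct for 97 of 100 pairs and still misses every contradiction, which is why recall is the more informative figure for this task.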

The study also identified the limitations of LLMs and symbolic AI in detecting contradictions in

technical documents. LLMs may generate errors due to inadequate data to analyze and

understand requirements accurately. On the other hand, symbolic AI may be deceived by filler

words, improper placement of commas, and neologisms. Using active sentence structures and

established terminology in technical writing is recommended to ensure precision and reduce

ambiguity in the results.

In summary, ALICE, consisting of seven questions, can successfully detect formal

contradictions between requirements, showing better results than relying solely on

large language models. The method is scalable and can be adapted to future advancements

in the field, providing a modular way to evaluate the consistency of requirements specifications.

7.2 Outlook

The research presented in this thesis has made significant contributions to requirements

engineering, but there is still room for further exploration and improvement. In addressing the

limitations identified in Section 6.4.1, this part of the dissertation outlines potential

enhancements and future research directions that could significantly advance the capabilities

of ALICE.

Developing advanced parsing and semantic analysis methodologies is crucial to expanding

ALICE's applicability beyond controlled natural language. These improvements would enable

it to process a broader range of natural language requirements, reflecting the varied

documentation styles encountered in practice. Additionally, by exploring algorithms capable of

accurately handling multi-sentence and compound requirements, ALICE could better manage

documents that mirror real-world complexities, thus enhancing its utility and precision in

contradiction detection.

Another significant area for improvement involves refining ALICE’s handling of linguistic

nuances. Implementing sophisticated natural language processing models would improve the

tool’s ability to interpret context and semantics, particularly in managing filler words and

variable mismatches. This would enable robust contradiction analysis within complex sentence

structures.

The accuracy of technical document interpretation can also be significantly improved by

enhancing grammatical correction features within ALICE, focusing on technical writing norms

such as comma placement and sentence complexity. Standardizing sentence templates for

requirement documentation across various sectors would aid in maintaining consistency and

reducing errors introduced by linguistic variations.

Another area for future exploration is the development of custom models for specific domains

or requirements. By training machine learning models on a specific domain, such as healthcare

or finance, researchers could create highly tailored models optimized for detecting

contradictions and other issues specific to that domain. Developing a dynamic dictionary

management system would allow ALICE to integrate and update new technical terms

seamlessly, ensuring the tool remains relevant and effective in adapting to industry-specific

terminologies and innovations.

Further work can be derived from discoveries made in the publications. The first paper

mentioned other types of contradictions besides LNC contradictions, which were not discussed

in this research: dialectic contradictions and antinomies. Future research could address these

types, for example by drawing on the world knowledge of LLMs.

The third paper explained that Alius contradictions are potential contradictions due to

theoretically infeasible conditions. While automated analysis of such conditions was not within

the scope of that paper, it could be a potential area for future research.

Finally, a compelling opportunity exists to bridge the gap between academic research and

practical industry applications. Developing a user-friendly and robust tool based on the insights

and methodologies presented in this thesis could revolutionize how engineers in various

industries approach requirements engineering. Such a tool could be designed to integrate

seamlessly with existing engineering workflows, enabling engineers to detect and resolve

contradictions in their requirement documents quickly. Parallelization holds great promise

for future optimization of computational speed. Since each requirement pair is analyzed

independently, the pairs could be processed concurrently, potentially reducing analysis times

from hours to minutes or even seconds. The exact gains should be assessed through practical

implementation and testing.
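The idea of pairwise parallelization can be sketched as follows. This is an illustration under stated assumptions rather than the ALICE implementation: `check_pair` is a hypothetical placeholder that uses a toy string rule in place of the seven questions, and a thread pool is chosen on the assumption that the per-pair work is dominated by waiting for external calls such as LLM requests.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def check_pair(pair):
    """Hypothetical per-pair analysis; a toy rule stands in for the real checks."""
    a, b = pair
    # Toy rule: the two sentences differ only by the word 'not' after 'must'.
    differs_by_negation = (
        a.replace("must not", "must") == b or b.replace("must not", "must") == a
    )
    return (a, b, differs_by_negation and a != b)


def analyze(requirements, max_workers=4):
    """Analyze all unordered requirement pairs concurrently."""
    pairs = list(combinations(requirements, 2))  # n * (n - 1) / 2 pairs
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input pairs.
        return list(pool.map(check_pair, pairs))


requirements = [
    "The valve must open.",
    "The valve must not open.",
    "The pump must run.",
]
results = analyze(requirements)
flagged = [(a, b) for a, b, contradictory in results if contradictory]
print(len(results))
print(flagged)
```

Because no pair depends on the result of another, the work can be distributed without coordination. For CPU-bound checks in Python, a process pool would likely be the more suitable choice.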

In conclusion, the future of requirements engineering is filled with possibilities, from exploring

new contradiction types to optimizing computational speed and developing specialized models.

The path forward also includes practical solutions that bring the benefits of academic research

directly into the hands of industry professionals, promising a more efficient and practical

approach to requirements engineering.

Publication bibliography X

8 Publication bibliography

Acheampong, Francisca Adoma; Nunoo-Mensah, Henry; Chen, Wenyu (2021): Transformer

models for text-based emotion detection: a review of BERT-based approaches. In Artif Intell

Rev 54 (8), pp. 5789–5829. DOI: 10.1007/s10462-021-09958-2.

Ackoff, Russell L. (1989): From data to wisdom. In Journal of applied systems analysis, vol.

16.1, pp. 3–9.

Águeda, Cristina Puente; Olivas, José A. (2008): Analysis, detection and classification of

certain conditional sentences in text documents: ResearchGate. Available online at

https://www.researchgate.net/publication/228985516_Analysis_detection_and_classification_

of_certain_conditional_sentences_in_text_documents, checked on 9/16/2022.

Ahmad, Arshad; Justo, José Luis Barros; Feng, Chong; Khan, Arif Ali (2020): The Impact of

Controlled Vocabularies on Requirements Engineering Activities: A Systematic Mapping

Study. In Applied Sciences 10 (21), p. 7749. DOI: 10.3390/app10217749.

Al-Aswadi, Fatima N.; Chan, Huah Yong; Gan, Keng Hoon (2020): Automatic ontology

construction from text: a review from shallow to deep learning trend. In Artif Intell Rev 53 (6),

pp. 3901–3928. DOI: 10.1007/s10462-019-09782-9.

Aristoteles (1986): Metaphysik. Schriften zur Ersten Philosophie. Übertr. u. hrsg. v. Franz F.

Schwarz. With assistance of Franz F. Schwarz: Reclam (Reclams Universal-Bibliothek, 7913).

Asghar, Nabiha (2016): Automatic Extraction of Causal Relations from Natural Language

Texts: A Comprehensive Survey.

Awad, Elias (2003): Knowledge Management. 1st edition: Pearson India. Available online at

https://permalink.obvsg.at/.

Babcock, Jonathan (2007): Good Requirements Are More Than Just Accurate.

practicalanalyst.com. Available online at https://practicalanalyst.com/good-requirements-are-

more-than-just-accurate/, checked on 5/20/2022.

Bender, Beate (2020): Requirements Engineering in the Context of IDE. In Sándor Vajna (Ed.):

Integrated Design Engineering. Cham: Springer International Publishing, pp. 587–614.

Bender, Beate; Gericke, Kilian (2021): Entwickeln der Anforderungsbasis: Requirements

Engineering. In Beate Bender, Kilian Gericke (Eds.): Pahl/Beitz Konstruktionslehre. Berlin,

Heidelberg: Springer Berlin Heidelberg, pp. 169–209.

Biagetti, Maria Teresa (2020): Ontologies as knowledge organization systems. In : Knowledge

Organization, vol. 2, p. 48. Available online at https://www.isko.org/cyclo/ontologies, checked

on 4/5/2023.

Bojić, D.; Bojović, M. (2017): A Streaming Dataflow Implementation of Parallel Cocke–

Younger–Kasami Parser. In : Creativity in Computing and DataFlow SuperComputing, vol.

104: Elsevier (Advances in Computers), pp. 159–199.

Bowman, Samuel R. (2023): Eight Things to Know about Large Language Models. Available

online at http://arxiv.org/pdf/2304.00612v1.

Bramer, Max (2013): Introduction to Classification: Naïve Bayes and Nearest Neighbour. In

Max Bramer (Ed.): Principles of Data Mining. London: Springer London (Undergraduate Topics

in Computer Science), pp. 21–37.

Britannica, The Editors of Encyclopaedia (2023): syntax. Edited by Encyclopedia Britannica.

Available online at https://www.britannica.com/topic/syntax, checked on 4/5/2023.

Broadbent, Alex (2008): The Difference Between Cause and Condition. In Proceedings of the

Aristotelian Society (Hardback) 108 (1pt3), pp. 355–364. DOI: 10.1111/j.1467-

9264.2008.00250.x.

Chapman, W. W.; Bridewell, W.; Hanbury, P.; Cooper, G. F.; Buchanan, B. G. (2001): A simple

algorithm for identifying negated findings and diseases in discharge summaries. In Journal of

biomedical informatics 34 (5), pp. 301–310. DOI: 10.1006/jbin.2001.1029.

Chen, Zhigang; Lin, Wei; Chen, Qian; Chen, Xiaoping; Wei, Si; Jiang, Hui; Zhu, Xiaodan

(2015): Revisiting Word Embedding for Contrasting Meaning. In Chengqing Zong, Michael

Strube (Eds.): Proceedings of the 53rd Annual Meeting of the Association for Computational

Linguistics and the 7th International Joint Conference on Natural Language Processing.

Beijing, China: Association for Computational Linguistics, pp. 106–115.

Chomsky, Noam (1957): Syntactic structures. Berlin: Mouton & Co.

VDI-Guideline VDI 2221 Blatt 1:2019-11, 2019: Design of technical products and systems -

Model of product design.

Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (2018): BERT: Pre-training

of Deep Bidirectional Transformers for Language Understanding.

Dick, Jeremy (2017): Requirements Engineering. 4th ed. 2017. Cham: Springer (Springer

eBook Collection Computer Science).

Dimitrieski, Vladimir; Petrović, Gajo; Kovačević, Aleksandar; Luković, Ivan; Fujita, Hamido

(2016): A Survey on Ontologies and Ontology Alignment Approaches in Healthcare. In Hamido

Fujita, Moonis Ali, Ali Selamat, Jun Sasaki, Masaki Kurematsu (Eds.): Trends in Applied

Knowledge-Based Systems and Data Science, vol. 9799. Cham: Springer International

Publishing (Lecture Notes in Computer Science), pp. 373–385.

Duden (Ed.) (2006): Die Grammatik. Das Standardwerk zur deutschen Sprache. Duden.

Überarb. Neudr. der 7., völlig neu erarb. und erw. Aufl. Mannheim: Dudenverl. (Der Duden/

hrsg. vom Wiss. Rat der Dudenred, Bd. 4).

Duden (2023a): Semantik. Available online at

https://www.duden.de/rechtschreibung/Semantik, checked on 2/5/2023.

Duden (2023b): Taxonomie. Available online at

https://www.duden.de/rechtschreibung/Taxonomie, checked on 4/5/2023.

Ehrlenspiel, Klaus; Meerkamm, Harald (2017): Integrierte Produktentwicklung. Denkabläufe,

Methodeneinsatz, Zusammenarbeit. 6., überarbeitete und erweiterte Auflage. München: Carl

Hanser Verlag (Hanser eLibrary). Available online at

http://dx.doi.org/10.3139/9783446449084.

Eisenbart, B.; Gericke, K.; Blessing, L. (2011): A framework for comparing design modelling

approaches across disciplines. In Culley, S. J, Hicks, B. J, et al. (Ed.): Proceedings of the 18th

International Conference on Engineering Design (ICED11), pp. 344–355.

Eisenberg, Peter (2016): Duden - die Grammatik. Unentbehrlich für richtiges Deutsch. 9.,

vollständig überarbeitete und aktualisierte Auflage. Edited by Angelika Wöllstein. Berlin:

Dudenverlag (Der Duden, Band 4).

Ersoy, Pinar (2021): Naive Bayes Classifiers for Text Classification. Types of Naive Bayes

Classifiers for Different Text Processing Cases. Edited by Towards Data Science. Available

online at https://towardsdatascience.com/naive-bayes-classifiers-for-text-classification-

be0d133d35ba, checked on 10/26/2022.

Estival, Dominique; Nowak, Chris; Zschorn, Andrew (2004): Towards ontology-based natural

language processing. In Nancy Ide, Laurent Romary (Eds.): Proceedings of the Workshop on

NLP and XML (NLPXML-2004): RDF/RDFS and OWL in Language Technology. Morristown, NJ, USA:

Association for Computational Linguistics, pp. 59–66.

Fischbach, Jannik; Frattini, Julian; Spaans, Arjen; Kummeth, Maximilian; Vogelsang, Andreas;

Mendez, Daniel; Unterkalmsteiner, Michael (2021a): Automatic Detection of Causality in

Requirement Artifacts: the CiRA Approach. Available online at

http://arxiv.org/pdf/2101.10766v1.

Fischbach, Jannik; Frattini, Julian; Vogelsang, Andreas; Mendez, Daniel; Unterkalmsteiner,

Michael; Wehrle, Andreas et al. (2022): Automatic Creation of Acceptance Tests by Extracting

Conditionals from Requirements: NLP Approach and Case Study. Available online at

http://arxiv.org/pdf/2202.00932v1.

Fischbach, Jannik; Hauptmann, Benedikt; Konwitschny, Lukas; Spies, Dominik; Vogelsang,

Andreas (2020): Towards Causality Extraction from Requirements.

Fischbach, Jannik; Springer, Tobias; Frattini, Julian; Femmer, Henning; Vogelsang, Andreas;

Mendez, Daniel (2021b): Fine-Grained Causality Extraction From Natural Language

Requirements Using Recursive Neural Tensor Networks. Available online at

http://arxiv.org/pdf/2107.09980v2.

Foote, Keith D. (2019): A Brief History of Natural Language Processing (NLP). Edited by

Dataversity. Available online at https://www.dataversity.net/a-brief-history-of-natural-

language-processing-nlp/#.

Frattini, Julian; Fischbach, Jannik; Mendez, Daniel; Unterkalmsteiner, Michael; Vogelsang,

Andreas; Wnuk, Krzysztof (2022a): Causality in requirements artifacts: prevalence, detection,

and impact. In Requirements Eng. DOI: 10.1007/s00766-022-00371-x.

Frattini, Julian; Fischbach, Jannik; Mendez, Daniel; Unterkalmsteiner, Michael; Vogelsang,

Andreas; Wnuk, Krzysztof (2022b): Causality in requirements artifacts: prevalence, detection,

and impact. In Requirements Eng. DOI: 10.1007/s00766-022-00371-x.

Fridman, Lex (2020): Grant Sanderson and Lex Fridman. Math, Manim, Neural Networks &

Teaching with 3Blue1Brown (118), 08.2020. Available online at

https://www.youtube.com/watch?v=TMxAbNAVrzI&t=14s.

García, Salvador; Luengo, Julián; Herrera, Francisco (2014): Data Preprocessing in Data

Mining. 1. Aufl. s.l.: Springer-Verlag (Intelligent Systems Reference Library, v.72). Available

online at http://gbv.eblib.com/patron/FullRecord.aspx?p=1968256.

Gärtner, A. E.; Göhlich, D.; Fay, T.-A. (2023): Automated Condition Detection in Requirements

Engineering. In : ICED23 Proceedings, vol. 3, pp. 707–716.

Gärtner, Alexander Elenga (2023): condition detection. Edited by Swarm Engineer. Available

online at https://swarm-engineer.me/2023/condition-detection/, updated on 2/7/2023, checked

on 2/7/2023.

Gärtner, Alexander Elenga; Fay, Tu-Anh; Göhlich, Dietmar (2022): Fundamental Research on

Detecting Contradictions in Requirements: Taxonomy and Semi-Automated Approach. In

Applied Sciences 12 (15), p. 7628. DOI: 10.3390/app12157628.

Gärtner, Alexander Elenga; Göhlich, Dietmar (2023): Automated Requirement Contradiction

Detection through Formal Logic and LLMs.

Gärtner, Alexander Elenga; Göhlich, Dietmar (2024a): Automated requirement contradiction

detection through formal logic and LLMs. In Autom Softw Eng 31 (2). DOI: 10.1007/s10515-

024-00452-x.

Gärtner, Alexander Elenga; Göhlich, Dietmar (2024b): Towards an Automatic Contradiction

Detection in Requirements Engineering. In International Design Conference (Ed.): Design 24.

Gericke, Kilian; Blessing, L. (2012): An analysis of design process models across disciplines.

In D. Marjanovic, M. Storga, N. Pavkovic, N. Bojcetic (Eds.): DESIGN 2012. Proceedings of

the 12th International Design Conference, May 21 - 24, 2012, Dubrovnik, Croatia. Zagreb: Fac.

of Mechanical Engineering and Naval Architecture (DS, 3), pp. 171-180.

Gericke, Kilian; Qureshi, A. J.; Blessing, Lucienne (2013): Analyzing Transdisciplinary Design

Processes in Industry: An Overview. In : Volume 5: 25th International Conference on Design

Theory and Methodology; ASME 2013 Power Transmission and Gearing Conference. ASME

2013 International Design Engineering Technical Conferences and Computers and Information

in Engineering Conference. Portland, Oregon, USA, 04.08.2013 - 07.08.2013: American

Society of Mechanical Engineers.

Gervasi, Vincenzo; Zowghi, Didar (2005): Reasoning about inconsistencies in natural

language requirements. In ACM Trans. Softw. Eng. Methodol. 14 (3), pp. 277–330. DOI:

10.1145/1072997.1072999.

Giffin, Monica; Weck, Olivier de; Bounova, Gergana; Keller, Rene; Eckert, Claudia; Clarkson,

P. John (2009): Change Propagation Analysis in Complex Technical Systems. In Journal of

Mechanical Design 131 (8), Article 081001, p. 1. DOI: 10.1115/1.3149847.

Gillani Andleeb, Saira (2015): From text mining to knowledge mining: An integrated framework

of concept extraction and categorization for domain ontology.

Göhlich, Dietmar (2008): Innovationen der Fahrzeugtechnik am Beispiel der Mercedes-Benz

S-Klasse. In Volker Schindler (Ed.): Forschung für das Auto von Morgen. Berlin, Heidelberg:

Springer Berlin Heidelberg, pp. 129–154.

Göhlich, Dietmar; Bender, Beate; Fay, Tu-Anh; Gericke, Kilian (2021): Product requirements

specification process in product development. In Proc. Des. Soc. 1, pp. 2459–2470. DOI:

10.1017/pds.2021.507.

Göhlich, Dietmar; Fay, Tu-Anh (2021a): Arbeiten mit Anforderungen: Requirements

Management. In Beate Bender, Kilian Gericke (Eds.): Pahl/Beitz Konstruktionslehre.

Methoden und Anwendung erfolgreicher Produktentwicklung, vol. 25. 9. Auflage. Berlin,

Heidelberg: Springer Vieweg (Springer eBook Collection), pp. 211–229.

Göhlich, Dietmar; Fay, Tu-Anh (2021b): Arbeiten mit Anforderungen: Requirements

Management. In Beate Bender, Kilian Gericke (Eds.): Pahl/Beitz Konstruktionslehre.

Methoden und Anwendung erfolgreicher Produktentwicklung, vol. 25. 9. Auflage. Berlin,

Heidelberg: Springer Vieweg (Springer eBook Collection), pp. 211–229.

Göhlich, Dietmar; Fay, Tu-Anh; Jefferies, Dominic; Lauth, Enrico; Kunith, Alexander; Zhang,

Xudong (2018): Design of urban electric bus systems. In Des. Sci. 4. DOI:

10.1017/dsj.2018.10.

Goodfellow, Ian; Bengio, Yoshua; Courville, Aaron (2016): Deep learning. Cambridge,

Massachusetts, London, England: The MIT Press (Adaptive computation and machine

learning).

Gruber, Thomas R. (1993): A translation approach to portable ontology specifications. In

Knowledge Acquisition 5 (2), pp. 199–220. DOI: 10.1006/knac.1993.1008.

Guarino, Nicola; Oberle, Daniel; Staab, Steffen (2009): What Is an Ontology? In Steffen Staab,

Rudi Studer (Eds.): Handbook on Ontologies, vol. 5. Berlin, Heidelberg: Springer Berlin

Heidelberg, pp. 1–17.

Gudivada, Venkat N. (2018): Natural Language Core Tasks and Applications. In :

Computational Analysis and Understanding of Natural Languages: Principles, Methods and

Applications, vol. 38: Elsevier (Handbook of Statistics), pp. 403–428.

Guo, Weize; Zhang, Li; Lian, Xiaoli (2021): Automatically detecting the conflicts between

software requirements based on finer semantic analysis. Available online at

http://arxiv.org/pdf/2103.02255v1.

Hauser, Christopher (2019): Aristotle’s Explanationist Epistemology of Essence. In

Metaphysics 2 (1), pp. 26–39. DOI: 10.5334/met.24.

Hein, Phyo Htet; Kames, Elisabeth; Chen, Cheng; Morkos, Beshoy (2021): Employing machine

learning techniques to assess requirement change volatility. In Res Eng Design 32 (2),

pp. 245–269. DOI: 10.1007/s00163-020-00353-6.

Heitmeyer, Constance L.; Jeffords, Ralph D.; Labaw, Bruce G. (1996): Automated consistency

checking of requirements specifications. In ACM Trans. Softw. Eng. Methodol. 5 (3), pp. 231–

261. DOI: 10.1145/234426.234431.

Horn, Laurence R. (2018): Contradiction. Edited by The Stanford Encyclopedia of Philosophy.

The Metaphysics Research Lab. Available online at

https://plato.stanford.edu/archives/win2018/entries/contradiction/, checked on 4/21/2022.

Hunter, Anthony; Nuseibeh, Bashar (1998): Managing inconsistent specifications. In ACM

Trans. Softw. Eng. Methodol. 7 (4), pp. 335–367. DOI: 10.1145/292182.292187.

IEEE/ISO/IEC 29148-2018, 2018: ISO/IEC/IEEE International Standard - Systems and

software engineering -- Life cycle processes -- Requirements engineering. Available online at

https://standards.ieee.org/ieee/29148/6937/, checked on 3/15/2023.

Jang, Amy; Uzsoy, Ana Sofia; Culliton, Phil (2020): Contradictory, My Dear Watson. Edited by

Kaggle. Available online at https://kaggle.com/competitions/contradictory-my-dear-watson,

checked on 3/13/2023.

Jindal, Rajni; Malhotra, Ruchika; Jain, Abha (2016): Automated classification of security

requirements. In : 2016 International Conference on Advances in Computing, Communications

and Informatics (ICACCI). 2016 International Conference on Advances in Computing,

Communications and Informatics (ICACCI). Jaipur, India, 21.09.2016 - 24.09.2016: IEEE,

pp. 2027–2033.

Johnson-Laird, Philip Nicholas (2006): How we reason. 1. publ. New York, NY: Oxford Univ.

Press.

Jun, Yennie (2023): Exploring Creativity in Large Language Models: From GPT-2 to GPT-4.

Analyzing the evolution of creative processes in large language models through creativity tests.

Edited by Towards Data Science. Available online at

https://towardsdatascience.com/exploring-creativity-in-large-language-models-from-gpt-2-to-

gpt-4-1c2d1779be57, checked on 5/13/2023.

Jurafsky, Dan; Martin, James H. (2019): Speech and Language Processing. Available online

at https://web.stanford.edu/~jurafsky/slp3/, checked on 4/8/2023.

Kant, Immanuel (2011): Kritik der reinen Vernunft. Vollst. Ausg. nach der 2., hin und wieder

verb. Aufl. 1787, vermehrt um die Vorrede zur 1. Aufl. 1781. Köln: Anaconda.

Karlova-Bourbonus, Natali (2019): Automatic detection of contradictions in texts. Gießen:

Universitätsbibliothek. Available online at http://geb.uni-giessen.de/geb/volltexte/2019/14447/.

Kasneci, Enkelejda; Sessler, Kathrin; Küchemann, Stefan; Bannert, Maria; Dementieva,

Daryna; Fischer, Frank et al. (2023): ChatGPT for good? On opportunities and challenges of

large language models for education. In Learning and Individual Differences 103, p. 102274.

DOI: 10.1016/j.lindif.2023.102274.

Kesselring, Thomas (2013): Formallogischer Widerspruch, dialektischer Widerspruch,

Antinomie. Reflexionen über den Widerspruch. In Stefan Müller (Ed.): Jenseits der Dichotomie.

Wiesbaden: Springer Fachmedien Wiesbaden, pp. 15–38.

Khoo, C. S. G.; Kornfilt, J.; Oddy, R. N.; Myaeng, S. H. (1998): Automatic Extraction

of Cause-Effect Information from Newspaper Text Without Knowledge-based Inferencing. In

Literary and Linguistic Computing 13 (4), pp. 177–186. DOI: 10.1093/llc/13.4.177.

Kim, Haeng Kon (Ed.) (2018): 2018 IEEE/ACIS 19th International Conference on Software

Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD).

June 27-29, 2018, Busan, Korea : proceedings. Institute of Electrical and Electronics

Engineers; IEEE Computer Society; International Association for Computer and Information

Science. Piscataway, NJ: IEEE. Available online at

http://ieeexplore.ieee.org/servlet/opac?punumber=8422066.

Kingston, John; Schafer, Burkhard; Vandenberghe, Wim (2004): Towards a Financial Fraud

Ontology: A Legal Modelling Approach. In Artif Intell Law 12 (4), pp. 419–446. DOI:

10.1007/s10506-005-4163-0.

Knothe, Frank; Mast, Jürgen; Böttger, Matthias; Pfeiffer, Peter; Futschik, Hans-Peter;

Hutzenlaub, Holger et al. (2006): Die neue CL-Klasse von Mercedes-Benz. In ATZ

Automobiltech Z 108 (10), pp. 800–813. DOI: 10.1007/BF03221821.

Kurtanovic, Zijad; Maalej, Walid (2017): Automatically Classifying Functional and Non-

functional Requirements Using Supervised Machine Learning. In : 2017 IEEE 25th

International Requirements Engineering Conference (RE). 2017 IEEE 25th International

Requirements Engineering Conference (RE). Lisbon, Portugal, 04.09.2017 - 08.09.2017:

IEEE, pp. 490–495.

Lamsweerde, Darimont, Letier (1998): Managing conflicts in goal-driven requirements

engineering. In IEEE Transactions on Software Engineering 24 (11), pp. 908–926.

Landhäußer, Mathias; Körner, Sven J. (2017): Artificial Intelligence in Requirements

Engineering.

Lashin, Gamal (2021): Technisches Änderungsmanagement. In Beate Bender, Kilian Gericke

(Eds.): Pahl/Beitz Konstruktionslehre. Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 919–

942.

Lendvai, Piroska; Reichel, Uwe D.: Contradiction Detection for Rumorous Claims. Available

online at http://arxiv.org/pdf/1611.02588v2.

Li, Luyang; Qin, Bing; Liu, Ting (2017): Contradiction Detection with Contradiction-Specific

Word Embedding. In Algorithms 10 (2), p. 59. DOI: 10.3390/a10020059.

Liu, Chun; Zhao, Zhengyi; Zhang, Lei; Li, Zheng (2021): Automated Conditional Statements

Checking for Complete Natural Language Requirements Specification. In Applied Sciences 11

(17), p. 7892. DOI: 10.3390/app11177892.

Liu, Quan; Jiang, Hui; Wei, Si; Ling, Zhen-Hua; Hu, Yu (2015): Learning Semantic Word

Embeddings based on Ordinal Knowledge Constraints. In Chengqing Zong, Michael Strube

(Eds.): Proceedings of the 53rd Annual Meeting of the Association for Computational

Linguistics and the 7th International Joint Conference on Natural Language Processing.

Beijing, China: Association for Computational Linguistics, pp. 1501–1511.

Loucopoulos, Pericles (2005): Requirements engineering. In John Clarkson, Claudia Eckert

(Eds.): Design process improvement. London: Springer London, pp. 116–139.

Mich, Luisa; Franch, Mariangela; Novi Inverardi, Pierluigi (2004): Market research for

requirements analysis using linguistic tools. In Requirements Eng 9 (1), pp. 40–56. DOI:

10.1007/s00766-003-0179-8.

Lutz, Robyn R. (1993): Targeting safety-related errors during software requirements analysis.

In SIGSOFT Softw. Eng. Notes 18 (5), pp. 99–106. DOI: 10.1145/167049.167069.

Maedche, Alexander; Motik, Boris; Silva, Nuno; Volz, Raphael (2002): MAFRA — A MApping

FRAmework for Distributed Ontologies. In G. Goos, J. Hartmanis, J. van Leeuwen, Asunción

Gómez-Pérez, V. Richard Benjamins (Eds.): Knowledge Engineering and Knowledge

Management: Ontologies and the Semantic Web, vol. 2473. Berlin, Heidelberg: Springer Berlin

Heidelberg (Lecture Notes in Computer Science), pp. 235–250.

Maher, Mary Lou; Poon, Josiah (1996): Modeling Design Exploration as Co-Evolution. In

Computer-Aided Civil and Infrastructure Engineering 11 (3), pp. 195–209. DOI:

10.1111/j.1467-8667.1996.tb00323.x.

Manning, Christopher D. (2022): Human Language Understanding & Reasoning. In Daedalus

151 (2), pp. 127–138. DOI: 10.1162/daed_a_01905.

Mäntylä, Mika V.; Graziotin, Daniel; Kuutila, Miikka (2018): The evolution of sentiment

analysis—A review of research topics, venues, and top cited papers. In Computer Science

Review 27, pp. 16–32. DOI: 10.1016/j.cosrev.2017.10.002.

Marneffe, Rafferty, Manning (2008): Finding Contradictions in Text. Edited by Stanford

University. USA. Available online at https://nlp.stanford.edu/pubs/contradiction-acl08.pdf,

checked on 4/13/2022.

Marques, Ricardo (2015): Detecting Contradictions in News Quotations.

Maxwell, Joseph (1992): Understanding and Validity in Qualitative Research. In Harvard

Educational Review 62 (3), pp. 279–301. DOI: 10.17763/haer.62.3.8323320856251826.

Merriam-Webster: condition. Edited by Merriam-Webster.com. Available online at

https://www.merriam-webster.com/dictionary/condition, checked on 10/4/2022.

Merritt, Rick (2022): What Is a Transformer Model? Edited by NVIDIA Blog. Available online at

https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/, checked on 10.2023.

Miller, Tim (2019): Explanation in artificial intelligence: Insights from the social sciences. In

Artificial Intelligence 267, pp. 1–38. DOI: 10.1016/j.artint.2018.07.007.

Montgomery, Lloyd; Fucci, Davide; Bouraffa, Abir; Scholz, Lisa; Maalej, Walid (2022a):

Empirical research on requirements quality: a systematic mapping study. In Requirements Eng

27 (2), pp. 183–209. DOI: 10.1007/s00766-021-00367-z.

Montgomery, Lloyd; Fucci, Davide; Bouraffa, Abir; Scholz, Lisa; Maalej, Walid (2022b):

Empirical research on requirements quality: a systematic mapping study. In Requirements Eng

29 (4), p. 315. DOI: 10.1007/s00766-021-00367-z.

MP: 1005b, 19f.: Aristoteles (1831ff.): Metaphysik (hrsg. v. d. Preußischen Akademie der

Wissenschaften), Band 1, Berlin.

Narkhedem, Sarang (2018): Understanding Confusion Matrix. Edited by Towards Data

Science. Available online at

https://towardsdatascience.com/understanding-confusion-matrix-a9ad42dcfd62, checked on 10/11/2022.

Naz, Tabbasum; Akhtar, Muhammad; Shahzad, Syed Khuram; Fasli, Maria; Iqbal, Muhammad

Waseem; Naqvi, Muhammad Raza (2020): Ontology-driven advanced drug-drug interaction.

In Computers & Electrical Engineering 86, p. 106695. DOI:

10.1016/j.compeleceng.2020.106695.

Newell, A.; Simon, H. (1956): The logic theory machine--A complex information processing

system. In IEEE Trans. Inform. Theory 2 (3), pp. 61–79. DOI: 10.1109/TIT.1956.1056797.

Ontology learning and population. Bridging the gap between text and knowledge (2008).

Amsterdam: Ios Press (Frontiers in artificial intelligence and applications, v. 167). Available

online at

https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=221155.

openai (2020): Models. GPT-3. Edited by openai. Available online at

https://platform.openai.com/docs/models/gpt-3, checked on 3/15/2023.

openai (2023a): ChatGPT — Release Notes. The latest update for ChatGPT. Available online

at https://help.openai.com/en/articles/6825453-chatgpt-release-notes, updated on 9/19/2023.

openai (2023b): GPT-4 Technical Report, p. 7. Available online at

https://cdn.openai.com/papers/gpt-4.pdf.

Parsons, Terence (2021): The Traditional Square of Opposition. Edited by The Stanford

Encyclopedia of Philosophy. Available online at

https://plato.stanford.edu/archives/fall2021/entries/square/, checked on 4/21/2022.

Pawlik, V. (2022): Pro-Kopf-Stromverbrauch in Deutschland in den Jahren 1995 bis 2022.

Edited by Statista. Available online at

https://de.statista.com/statistik/daten/studie/240696/umfrage/pro-kopf-stromverbrauch-in-deutschland/, checked on 10/13/2023.

Pohl, Klaus; Rupp, Chris (2021): Basiswissen Requirements Engineering. Aus- und

Weiterbildung nach IREB-Standard zum Certified Professional for Requirements Engineering

Foundation Level. 5., überarbeitete und aktualisierte Auflage. Heidelberg: dpunkt.verlag.

Preum, Sarah Masud; Mondol, Abu Sayeed; Ma, Meiyi; Wang, Hongning; Stankovic, John A.

(2017): Preclude: Conflict detection in textual health advice. In: 2017 IEEE International

Conference on Pervasive Computing and Communications (PerCom). Kona, Big Island, HI,

USA, 13.03.2017 - 17.03.2017: IEEE, pp. 286–296.

DIN 69901-5:2009-01, 2009: Project management - Project management systems - Part

5: Concepts.

DIN ISO 9000:2015, 2015: Qualitätsmanagementsysteme.

Ritter, Alan; Soderland, Stephen; Downey, Doug; Etzioni, Oren (2008): It’s a Contradiction --

No, it’s Not: A Case Study using Functional Relations. In: Proceedings of the 2008 Conference

on Empirical Methods in Natural Language Processing. With assistance of Association for

Computational Linguistics, pp. 11–20.

Robertson, Suzanne; Robertson, James (2013): Mastering the requirements process. Getting

requirements right. 3. ed. Upper Saddle River, NJ, Munich: Addison-Wesley (Always learning).

Rowley, Jennifer (2007): The wisdom hierarchy: representations of the DIKW hierarchy. In

Journal of Information Science 33 (2), pp. 163–180. DOI: 10.1177/0165551506070706.

Russell, Stuart J.; Norvig, Peter (2016): Artificial intelligence. A modern approach. With

assistance of Ernest Davis, Douglas Edwards. Third edition, Global edition. Boston, Columbus,

Indianapolis, New York, San Francisco, Upper Saddle River, Amsterdam, Cape Town, Dubai,

London, Madrid, Milan, Munich, Paris, Montreal, Toronto, Delhi, Mexico City, Sao Paulo,

Sydney, Hong Kong, Seoul, Singapore, Taipei, Tokyo: Pearson (Always learning).

Sack, Harald; Alam, Mehwish (2020): Knowledge Graphs. Edited by Hasso-Plattner-Institut.

Fiz Karlsruhe - Leibniz Institute for Information Infrastructure & Karlsruhe Institute of

Technology. Hasso-Plattner-Institut. Available online at

https://open.hpi.de/courses/knowledgegraphs2020, checked on 4/6/2023.

Saenko, Kate (2023): A Computer Scientist Breaks Down Generative AI’s Hefty Carbon

Footprint. Edited by Scientific American. Available online at

https://www.scientificamerican.com/article/a-computer-scientist-breaks-down-generative-ais-hefty-carbon-footprint/, checked on 10/13/2023.

Sandhu, Geet; Sikka, Sunil (2015): State-of-art practices to detect inconsistencies and

ambiguities from software requirements. In: 2015 International Conference on Computing,

Communication & Automation (ICCCA). Greater Noida, India, 15.05.2015: IEEE, pp. 812–817.

Sarafraz, Farzaneh (2011): Finding Conflicting Statements in the Biomedical Literature.

Scheuermann, Andreas; Leukel, Joerg (2014): Supply chain management ontology from an

ontology engineering perspective. In Computers in Industry 65 (6), pp. 913–923. DOI:

10.1016/j.compind.2014.02.009.

Schwitter, Rolf (2010): Controlled Natural Languages for Knowledge Representation. In Coling

2010 Organizing Committee (Ed.): Coling 2010: Posters, pp. 1113–1121. Available online at

https://aclanthology.org/C10-2128.

Sennrich, Rico; Volk, Martin; Schneider, Gerold (2013): Exploiting Synergies Between Open

Resources for German Dependency Parsing, POS-tagging, and Morphological Analysis: s.n.

Sepúlveda-Torres, Robiert; Bonet-Jover, Alba; Saquete, Estela (2021a): “Here Are the Rules:

Ignore All Rules”: Automatic Contradiction Detection in Spanish. In Applied Sciences 11 (7),

p. 3060. DOI: 10.3390/app11073060.

Sepúlveda-Torres, Robiert; Bonet-Jover, Alba; Saquete, Estela (2021b): “Here Are the Rules:

Ignore All Rules”: Automatic Contradiction Detection in Spanish. In Applied Sciences 11 (7),

p. 3060. DOI: 10.3390/app11073060.

Shultz, Thomas R.; Fahlman, Scott E.; Craw, Susan; Andritsos, Periklis; Tsaparas, Panayiotis;

Silva, Ricardo et al. (2010): Confusion Matrix. In Claude Sammut, Geoffrey I. Webb (Eds.):

Encyclopedia of Machine Learning. Boston, MA: Springer US, p. 209.

Simperl, Elena Paslaru Bontas; Tempich, Christoph (2006): Ontology Engineering: A Reality

Check. In David Hutchison, Takeo Kanade, Josef Kittler, Jon M. Kleinberg, Friedemann

Mattern, John C. Mitchell et al. (Eds.): On the Move to Meaningful Internet Systems 2006:

CoopIS, DOA, GADA, and ODBASE, vol. 4275. Berlin, Heidelberg: Springer Berlin Heidelberg

(Lecture Notes in Computer Science), pp. 836–854.

Sophist GmbH (2016): Schablonen für alle Fälle. Available online at

http://www.sophist.de/MASTeR-Broschuere/, checked on 3/28/2023.

Standish Group (1995): The CHAOS Report 1995.

Surana, S.; Dembla, S.; Bihani, P. (2022): Identifying Contradictions in the Legal Proceedings

Using Natural Language Models. In SN COMPUT. SCI. 3. Springer. DOI:

10.1007/s42979-022-01075-3.

Tawfik, Noha S.; Spruit, Marco R. (2018): Automated Contradiction Detection in Biomedical

Literature. In Petra Perner (Ed.): Machine Learning and Data Mining in Pattern Recognition,

vol. 10934. Cham: Springer International Publishing (Lecture Notes in Computer Science),

pp. 138–148.

The Standish Group (Ed.) (2014): Chaos Report.

Touvron, Hugo; Lavril, Thibaut; Izacard, Gautier; Martinet, Xavier; Lachaux, Marie-Anne;

Lacroix, Timothée et al. (2023): LLaMA: Open and Efficient Foundation Language Models.

Twain, Mark (1880): The Awful German Language. In Mark Twain: A Tramp Abroad. Toronto:

Belford & Co., pp. 879–897.

Uschold, Mike; Gruninger, Michael (1996): Ontologies: principles, methods and applications.

In The Knowledge Engineering Review 11 (2), pp. 93–136. DOI:

10.1017/S0269888900007797.

VDA (2007): Automotive VDA-Standardvorlage Komponentenlastenheft (1st ed.). With

assistance of Druck Henrich, Medien GmbH & Co. KG. Frankfurt am Main.

Wiegers, Karl; Beatty, Joy (2013): Software requirements. 3. ed. [fully updated and expanded].

Redmond, Wash.: Microsoft Press (Best practices).

Wohlin, Claes; Runeson, Per; Höst, Martin; Ohlsson, Magnus C.; Regnell, Björn; Wesslén,

Anders (2012): Experimentation in Software Engineering. Berlin, Heidelberg: Springer Berlin

Heidelberg.

Wong, Wilson; Liu, Wei; Bennamoun, Mohammed (2012): Ontology learning from text. In ACM

Comput. Surv. 44 (4), pp. 1–36. DOI: 10.1145/2333112.2333115.

Wu, Xiangcheng; Niu, Xi; Rahman, Ruhani (2022): Topological Analysis of Contradictions in

Text. In Enrique Amigo, Pablo Castells, Julio Gonzalo, Ben Carterette, J. Shane Culpepper,

Gabriella Kazai (Eds.): Proceedings of the 45th International ACM SIGIR Conference on

Research and Development in Information Retrieval. SIGIR '22: The 45th International ACM

SIGIR Conference on Research and Development in Information Retrieval. Madrid, Spain,

11.07.2022 - 15.07.2022. New York, NY, USA: ACM, pp. 2478–2483.

Wynn, David C.; Clarkson, P. John (2018): Process models in design and development. In Res

Eng Design 29 (2), pp. 161–202. DOI: 10.1007/s00163-017-0262-7.

Zhao, Liping; Alhoshan, Waad; Ferrari, Alessio; Letsholo, Keletso J.; Ajagbe, Muideen A.;

Chioasca, Erol-Valeriu; Batista-Navarro, Riza T. (2021): Natural Language Processing for

Requirements Engineering. In ACM Comput. Surv. 54 (3), pp. 1–41. DOI: 10.1145/3444689.

Zodhya (2023): How much energy does ChatGPT consume? Edited by medium.com. Available

online at https://medium.com/@zodhyatech/how-much-energy-does-chatgpt-consume-4cba1a7aef85, checked on 10/13/2023.