Linguistic methods of Requirements
The incorporation of linguistic methods (NLP)
in the investigation and quality assurance of requirements
an experience report from industrial practice
SOPHIST GmbH, Nürnberg
SOPHIST Ltd. has transferred research from other disciplines, namely linguistics
and psychology, to computer science. The result is an easy-to-use set
of tools for finding and avoiding ambiguous, incomplete and inconsistent requirements.
This paper will examine the various linguistic phenomena that underlie
errors found in requirements. Furthermore, the paper will show how these
errors can be found and corrected.
To keep exploding costs and development times under control, the root causes
that lie in the analysis phase must be removed before they can do serious
damage. Improvements made during analysis are more effective than those made
in the design phase and incomparably more effective than those made in the
implementation phase. The earlier an error is found and fixed, the less damage it can
do to later development and the better the key development parameters
of cost and time can be controlled. The first phase in the development
of a project is decisive; it usually decides the success or failure of
A semi-detailed specification document with excessively large sections
and vaguely formulated requests is submitted in natural language (prose)
to the client’s analyst. The analyst may then transform this specification
document into a semi-formal representation such as an SA, SADT or OO model.
On the basis of this representation, the analyst researches further and
adds important points that he determines through dialog with the client
or user(s). Because the analyst’s model now shapes all new requirements,
the original specification document loses all relevance.
The result of this quite everyday scenario runs counter to the purpose
of the original representation, because the analysis model should be understandable
for all participants in the process and not merely from the analyst’s point
of view. A clear representation enables the user to follow what the analyst
models. The user can immediately "change course" if he notices that the
model has gaps.
Perhaps some points are missing or are not adequately addressed. Ultimately,
(a) the user is more certain that he will receive what he wants and (b)
in case of a dispute regarding system functionality (keyword: acceptance),
he is not left in a legally precarious position.
For the reasons given above, natural language requirements (prose
requirements) must now be given primary status, the opposite of the
previous convention. This does not mean that the current (semi-)formal
representations of requirements are incorrect or useless; object engineering
requires the translation of prose requests into an integration model and
a simulation model. Rather, the requirements shall be represented in natural language,
and the subsequent work with the natural language requirements is an integral
component of the analysis process.
These two often-diverging points of view must be integrated in the development
process. On the one hand, the requirements must be transposed into an executable
program, so the model of the future system must include the totality
of all requirements and its representation must be formal. The
analyst and designer pursue this goal as if they were the system
implementers, and they must therefore keep this "prime directive" constantly
in mind. The user, on the other hand, is much more interested in a delivered system
that corresponds to his ideas. If the program does not correspond to the
customer’s expectations and mental "grand plan", complaints will follow.
The criteria for the model, which includes the totality of the requirements,
are essentially derived from these two points of view.
To further this goal, the requirement sentences should be --
Fig. CRUPP.1: Reality, personal knowledge, expression of the knowledge
To this end, a set of rules for writing and checking prose requirements
is very helpful in removing or minimizing the deficiencies
and ambiguities of natural language. The goal of an analysis process based
on prose requirements must be to meet the named formal criteria of legal
obligation and intelligibility for all participants. Fortunately, such a set
of rules exists and the rules can be followed. They should therefore be
put into practice.
The father of the theory of a systematic construction of language is
Noam Chomsky, founder of generative transformational linguistics. The results
of his theory make it possible to build a definite sentence by applying
grammar rules to any language component, be it spoken or written. Chomsky’s
theory has undergone certain expansions and alterations since its first
publication because the theory didn’t adequately explain certain aspects
The method of analysis of prose requirements is essentially the application
of the improved theory of transformational grammar. Therefore, one must,
by means of generative rules, subject sentences to transformations. Ultimately
the product is a grammatically correct and exactly defined sentence.
More recently, these linguistic foundations have been applied to
a model of human communication and expression. This has produced
a set of psychological rules, or transformations, which are applied to spoken
or written sentences in order to find their exact meaning.
Correctly applied, they reveal the underlying
meaning of the communication and point to avenues of further questioning
and investigation. The original set of rules is outlined
in broad strokes in the book "The Structure of Magic I" by R. Bandler
and J. Grinder, the creators of therapy-based Neuro Linguistic Programming
The firm SOPHIST Ltd. has similarly applied models of language to
areas of computer science. The result has been the creation and testing
of simple, easy-to-use rules for the creation and quality assurance of
prose requirements in project development.
These flexible and adaptable rules provide a systematic
way to review requirements formulated
in natural language. This is particularly true with regard
to eliminating the ambiguous, incomplete and inconsistent
statements that are often found in requirements documents. Finally, this
method can and should be used when writing requirements documents
so that errors or ambiguities never even "see the light of day."
Fig. CRUPP.2: Linguistic defects
A few of these defects and their case by case removal will be clarified
on the basis of suitable examples.
The following examples exhibit various types of deletions and give an
insight into the spectrum of possible representations from deletion transformations.
Presuppositions, or implicit assumptions, must be made explicit in requirements
documents for the requirements to be meaningfully complete. Presuppositions frequently
originate from an omission by the author of a requirement, because the
presuppositions are either so obvious to the author that he or she doesn’t
consider them noteworthy or the author is not even aware that there is an
The corrections should
be easily modifiable.
The question remains: what exactly does "easily" mean?
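A first pass over such wording can be partly mechanized. The following is a minimal sketch under stated assumptions, not a description of SOPHIST's tooling: the word list VAGUE_QUALIFIERS and the function find_vague_qualifiers are hypothetical names introduced here for illustration, and a real project would maintain its own list.

```python
import re

# Hypothetical word list for illustration; it is not taken from the paper.
VAGUE_QUALIFIERS = {"easily", "easy", "quickly", "fast", "simple", "simply",
                    "flexible", "user-friendly", "efficient", "efficiently"}

def find_vague_qualifiers(requirement: str) -> list[str]:
    """Return the vague qualifiers in a requirement, in order of appearance."""
    # Split into lowercase words, keeping hyphenated words together.
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", requirement.lower())
    return [w for w in words if w in VAGUE_QUALIFIERS]

hits = find_vague_qualifiers("The corrections should be easily modifiable.")
for word in hits:
    # Each hit becomes a question to put to the author of the requirement.
    print(f'What exactly does "{word}" mean, and how is it measured?')
```

A scan like this only finds candidate words; whether a flagged word is really underspecified still has to be decided by the reviewer together with the author.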
A comparative or superlative always requires a reference point to be
completely defined. Furthermore, the unit of measure (e.g. meter, second)
and the tolerance (e.g. +/- 0.1 meter, +/- 0.0001 seconds) must be declared.
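One way to make this rule checkable is to record each comparative as a structured bound and to reject incomplete ones. The sketch below is only an illustration: the class MeasuredBound, its field names, the example values and the choice to treat the reference point as an upper limit are all assumptions, not part of the paper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MeasuredBound:
    """A comparative such as 'faster than X' (hypothetical structure)."""
    reference: Optional[float] = None   # reference point of the comparison
    unit: Optional[str] = None          # unit of measure, e.g. "second"
    tolerance: Optional[float] = None   # permitted deviation, in the same unit

    def missing_parts(self) -> list[str]:
        """List the parts of the bound that are still undeclared."""
        parts = {"reference point": self.reference,
                 "unit of measure": self.unit,
                 "tolerance": self.tolerance}
        return [name for name, value in parts.items() if value is None]

    def satisfied_by(self, measured: float) -> bool:
        """Check a measured value, treating the reference as an upper limit."""
        if self.missing_parts():
            raise ValueError("incomplete bound: " + ", ".join(self.missing_parts()))
        return measured <= self.reference + self.tolerance

vague = MeasuredBound(reference=2.0)    # "faster than 2", but 2 of what?
print(vague.missing_parts())            # ['unit of measure', 'tolerance']
complete = MeasuredBound(reference=2.0, unit="second", tolerance=0.1)
print(complete.satisfied_by(2.05))      # True
```

An incomplete bound cannot be evaluated at all, which mirrors the point that a comparative without reference point, unit and tolerance is not completely defined.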
Furthermore, the process word "monitored" is referential. The statement
is defined completely only if the following questions are clarified: Who
monitors? What is monitored? How or in which manner is it monitored?
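The three questions can be generated from the process word and tracked until each has an answer. This is a minimal sketch; the ProcessWord class, the open_questions function and the sample answer "the temperature" are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProcessWord:
    """The two verb forms needed to phrase the questions (hypothetical)."""
    third_person: str   # e.g. "monitors"
    participle: str     # e.g. "monitored"

def open_questions(word: ProcessWord, answers: dict[str, str]) -> list[str]:
    """Return the who/what/how questions that have no answer yet."""
    questions = {
        "who": f"Who {word.third_person}?",
        "what": f"What is {word.participle}?",
        "how": f"How or in which manner is it {word.participle}?",
    }
    return [text for key, text in questions.items() if not answers.get(key)]

monitored = ProcessWord(third_person="monitors", participle="monitored")
# Assumed state of the review: only the "what" has been clarified so far.
for question in open_questions(monitored, {"what": "the temperature"}):
    print(question)
```

The requirement counts as completely defined only when the returned list is empty.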
Through the process of generalization, requirements are often stated in a way that
seems to apply to a large part of a system or to all of it. For other parts of
the system, these requirements can be quite wrong; a correctly
defined requirement would apply only to a smaller piece of the system
and so have an accurate scope and correct meaning.
Typical of the process of generalization is the suppression or omission
of special cases and error cases. In the following, the most
frequently seen variations of generalization are reviewed.
It is critical with this type of generalization to immediately define
the range of applicability so that no possibilities and occurrences of
applicability are left out. Special cases must also be defined in the generalization
Which message? Which/whose working position?
The process behind the noun "data loss" actually consists of:
Data is being lost.
So this sentence has the related questions: Which data is being lost?
How is the data being lost? How can the loss of data be recognized?
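The step from a nominalization back to its process, and from there to the open questions, can be sketched as a simple lookup. The table NOMINALIZATIONS, the function questions_for_nominalizations and the sample requirement sentence are assumptions made for illustration; the paper does not prescribe such a table.

```python
# Hypothetical lookup table: nominalization -> (subject, participle of the process).
NOMINALIZATIONS = {
    "data loss": ("data", "lost"),
}

def questions_for_nominalizations(requirement: str) -> list[str]:
    """Return the questions raised by each known nominalization in the text."""
    text = requirement.lower()
    questions = []
    for noun, (subject, participle) in NOMINALIZATIONS.items():
        if noun in text:
            questions += [
                f"Which {subject} is being {participle}?",
                f"How is the {subject} being {participle}?",
                f"How can the {noun} be recognized?",
            ]
    return questions

# Made-up requirement sentence, used only to exercise the function.
sample = "In case of data loss, an error message shall be displayed."
for question in questions_for_nominalizations(sample):
    print(question)
```

A table of this kind has to be built per project, since which nouns hide a process depends on the domain vocabulary.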
When picking methods from the object engineering methods kit, it is hardly
possible to omit the linguistic methods component, in contrast to the
integration and simulation model components, which can be left out. Object
engineering forms the basis for further work in the software development
process with prose requirements, which are the essential foundation for
continued development, as previously demonstrated.
Hence, with some concessions, the scientifically
necessary formal model, which is formed by the entirety of all requirements,
can be represented informally with the assistance of linguistic methods.
This opens the door for the client and user to review and criticize, for the first
time, the working description of the problem directly.
SOPHIST Gesellschaft für innovative Software-Entwicklung mbH
Vordere Cramergasse 11-13
90478 Nürnberg, Germany
Tel: +49 (0)911/ 40 900-0
Fax: +49 (0)911/ 49 900-99