Improving the System Test Process
Jörn Münzel Bosch Telecom, Frankfurt, Germany
The paper shows the results of an ESSI-funded experiment which has evaluated the use of a standardised test programming notation to increase test efficiency at Bosch Telecom. The goal of the experiment was to decrease system test time and to decrease manual effort involved in maintaining regression test cases through automatic testing.
The experiment includes changes to the test process based on the integration of a test case design and programming method together with the installation of new tools. Besides the experiment itself, the paper describes the results and the experience gained with the approach regarding costs, effort and organisational aspects.
The main experiences of the project show that the increase of automatic testing is possible, but it requires a close binding to product development and new skills of testers. Efficiency in the use of test programming, which was not reached in the experiment, needs a high degree of test case reuse. Further activities to improve the approach based on an easily programmable and easily maintainable test design are also outlined.
Looking for ways to improve the existing test process, Bosch Telecom decided to evaluate the use of the standardised notation TTCN (Tree and Tabular Combined Notation - ISO/IEC IS 9646-3), together with currently available test environment tools. This evaluation is embedded in the ESSI-funded Process Improvement Experiment ‘RESTATE’ (REuse of System Test cases through Applying a TTCN Environment - PIE-No. 23978).
The main goals of the experiment are test effort reduction and a shortening of the time spent on performing system test.
The paper presents an outline of the experiment (old vs. new test process) including the expected goals in detail and the motivation for introducing TTCN.
The measured results and the assessments made are the base for further activities to disseminate the technology in Bosch Telecom together with further improvements in test technology.
The paper also presents experience about the Bosch Telecom approach to technology transfer and process improvement. Bosch Telecom has established a centralised ‘software best practice’ competence team (called Software Technology Department) which has the task of improving software development by collaboration. Members of this team together with testers from the development department have carried out the experiment project.
Outline
The paper is structured into four chapters followed by a short conclusion.
The first chapter ‘Process Improvement Experiment’ describes the context of the evaluation experiment including the goals, technical aspects and organisational remarks.
The second chapter ‘Improvement Activities’ then shows the changes applied to process, test technology and the organisation.
The third chapter ‘Experience and Assessment’ expresses the results of the experiment with reference to costs, effort and development culture.
The fourth chapter ‘Future Activities’ summarises the current dissemination activities and the additional activities to improve the TTCN test technology.
The goal of the RESTATE experiment was to evaluate the use of a standardised formal notation for specifying automatically executable system test cases. The supporting tool environment is used to execute test cases in different environments (target and development environment) so that test cases can be reused. The test objects are private communication switches running on special hardware under real-time conditions.
For further information see Appendix II or connect to URL: ‘http://www.bosch.de’.
The products concerned are mostly developed in-house, with an increasingly heavy emphasis on software, sometimes in excess of 80 %.
Market openness, technological variety and rising customer expectations are forcing vendors in the field of telecommunications to come to market faster with high quality, better tested products.
To carry out evaluation and integration, the department works very closely with the product development departments. New technologies are normally evaluated in a prototype project in a realistic context, with the participation of the department that wants to be the first to use the new technology.
To transfer technology, the members of the department work as consultants, teachers and coaches. The basic strategy is to work as participant coaches. This assistance normally continues over a longer term and may last for the complete duration of a project. The main advantage is that the people being coached are in close contact with the coach and the ideas of the new technology for a longer period. Additionally, the ‘expert’ and the new technology have to succeed in ‘reality’, which increases the qualification of the technology and the experience of the coach.
The software development process is well structured in different phases, derived from the German V-model, with several fixed product quality evaluation points.
Test phases cover several levels of quality assurance, e.g. unit test, software integration test, system integration test including such activities as feature testing (also known as system testing), load testing and field testing.
The system test consists of testing each feature as a single capability together with testing the correct correlation of related features. Tests are specified as textual descriptions of behavioral scenarios involving several users, e.g. ‘participant A takes handset off hook and dials 4711’. Test execution is done by using real terminals to stimulate the test switch as described and by checking the system reaction.
The first step in test automation is done by capturing the test execution at the signalling interface and by replaying these test runs automatically. During the replay, the signals (messages) of the system under test are compared with the recorded ones. This facility is used for regression tests of new product releases at the system test level.
The main problem with captured test sequences is their lack of robustness in the face of small signaling changes and optional concurrent behavior of asynchronous working links. This sometimes forces new capturing phases after small changes.
The current test process is structured into six phases/activities which are listed below to show what effort was measured:
The two main goals focused on were:
The following paragraphs describe in more detail the relation between the goals and the realised experiment.
When analysing the existing test process, we found that current test automation via capture/replay has two main disadvantages. Firstly, a captured signalling sequence does not cover all possible correct dynamic behaviour of the system under test. Testing the functionality of the switch does not necessarily require the same sequential behaviour at the different signalling interfaces each time. Secondly, each captured test contains a complete sequence including all the signals/messages used, which forces the storage of a lot of redundant data. Changes to message data have to be edited or newly recorded for each test case because no building-block mechanism using data references is available.
To improve the process it was decided that a ‘test programming’ technique is needed to increase the flexibility of test sequences and to decrease redundant data.
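The two disadvantages of capture/replay can be illustrated with a minimal Python sketch. It is not TTCN and not the tooling used in the experiment; the message names, the link prefixes `A:` and `B:`, and the helper functions are hypothetical simplifications. The strict comparison rejects a correct run in which messages on two asynchronous links arrive in a different order, while the programmed check accepts any order within a step and takes the dialled number from one shared table.

```python
from typing import List, Set

# Hypothetical, simplified message names; "A:" and "B:" mark the signalling link.
NUMBERS = {"B": "4711"}  # shared data reference instead of a number stored in every test


def setup_to(user: str) -> str:
    """Build the expected SETUP message from the shared number table."""
    return f"{user}:SETUP({NUMBERS[user]})"


def replay_matches(recorded: List[str], observed: List[str]) -> bool:
    """Capture/replay style check: the observed trace must equal the recording."""
    return recorded == observed


def programmed_matches(expected: List[Set[str]], observed: List[str]) -> bool:
    """Programmed check: the messages of one step may arrive in any order."""
    position = 0
    for step in expected:
        window = observed[position:position + len(step)]
        if len(window) != len(step) or set(window) != step:
            return False
        position += len(step)
    return position == len(observed)


recorded = ["A:CALL_PROCEEDING", setup_to("B"), "A:ALERTING"]
observed = [setup_to("B"), "A:CALL_PROCEEDING", "A:ALERTING"]  # other link answered first
expected = [{"A:CALL_PROCEEDING", setup_to("B")}, {"A:ALERTING"}]

print(replay_matches(recorded, observed))      # False
print(programmed_matches(expected, observed))  # True
```

Changing the number of user B then means editing `NUMBERS` once rather than re-recording every test case that dials it.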
Existing test programming techniques in the area of communication testing are home-made, tied to a tool supplier, or based on TTCN. In choosing between them, the arguments of in-house development and maintenance cost, efficiency and future portability were considered carefully. It was decided to use TTCN and to buy existing tools because of the long-term cost aspects and the better portability outlook. Home-made or supplier-specific solutions seem to be more efficient in the short term, but not in the long term.
The Tree and Tabular Combined Notation (TTCN) is standardised by ISO as part 3 of the ISO/IEC 9646 IS (Conformance Testing Methodology and Framework) [1] and includes a formal notation for specifying test cases as sequence trees of message interactions [2],[3],[4].
The main feature of the standard is the use of PCOs (points of control and observation) to define abstract test suites (ATS). To stimulate and check a system under test, the tester has to define one or more test points (PCOs) and has to specify the stimulation and checking of messages as abstract commands.
To specify asynchronous or optional behaviour, TTCN offers features to define alternative receives, default behaviour and concurrency. To check time dependencies, there are commands to start and stop timers together with commands reacting on time-outs. To assess the resulting behaviour, each path of a message sequence has to be assigned a test verdict.
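The interplay of alternative receives, a default, a timer and verdict assignment can be sketched in Python as follows. This is an illustration of the concepts only, not TTCN syntax; the function name, the event representation and the message names are assumptions made for the example. Each acceptable message carries its own verdict, any other message falls through to a default verdict, and an expired timer yields the timeout verdict.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str]  # (arrival time in seconds, message name), sorted by time


def await_alternatives(
    events: List[Event],
    started_at: float,
    timeout: float,
    alternatives: Dict[str, str],
    timeout_verdict: str = "fail",
    default_verdict: str = "fail",
) -> str:
    """Return the verdict for the first message received while the timer runs."""
    deadline = started_at + timeout
    for arrival, message in events:
        if arrival < started_at:
            continue  # received before the timer was started
        if arrival > deadline:
            break  # timer expired before anything arrived
        # First message inside the window decides: a listed alternative or the default.
        return alternatives.get(message, default_verdict)
    return timeout_verdict


# Hypothetical alternatives after a call has been set up.
ALTERNATIVES = {"ALERTING": "pass", "CONNECT": "pass", "DISCONNECT": "inconc"}

print(await_alternatives([(0.4, "CONNECT")], 0.0, 2.0, ALTERNATIVES))   # pass
print(await_alternatives([(0.4, "RELEASE")], 0.0, 2.0, ALTERNATIVES))   # fail
print(await_alternatives([(3.0, "ALERTING")], 0.0, 2.0, ALTERNATIVES))  # fail
```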
The experiment was structured into three work packages which were carried out sequentially [5].
Work package one was entitled ‘Installation, Education and Preparation’. It comprised the evaluation of the available TTCN tools, including installation of the chosen one. The participants were trained in the use of TTCN and the tools. The main task of the work package was the design of the new test process, based on a new test design and programming method named the Bosch Telecom TTCN modelling technique.
Work package two was called ‘System test at target environment’. It comprised the realisation of two TTCN test suites, including their verification and execution in the target test environment. An overview is shown in the figure below; more information about the test process and the tools is given in the following chapter.
Fig. JMUNZEL.1 : Experiment Overview - System Test
Work package three was called ‘Integration test with reuse’ which evaluates a continuing improvement possibility. It contains the reuse of the realised TTCN test suites to test the switch software already in the development environment.
This part of the experiment is not included in this paper because these activities were not finished at the time of writing.
All activities were measured on the basis of a GQM (Goal/Question/Metrics) measurement plan. This also includes a baseline measurement in which the same tests were realised and executed using the current test process (see Baseline).
The main task of the technology department was the evaluation of the TTCN tools and the conceptual designs. The members of the product department mainly worked on the test contents, the baseline measurements and the test environment adaptations.
Regular project meetings, conceptual discussions, reviews, feed-back sessions and overlapping tasks encouraged a high degree of teamwork.
The idea of this chapter is to give some information about the activities done and technical results achieved in the experiment.
The new TTCN test process is structured into seven phases which are listed below:
In our case these activities resulted in a domain specific architecture [6], where each ‘user’ involved within the feature execution is modelled as a PCO. A user realises the interaction with a kind of terminal including telephone handset, display and keyboard. Each type of user is specified as a finite state machine (FSM) where each transition is designed as a logical building block. Synchronisation between the users and final verdict assignment of the test result are modelled in a central test task.
Because the PCOs used were not accessible to an automatic test environment, each transition had to be transformed into a TTCN test step at signalling interface level, e.g. ISDN protocol messages. To make test steps reusable, they were coded as configurable macros. A problem with this representation is that the relation between user behaviour and signalling interaction is not always clear.
Basically the structuring of independent, but synchronised test points including reusable test steps allowed an efficient specification and coding of test cases.
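The architecture described above can be pictured with a small Python sketch. It is an illustration only, not the Bosch Telecom TTCN modelling technique itself: the states, actions and the `basic_call` scenario are hypothetical. Each user is a finite state machine whose transitions stand for reusable building blocks, and a central test task synchronises the users and assigns the final verdict.

```python
from dataclasses import dataclass

# Hypothetical user state machine: (state, action) -> next state.
TRANSITIONS = {
    ("idle", "off_hook"): "dial_tone",
    ("dial_tone", "dial"): "calling",
    ("calling", "peer_answers"): "connected",
    ("idle", "incoming_call"): "ringing",
    ("ringing", "off_hook"): "connected",
    ("connected", "on_hook"): "idle",
}


@dataclass
class User:
    """One user, i.e. one test point (PCO) in the abstract test architecture."""
    name: str
    state: str = "idle"

    def fire(self, action: str) -> None:
        """Execute one transition; an action not allowed in this state is an error."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"{self.name}: '{action}' not allowed in state '{self.state}'")
        self.state = TRANSITIONS[key]


def basic_call(a: User, b: User) -> str:
    """Central test task: orders the users' steps and assigns the final verdict."""
    steps = [
        (a, "off_hook"),
        (a, "dial"),
        (b, "incoming_call"),
        (b, "off_hook"),
        (a, "peer_answers"),
    ]
    try:
        for user, action in steps:
            user.fire(action)
    except ValueError:
        return "fail"
    return "pass" if a.state == b.state == "connected" else "fail"


print(basic_call(User("A"), User("B")))  # pass
```

In the experiment each such transition additionally had to be mapped to a test step at signalling level, which this sketch leaves out.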
The tool environment as shown in figure JMUNZEL.1 covers three areas of TTCN test support:
Compilation and test case configuration are supported by a TTCN compiler and a PIXIT editor (Protocol Implementation eXtra Information for Testing). The compiler needs to be specialised to the execution environment. The PIXIT editor is used to connect TTCN variables to the configuration data of the system under test, e.g. telephone number, hardware addresses. PIXIT data and executable code have to fit.
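The requirement that PIXIT data and executable code have to fit can be sketched as a simple consistency check. The parameter names and values below are invented for illustration and do not come from the tools used in the experiment; the idea is only that every parameter the compiled test suite expects must be bound to configuration data of the switch under test.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical parameters referenced by a compiled test suite.
REQUIRED_PARAMETERS: Set[str] = {"user_a_number", "user_b_number", "user_a_port"}

# Hypothetical PIXIT values for one switch under test.
PIXIT_SWITCH_1: Dict[str, str] = {
    "user_a_number": "4711",
    "user_b_number": "4712",
    "user_a_port": "1-0-3",
}


def check_pixit(required: Set[str], pixit: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (missing, unused) parameter names, each sorted alphabetically."""
    bound = set(pixit)
    return sorted(required - bound), sorted(bound - required)


print(check_pixit(REQUIRED_PARAMETERS, PIXIT_SWITCH_1))  # ([], [])
```

Running the same test suite against a second switch type would then mean supplying a second PIXIT table rather than changing the test cases.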
Test execution and result analysis are supported by a test campaign manager, a PCO platform, a tracer and a TTCN animator. The test campaign manager supports the selection and execution of single or connected test cases. The PCO platform is needed to support each used PCO with the underlying services. It realises the physical and logical access between the system under test and the test point. The tracer stores the execution data exchanged at all interfaces including time stamps and offers these data for further analysis. The TTCN animator is a tool which supports the trace analysis via showing the used path high-lighted in the TTCN code.
As mentioned before, test case programming requires software development skills in addition to domain know-how. That means TTCN testers need training in the use of TTCN, the tool environment and the Bosch Telecom TTCN modelling technique. Because of this amount of special knowledge, it was decided to build special test teams.
Project management is also involved in the changes because test case programming (at least initially) increases the effort for test specification and coding. These activities should be done in parallel to the product development to get executable test cases when the system test phase begins. Project scheduling and resource management have to be improved to integrate these changes.
The information is structured into three sub-chapters to separate the aspects of costs, improvement / changes and additional factors.
This chapter should be a must for all readers because it contains the results of the RESTATE experiment.
One licence for the TTCN editor costs approx. 5,000.- US $; a licence for the TTCN Module costs about 27,000.- US $. Assuming that each tester involved needs a TTCN editor licence and that 3 to 5 testers share a TTCN Module environment, each tester’s workplace costs approx. 12,000.- US $. In our experience, this investment is similar to the cost of a software developer’s workplace (CASE tool, programming environment). Costs may decrease through negotiation and with the number of licences bought. Additional effort/cost is necessary for tool installation and permanent support.
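The workplace figure can be reproduced with a short calculation, assuming (as the text suggests) that the cost per tester is one editor licence plus an equal share of one TTCN Module licence. The function name is chosen for this example only.

```python
EDITOR_LICENCE_USD = 5_000
MODULE_LICENCE_USD = 27_000


def cost_per_tester(testers_sharing_module: int) -> float:
    """Editor licence plus an equal share of one TTCN Module licence."""
    return EDITOR_LICENCE_USD + MODULE_LICENCE_USD / testers_sharing_module


for testers in (3, 4, 5):
    print(testers, cost_per_tester(testers))
# 3 14000.0
# 4 11750.0
# 5 10400.0
```

With three to five testers per module environment the cost lies between 10,400 and 14,000 US $, which is consistent with the approximate figure of 12,000 US $.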
The training costs are distinguished between TTCN training, tool training and test domain training.
A standard TTCN training course (3 days) in our case cost about 1,200.- US $ for each participant. The training included ‘hot-line’ support over three months.
The tool training consisted of a three-day course and cost about 2,400.- US $ per participant (3 participants). Tool support during the experiment was part of the licence fee and incurred no additional cost.
The test domain training depends on the knowledge of the tester about the features under test, the system under test and the interface specifications of the test points (e.g. ISDN protocol). In our case the cost ranged between no cost and 6 weeks effort for reading and coaching.
We assume training costs are also similar to software development, maybe less if the tester already has knowledge of software development in general.
The initialisation cost contains the effort we invested to develop our test architecture and our test design technique including TTCN coding rules and basic test steps. In the RESTATE project this effort was about 8 man months.
We assume that these costs have high dependency on the complexity of the test area (number of parallel test points, complexity of the interface protocols), the quality of an existing test architecture and the quality of the test case specification documents. In our case we had two complex interface protocols, up to four test points and no usable test architecture.
In detail we measured the effort needed at each phase of both processes to get information about the entire effort and the effort distribution across the different activities. One test suite, called Basic Call, tests the feature of connecting two users under several conditions. The other test suite, called Advice of Charge (AOC), tests the feature of displaying and storing charging information.
Because we expected an effort increase for the first time programming a test suite rather than simply recording test cases, we also tried to measure the maintenance effort of both the processes. This was measured via testing two different switch types.
The following figures show the measured data of the ‘Baseline’ and ‘Experiment’ projects.
Fig. JMUNZEL.2 : Baseline Measurement - Test effort distribution
The figure above shows the results of the reference measurement (baseline) based on the current test process. Average effort per test case is 2.5 ph (person hour) for Basic Call and 5.6 ph for AOC.
Fig. JMUNZEL.3 : Experiment Measurement - Test effort distribution
Looking at the experiment team, we measured the distribution shown above for the initial coding and verification of the two test suites. Average effort per test case (TC) is 7.2 ph for Basic Call and 20.1 ph for AOC. The effort for the ‘Design Test Specification’ (Design Spec.) phase is zero because we reused the results of the baseline team.
To illustrate the effects of using programming techniques and the reuse of building blocks in TTCN programming the following two figures show the evolution of effort per test case over the experiment period.
Fig. JMUNZEL.4 : Experiment Measurement - Effort per test case Basic Call
Fig. JMUNZEL.5 : Experiment Measurement - Effort per test case AOC
As shown above, the effort to program a test case decreases after an initial period to a range comparable to the baseline values.
The following two figures show the results of measuring maintenance effort. We called this measurement the ‘Robustness’ factor because our requirement is that regression test cases should not need maintenance unless changes are required.
Fig. JMUNZEL.6 : Baseline Measurement - Maintenance effort distribution
The results of the baseline measurement show the activity distribution and an average effort of 1.5 ph for Basic Call and of 1.6 ph for AOC.
Fig. JMUNZEL.7 : Experiment Measurement - Maintenance effort distribution
The experiment test suite for Basic Call was already robust (no maintenance effort), while the AOC test suite required an average effort of 1.2 ph for each test case.
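The averages reported above can be combined into a rough break-even estimate. The linear model below is an assumption made for illustration, not a calculation from the experiment: it takes the average initial and maintenance effort per test case as constant, treats the robust Basic Call suite as 0 ph maintenance, and ignores licence, training and initialisation costs.

```python
import math
from typing import Optional


def break_even_cycles(
    base_initial: float, base_maint: float, ttcn_initial: float, ttcn_maint: float
) -> Optional[int]:
    """Smallest number of maintenance cycles after which TTCN has cost no more in total.

    Returns None if TTCN saves nothing per maintenance cycle.
    """
    extra_initial = ttcn_initial - base_initial
    saving_per_cycle = base_maint - ttcn_maint
    if saving_per_cycle <= 0:
        return None
    return max(0, math.ceil(extra_initial / saving_per_cycle))


# Effort in person hours per test case, taken from the measurements above.
print(break_even_cycles(2.5, 1.5, 7.2, 0.0))   # Basic Call: 4
print(break_even_cycles(5.6, 1.6, 20.1, 1.2))  # AOC: 37
```

Under these assumptions Basic Call breaks even after about four maintenance cycles, whereas AOC needs several dozen. The initial TTCN averages include the learning period, so later test cases would break even sooner.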
Assessing the current test process measurements, we found that the main effort occurred during the Test Specification Design phase, and that it was higher than expected. The current test process captures the test data at signalling level, where a lot of communication details are included. Most of the time is needed to abstract from the detailed message contents (signalling level) in order to make statements about the behaviour of the tested features (user level).
Another interpretation concerns the direct relation between test effort and the complexity of test cases. The Basic Call test cases have fewer test actions to be specified, recorded and verified than the AOC test cases.
Assessing the TTCN test process measurements we came to appreciate the high amount of effort involved in developing / programming test cases. Until a library of basic test steps exists the effort is much higher. Although the experiment already used the Test Specification Design phase information of the baseline team, the designing and coding of TTCN took a longer time.
TTCN was assessed positively in the Test Environment Preparation and Result Analysis phases, where less effort was needed because of a higher degree of automation.
Interpreting the maintenance effort for both processes, there are no striking differences between the current process and TTCN. This may result from our approach of testing two different switches. For Basic Call we registered zero effort with TTCN and some effort with the baseline team, which is the distribution we expected. For AOC we registered a higher effort with TTCN because the programmed configuration for one switch differs from that of the second switch. This resulted in a small redesign of the TTCN test suite. Another factor we did not measure was the depth of comparison we used. In RESTATE we checked the received messages only superficially, and this was easy to adapt in the baseline approach. Detailed test checks should be easier to maintain in the TTCN programming environment than on the protocol simulator.
Summarising the effort data of the experiment, we concluded that the expected advantages of using TTCN were not reached in the experiment. In the short term, the reductions in test time and manual test effort cannot be demonstrated when set against the costs and the effort.
In the long term there are aspects which offer advantages and which we will use as a basis for our ongoing work (see chapter Future Activities). One aspect is the effort reduction of TTCN programming for complex test cases, based on a stable test step library and a high degree of regression testing. Another advantage may be a higher degree of automation in regression testing, which can then run 24 hours a day, seven days a week. A further aspect where we expect an advantage is the possibility of programming test cases in parallel with product development, which forces an early inspection of the interface specification. We expect that this would lead not only to test cases being available earlier but also to increased quality of the realised product.
When using a test programming technique, the design and interpretation rules have to be formalised to be effective. The coding of stimulating messages and the verification of received messages have to be based on formal specifications. In our experiment we had the problem of a mismatch between the level of testing (user behaviour) and the level of test access (signalling interface). Because a formal representation of user behaviour and the associated signalling behaviour did not exist, we had to build one. Such a representation is required for programming test cases but was not readily accepted by the developers. From their point of view this kind of specification may restrict the flexibility of a layered development where the realisation is hidden behind the service interface (access point). During the experiment we could not solve this problem.
Another experience we had during the experiment is the similarity between the management of test case programming and of software development. In the current test process the testers do a lot of work independently of each other. When designing and coding TTCN, the programmers have to work as a team in order to share common information, e.g. message declarations, constraints, test steps. To manage this work, techniques such as team management, quality assurance and configuration management (access rights, release management, etc.) are necessary.
Introducing a technology like TTCN requires not only the tailoring of tools and naming conventions but also the development of a vision of a new process. We spent a large amount of effort discussing and understanding the ideas of TTCN-based testing, the application domain context and the goals of the system test before designing a concept of how the TTCN technology and tools might fit the problem area. We succeeded in bringing together external consultants, experienced testers and members of the technology department at an early stage, so that we got the information needed and, in time, a common vision of the solution. Generalising this experience, we recognised the necessity of adapting a ‘new’ technology to the problem domain. In particular, integrating people experienced in the application domain and in the technology domain over a longer period supports a well designed and accepted solution.
The main changes in the test process were brought about by the increase of documentation depending on a higher degree of necessary formalism. Both sides, testers and developers, had reservations about writing and using documented specifications. On the other hand we got more acceptance during the project when we used our specification for problem localisation. The main problem of detailed documentation is often that filters are missing which could help in extracting currently needed information. Missing flexibility and the effort of maintaining detailed documentation often were used as arguments.
Generalising this experience, we will have to think about techniques for obtaining hierarchically structured specifications that filter adequate levels of information and keep the data traceable and consistent.
Our experience of cultural and organisational changes is limited, because we did not work with a larger team over a longer time. In conclusion, we had no problem in discussing the problems and in designing a new test approach, but there is still uncertainty as to how the results of the experiment should be interpreted. The problems of integrating a test programming technique are concentrated in two areas. The first is the need for a new kind of specification (representation ‘user behaviour’ - ‘signalling behaviour’) which has to be written by development staff. The second is the need to teach the testers how to program test cases. Finally, project scheduling should also be changed to integrate a specialised TTCN test team early on in the development.
Because of the gravity of changes involved our doubts seem to be justified and we have to think about a stepwise integration.
TTCN requires investment (tools, training, initialisation) and increases effort for the first test case realisation. Maintenance effort is less than today but to reach a return on investment we need a high number of maintenance cases.
On the other hand, we learned a lot from evaluating TTCN and measuring our current test process. The main message for further activities is based on the experience that the developed test architecture, including the use of PCOs, is a strong concept for improving test specification and test verification. The TTCN technology with the existing tool environment is also valuable for test automation, but the programming effort has to be decreased.
The following sub-chapters give an overview of some activities already started or planned.
Public dissemination of results and experience is done via ESSI reports and conference presentations [5], [6], as this paper shows. In-house dissemination at Bosch is done via reports, presentations, intranet pages and several workshops.
Distribution is mainly supported by the ‘software best practice’ competence team, which is involved in consulting and coaching activities. Some other development departments have already started to include the TTCN technology in their process, using the experience of the experiment and the know-how of the technology department.
We have started a co-operation with the Institute for Telematics at the University of Lübeck, Germany, to analyse the problem of automatic generation of TTCN test cases out of formalised Message Sequence Charts (MSC). MSC is a standardised notation to specify dynamic interaction on a higher level of abstraction. This should decrease the effort for TTCN coding.
Another area is to increase the use of formal specification techniques such as SDL and ASN.1 for interface specification. These could be used for increased tool-supported code generation in development and in testing.
A reduction of development cost per test case should be reached by increasing the reuse of test cases. This is the goal of the second part of the RESTATE experiment where the already existing TTCN test suites are executed during the software integration phase in the development environment (see Experiment Facts: Work package 3).
To improve the test process we have started a further analysis of the currently used test scenario (protocol PCO). Our target is to derive PCOs which are easier to handle from the viewpoint of feature tests. Preconditions are the necessary access at this interface and a less complex interaction model. Additionally, we believe that a stable syntactical and semantic interface is a basic requirement for the efficient use of a test programming technique. This will considerably increase the use of building blocks and reduce maintenance effort.
Efficiency of programmed test cases is reached through a high degree of regression to reduce maintenance effort. Efficiency of programming is reached through the use of stable interfaces as test access points (PCOs) to reduce coding effort. Interlocking the test process with the development process (e.g. requirements, architectural design) will support product quality. An earlier review of interface specifications from a tester’s point of view will increase transparency and completeness.
On the other hand, test case automation requires investment in tools, new skills, test process and initialisation effort.
We have not yet reached all our goals, but we are on a promising course for the future.
ASN.1 Abstract Syntax Notation 1 (Standard for specifying abstract data types)
ATS Abstract Test Suite (TTCN test cases, independent of special execution information)
ESSI European Systems and Software Initiative (Program of the European Commission)
GQM Goal/Question/Metrics paradigm (basic metric concept, originated by Victor Basili, University of Maryland / USA, and the Software Engineering Laboratory)
IEC International Electrotechnical Commission
ISDN Integrated Services Digital Network (Standard for Telephony)
ISO International Organization for Standardization
ISO 9646 Open System Interconnection - Conformance testing methodology and framework (Standard including the specification of TTCN)
LAN Local Area Network
MSC Message Sequence Chart (Standard for dynamic behaviour flows)
PCO Point of Control and Observation (basic concept of TTCN methodology)
PIE Process Improvement Experiment (ESSI work program task type)
PIXIT Protocol Implementation eXtra Information for Testing (Variables used in TTCN which are assigned to configuration data at run time)
RESTATE REuse of System Test cases through Applying a TTCN Environment (Title of the ESSI PIE project of Bosch Telecom)
SDL Specification and Description Language (Standard for specification of finite state machines)
TTCN Tree and Tabular Combined Notation (Test programming notation, specified at ISO 9646 - part 3)
[2] Kron J., Wiles A., A Tutorial on TTCN, Tutorial at the 11th International Symposium on Protocol Specification and Verification, 1991
[3] Baumgarten B., Giessler A., OSI Conformance Testing Methodology and TTCN, Elsevier Sciences B.V., Netherlands, 1994
[4] Ek A., Grabowski J., Hogrefe D., Jerome R., Koch B., Schmitt M., Towards the industrial use of validation techniques and automatic test generation methods for SDL specifications, in: Proceedings of the 8th SDL Forum ed. Cavalli and Sarma, Elsevier Sciences B.V., Netherlands, 1997
[5] Münzel J., Better testing for private communication networks, in: Proceedings of the 5th European Conference on Software, Testing, Analysis and Review, Edinburgh, (CD-ROM), 1997
[6] Anlauf M., Programming service tests with TTCN, in: Proceedings of the IFIP TC6 11th International Workshop on Testing of Communication Systems ed. Petrenko and Yevtushenko, pp. 263-278, Kluwer Academic Publishers, Boston, USA, 1998
Mr. Münzel is active in the areas of software test and software quality management.
Jörn Münzel received a diploma in Computer Science from the Technical University of Darmstadt (Germany) in 1986.
Prior to joining Bosch, he was a software engineer at a German software house, where he took part in and led several projects in the area of telecommunication and conformance testing.
From 1993 to 1995 he worked at Robert Bosch Research and Development as a software engineer in the areas of test and quality management of object-oriented software development.
For several years Mr. Münzel has been an active member of the German Special Interest Group on Test, Analysis and Verification of Software which is part of the German ‘Gesellschaft für Informatik e.V.’ (German Computer Science Society).
In 1997 Bosch Telecom achieved sales of approx. 5.3 billion DM which represents 11% of total Bosch Group sales. 560 million DM were invested in Research and Development.
The Bosch Communication Technology Business Sector is concentrated on communications technology for public and private networks, and mobile telephones, as well as on security and traffic control systems.
Six Product Groups are responsible for this business and are present in the market with a wide product mix. This ranges from systems for radio-relay, multiplex engineering, and network management, through telecommunications, fire and emergency-alarm systems, video-supervision installations, up to cellular phones, traffic-management systems and equipment for satellite engineering. The business is characterised by its strong service orientation.