Data Driven Improvement for SMEs
Tor Stålhane, Ph.D., Kari Juul Wedde, MSc, Tore Dybå, MSc. SINTEF, Trondheim, Norway
In order to perform data analyses, it is necessary to combine collected data with experiences that are available in the organization. This is important in order to decide:
It is outside the scope of this paper to report on our experiences with extending the knowledge created inside a team to other teams and to the organization as a whole. Besides the need for building up an experience base, an additional, but often ignored, role of data collection is to dispel myths that act as roadblocks for the improvement work. Such myths are often unconscious acts of self-defense in the company. As long as the myths are not challenged, they block all thoughts of improvement. Thus, only by dispelling the myths can we move on to changes and true process improvement.
The rest of this paper is organized as follows: First we discuss learning in organizations in general and how this relates to the special problems facing SMEs. Then we go into more detail and focus on how SINTEF has solved the identified problems by using two approaches, namely GQM and risk-based improvement. Finally, we sum up our experiences and offer a set of conclusions.
Fiol and Lyles [7] suggest that organizational learning is "the process of improving actions through better knowledge and understanding." We agree with this viewpoint, and hold that the role of organizational learning, within the context of software process improvement, is to provide a framework for improved actions. However, in order to understand how such a framework could be used, there are two basic dimensions of software process improvement that must be understood. One has to do with the type of situation; the other has to do with the type of learning. For software organizations in general, it is important to be alert to the fact that some combinations of these dimensions facilitate improvements, while other combinations inhibit improvements. This is summarized in Table 1.
Table 1. The two dimensions of software process improvement.
The situation in software development processes is a sequence of stable and turbulent conditions demanding alternations between innovation and standardization. This suggests that process improvement requires both change and stability. Fiol and Lyles elaborate on this, noting that too much stability within an organization can be dysfunctional and that too much change and turbulence makes it difficult for people to make sense of their environments. In other words, software process improvement involves the creation and manipulation of this tension between constancy and change.
The learning dimension consists of what Argyris and Schön [1] call "single-loop" and "double-loop" learning. They define single-loop learning as "instrumental learning that changes strategies for action or assumptions underlying strategies in ways that leave the values of a theory of action unchanged." Double-loop learning, on the other hand, is defined as "learning that results in a change in the values of theory-in-use, as well as in its strategies and assumptions."
The concepts of single-loop and double-loop learning are crucial in understanding the restructuring of the software organizations’ routines and practices. At their best, SPC-based improvement models are concerned with how to achieve better effectiveness within the existing values and norms, that is, single-loop learning. Often, however, they are connected with simple adaptation rather than learning. Furthermore, they are concerned with solving the needs of large organizations operating in highly stable environments with long-term contracts. This can be contrasted with the majority of SMEs, which operate in increasingly changing environments where the periods of stabilization are constantly shortened, thus requiring adeptness at double-loop learning and reflective practice.
In sum, both standardization and innovation can produce improved actions for SMEs in some situations, but can also harm these organizations in other situations. Consequently, only by recognizing this challenge of the "learning paradox", and the intrinsic short periods of stabilization facing most SMEs, can they expect to succeed with software process improvement.
There are two aspects of the experiential learning model that are important for process improvement in SMEs. The first is the emphasis on here-and-now, concrete experience to validate and test abstract concepts. The second is the concept of feedback to describe a social learning and problem-solving process that creates knowledge. This feedback loop provides the basis for data analyses and goal-oriented action.
Whereas Kolb’s theory is individually oriented, Nonaka and Takeuchi [12] have presented a theory of knowledge creation that is team and organization oriented, emphasizing Polanyi’s [16] distinction between tacit and explicit knowledge. Tacit knowledge is personal and context specific, and therefore hard to formalize and communicate. Explicit knowledge, on the other hand, is transmittable in formal, systematic language. Furthermore, human beings acquire knowledge by actively creating and organizing their own experiences, and only a part of this knowledge can be expressed in words and numbers (Polanyi).
Nonaka and Takeuchi define organizational knowledge creation as a continuous and dynamic interaction between tacit and explicit knowledge. This interaction consists of four modes of "knowledge conversion", as shown in Figure 1. First, the socialization mode starts by building a "field" of interactions, letting the members share experiences and create tacit knowledge. Second, the externalization mode helps team members to engage in a process of converting tacit knowledge into explicit concepts. Third, the combination mode lets organizational members systematize newly created explicit concepts and existing knowledge into a shared knowledge system. Finally, internalization or "learning-by-doing" embodies explicit knowledge into tacit knowledge.
Externalization holds the key to organizational knowledge creation, since this creates new, explicit concepts from tacit knowledge [12]. Unless shared knowledge becomes explicit, it cannot easily be reused. The organization cannot create knowledge on its own without individual initiative and interaction at the group level - teams play a central role in the knowledge creation process in SMEs since they provide a shared context in which individual developers can interact with each other. Consequently, we have primarily focused on software development teams.
For learning to become more than a team level affair, however, knowledge must be spread quickly and efficiently throughout the whole organization. One powerful method of such diffusion is through the use of computer-based organizational memory (Huber, [8]) or an experience base (Basili, [4]). This topic will, however, not be discussed any further in this paper.
These things taken together imply that SMEs need to grab the data as soon as they are available, extract the important information for learning and convert it to improvement actions, without collecting long time series or amassing the large amounts of data needed for a traditional statistical improvement approach.
When we have collected the data according to the GQM plan and performed the necessary analysis, we can select one of two approaches, depending on the kind of approach we have chosen:
In the next section, we will first describe our overall approach to process improvement within the GQM framework. We will then go on to discuss the two selected improvement approaches and how they have been used in a large national project called SPIQ.
SPIQ is based on the general process improvement principles of Total Quality Management (TQM) [2], [19], [20]. Specifically, the Plan-Do-Check-Act (PDCA) cycle is important. Figure 2 shows the SPIQ improvement process: a two-level PDCA cycle covering the project level and the organization level.
Figure 2 Two level PDCA cycle
The inner loop of this model is realized by one or more Process Improvement Experiments - PIEs. The PIEs are implemented according to the ESSI model where improvement project and development project are managed as two separate projects – but with strong relations. The development project is a real project; not an experiment set up just for the PIE.
Figure 3 Process Improvement Experiment
When we implement the PDCA cycle on the project level, GQM [14], [15] is our most important tool for the Plan and Do steps. The fundamentals of GQM are the Goals, Questions and Metrics, documented in a GQM plan. The Plan part of PDCA covers the definition of Goals, Questions and Metrics and the Do part is the implementation of the GQM plan - including feedback sessions. The Check part is covered by the Post Mortem Analyses - which may be seen as a special feedback session - and the Act part consists of feeding the experiences back into the organizational level for institutionalization. For the institutionalization we are using the principles from Experience Factory [5].
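As an illustration only, the Goal, Question and Metric hierarchy of a GQM plan can be written down as a small tree. The sketch below is not SPIQ tooling: the class names, field names, and the example goal, questions and metric are all hypothetical, chosen just to show how a plan can be checked for Questions that nobody has yet attached a Metric to.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metric:
    name: str
    unit: str


@dataclass
class Question:
    text: str
    metrics: List[Metric] = field(default_factory=list)


@dataclass
class Goal:
    purpose: str
    viewpoint: str  # whose perspective the Goal is stated from
    questions: List[Question] = field(default_factory=list)


def unmeasured_questions(goal: Goal) -> List[str]:
    """Return the text of every Question that has no Metric attached yet."""
    return [q.text for q in goal.questions if not q.metrics]


# Hypothetical plan content, invented for illustration.
plan = Goal(
    purpose="Reduce rework caused by requirements changes",
    viewpoint="project team",
    questions=[
        Question(
            "How much effort is spent on rework?",
            [Metric("rework effort", "person-hours")],
        ),
        Question("Where do requirements changes originate?"),
    ],
)

print(unmeasured_questions(plan))
```

A check of this kind could be run at the end of a group interview, so that the team leaves the session knowing which Questions still need a Metric before data collection starts.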
The conclusion from an earlier ESPRIT project is that feedback sessions are the single most important element in keeping a measurement program alive. This is consistent with our experience. Feedback sessions are important both as a means to motivate the project team and as a method for data analyses and learning.
Another element that is important is the use of group interviews to define the GQM plan – i.e. to define what to measure and how to measure it. Group processes – including interviews and feedback sessions – are used to move from data and individual experiences to shared explicit knowledge.
In addition, we have good experience with using TQM tools – simple tools like histograms and scatter plots – in combination with GQM’s feedback sessions to analyze data. TQM tools are important in order to communicate and to make sense of the collected data.
This includes our experiences in going from individual tacit knowledge to shared explicit knowledge that can be reused by software development teams. We discuss the importance of involving the whole team and our experiences in using group interviews and feedback sessions at the team level to interpret data and to share experiences. Finally, the importance of writing experience reports from the PIEs will be exemplified, as an important action when converting tacit knowledge into explicit knowledge.
It is our experience that it works much better if the company involves the whole project team from the start, i.e. from the moment the development project for a SPI project is chosen. The overall improvement goal will normally be known by then. The same goes for the GQM Goal.
In GQM the people representing the Viewpoint of the GQM Goal are considered to be the experts. In the context of process improvement the project team always represents the Viewpoint and as such they are the experts. They should therefore be given a chance to confirm whether the defined SPI project is relevant for their project or not. Furthermore, they should take part in defining Questions and Metrics related to the Goal.
It is often claimed that software developers are not interested in SPIs because they are technology driven. Technology is important for the software community and is therefore of great importance for the developers. Our experience, however, is that they also take an interest in SPIs if they are properly informed and allowed to participate from the beginning. In SPIs performed the GQM way, the process improvement is bottom up as soon as the Goal is given. The bottom up approach helps us to collect metrics and solve problems that the developers consider important.
By using a group process, the team can discuss and reach agreement during the meeting. To overcome the problem of non-homogeneous teams and dominating persons, we start the session by splitting the team into groups of two persons. In a group of two, nobody can just drop out, and all team members have a fair chance to come forward with their own ideas. The two-person groups are given 15 minutes to come up with a set of Questions. The Questions from all groups are written on a whiteboard and serve as a starting point for further discussion. The whole session takes two to three hours, and the result is a draft GQM abstraction sheet.
The GQM abstraction sheet is a means for acquiring the information needed for defining the GQM measurement plan. The abstraction sheet has four quadrants – see Figure 4. The upper quadrants show the Questions while the lower quadrants show the hypotheses. The hypotheses are important in order to verify the validity of the Questions. If the team has difficulty coming up with any hypotheses, they may have defined the wrong Questions. The hypotheses are, however, often dropped. This is especially true for the Baseline Hypotheses.
Figure 4 GQM abstraction sheet
The Baseline Hypotheses quadrant shall document the developers’ view of the current status of the measured properties. This means that the team should use their current knowledge about the process to answer the defined Questions. In one company, the team did not see any reason to fill in this quadrant, but they started the job anyhow. Filling it in, however, started an enthusiastic discussion involving all team members. The discussion served as a great motivating factor for the measurement program, and when the data collection started, they were all eager to see the result of the measurements.
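One way to make use of the Baseline Hypotheses once data arrive is to set the team's prior view next to the measured value. The sketch below is an assumption about how this could be done, not a procedure taken from the projects described here. The function name, the use of the median to summarize individual estimates, and all numbers are hypothetical.

```python
from statistics import median
from typing import Dict, List


def compare_with_baseline(estimates: List[float], measured: float) -> Dict[str, float]:
    """Set the team's baseline hypothesis against the measured value.

    The median is used (an assumption of this sketch) so that a single
    dominant or extreme estimate does not decide the team's hypothesis.
    """
    hypothesis = median(estimates)
    return {
        "hypothesis": hypothesis,
        "measured": measured,
        "difference": measured - hypothesis,
    }


# Hypothetical data: each developer's estimate of the share of effort (%)
# spent on rework, followed by the value obtained from data collection.
team_estimates = [10.0, 15.0, 20.0, 15.0, 30.0]
result = compare_with_baseline(team_estimates, measured=25.0)
print(result)
```

A large difference between hypothesis and measurement is a natural topic for a feedback session: either the team's picture of the process or the data collection needs a closer look.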
Altogether, group interviews are an efficient way to fill in the GQM abstraction sheet, giving a large range of important effects:
Note that it is dangerous to jump, for instance, from "A is linearly dependent on B" to "B causes A". It is always possible that both A and B depend on some unobserved variable C. If C is not observed, "B causes A" and "C causes A and B" will give the same kind of observations. In order to sort this out, it is important to use the knowledge and experience of the developers during data analyses.
Seen from a learning perspective, the feedback sessions thus give results on many levels. The project team will learn as individuals and as a group, and the organization collects experiences that can be useful in other projects later on. Altogether, the result can be summed up as:
Below is an example of an Ishikawa diagram. The main cause – Incomplete requirements specifications – has been identified as an important problem cause by collecting and analyzing data in a feedback session, supported by a Pareto analysis.
Even if we use the Ishikawa diagram to identify possible improvements, the risks arising from changes to the process still have to be controlled. As demonstrated by others, there is a close connection between the GQM abstraction sheet and a risk management table. This connection stems from the observation that the environment factors in the GQM abstraction sheet are the important risk factors in the project – see Figure 4. The environment factors are important for two reasons:
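The point about an unobserved variable C can be illustrated with a small simulation. This is a sketch under stated assumptions: the linear coefficients, noise levels, sample size and seed are arbitrary, and the `pearson` helper is written out here rather than taken from any library. B is never used when A is generated, yet the two come out strongly correlated because both depend on C.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable

# C drives both A and B; A and B have no direct influence on each other.
c = [rng.gauss(0, 1) for _ in range(2000)]
a = [2.0 * ci + rng.gauss(0, 0.5) for ci in c]
b = [3.0 * ci + rng.gauss(0, 0.5) for ci in c]

r_ab = pearson(a, b)

# Once C is known, its contribution can be removed. The true coefficients
# are used here for simplicity; in practice they would have to be estimated.
a_rest = [ai - 2.0 * ci for ai, ci in zip(a, c)]
b_rest = [bi - 3.0 * ci for bi, ci in zip(b, c)]
r_rest = pearson(a_rest, b_rest)

print(f"correlation of A and B:            {r_ab:.2f}")
print(f"correlation after accounting for C: {r_rest:.2f}")
```

The data alone cannot reveal that C exists. In the simulation it is known by construction; in a real project it is the developers who can point out that, for example, both measurements follow from the same underlying circumstance.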
Identified risks | Causes | Possible actions
Table 2 A general risk control approach
It is our experience that the continuous learning needed for process improvement is best facilitated through a sequence of group processes. This sequence starts with the group process needed to define what to measure and how to measure it, goes on through collection and interpretation of data and ends up with institutionalization.
As a further result of few data and fast changes, we will seldom get enough data for solid statistical analyses. As a consequence of this, human knowledge and experience are of utmost importance when interpreting the collected data. Statistical analyses will, however, be important for deciding the significance level for our conclusions. Statistical analyses will also be important for areas where we will get large amounts of data. This is the case for process steps that are repeated often, such as code reviews, unit tests and maintenance activities.
An important alternative to experimental learning – at least in the strict sense of this word – is to go for direct improvement. This is done by combining Pareto analyses – what are our major problems – and one or more brainstorming sessions supported by an Ishikawa diagram – what can we do about it. This way of improving the process is faster, but carries a larger risk than the strict learning / understanding approach.
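The Pareto step needs very little data and no statistical machinery. As a minimal sketch, assuming that each problem report has already been assigned a single cause category, the ranking can be computed as follows. The function name and the report counts are hypothetical; only the category "incomplete requirements" echoes the example cause mentioned earlier.

```python
from collections import Counter
from typing import List, Tuple


def pareto(causes: List[str]) -> List[Tuple[str, int, float]]:
    """Rank causes by frequency and attach the cumulative percentage."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for cause, count in counts.most_common():  # most frequent first
        cumulative += count
        rows.append((cause, count, 100.0 * cumulative / total))
    return rows


# Hypothetical problem reports, one cause category per report.
reports = (
    ["incomplete requirements"] * 12
    + ["design error"] * 5
    + ["coding error"] * 2
    + ["test environment"] * 1
)

for cause, count, cum_pct in pareto(reports):
    print(f"{cause:<25} {count:>3} {cum_pct:6.1f}%")
```

The causes at the top of the list, which account for most of the reports, are the natural starting points for a brainstorming session with an Ishikawa diagram.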
This view will partly collide with the idea of experience reuse. As stated several times in the past, one of the goals for process improvement is to collect data that can be stored in a data bank for later reuse. There are, however, some problems with the reuse of experience when we are strongly dependent on expert judgement. All the experts’ knowledge stems from the process as it was before the improvement steps and is thus not reliable for the new, improved process. Reusable experience must thus focus on how to solve problems and on the parts of the process that were not changed.
However, the most important experience to reuse is that it is possible to improve the process through data analyses and the use of simple problem solving techniques. In all cases, improvement should be attempted through several small steps, not in one or a few giant leaps. The search for a best practice all too often results in a static view of the process, which is dangerous in an ever-changing market – at least as seen from the SMEs' point of view. In addition, it is important to keep in mind that static knowledge is not necessarily a good thing in a dynamic environment.