Data comes in all shapes and sizes. Quantified measures, qualitative relative information, anecdotal recollection, opinions, theories, and misinformation are all used, with varying success, as a means of making informed decisions. An astute organization will recognize the different levels of quality that are embodied in the data provided, and use that knowledge to gauge the corresponding quality of the decisions that are made.
At its best, data is objectively measured, corroborated by more than one validated source, and irrefutable. Decisions based on this sort of data are easy to make, but we live in a real world where such luxury is rare. In our imperfect world, we need to be able to make decisions with less than perfect data. In software projects, most of the significant decisions are made in the early stages, when there is a great deal of risk and uncertainty associated with the available information.
Business cases need to be selected based on probability and likelihood rather than a deterministic formula. This degree of uncertainty is an important component of the data itself; for early-project estimates and similar information, the uncertainty can be more important than the ‘line in the sand’ data point. In planning a project, I might have two tasks that each nominally require 2 weeks. If one of them is a well-known activity with very little uncertainty, while the other is something we have never done before and don’t even know is possible, that is important information that will allow me to treat the two very differently.
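One common way to make that kind of uncertainty explicit is a three-point (PERT-style) estimate, which records an optimistic, likely, and pessimistic value rather than a single number. The sketch below is a hypothetical illustration of the two-task scenario above; the task values and the `pert` helper are assumptions, but the formula is the classic PERT expected value and spread.

```python
# Sketch: capturing estimate uncertainty with three-point (PERT) estimates.
# The task figures below are hypothetical illustrations, not from the text.

def pert(optimistic, likely, pessimistic):
    """Return the PERT expected duration and its standard deviation."""
    expected = (optimistic + 4 * likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Both tasks are "nominally 2 weeks", but their uncertainty differs wildly.
routine = pert(1.8, 2.0, 2.5)   # well-known activity: tight spread
novel = pert(1.0, 2.0, 12.0)    # never attempted before: huge spread

print("routine:", routine)  # small deviation -> safe to schedule tightly
print("novel:  ", novel)    # large deviation -> investigate before committing
```

The single "2 weeks" data point looks identical for both tasks; it is the spread that tells the planner which one deserves a prototype or spike first.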
Software shops that start down the path of initiating a metrics program often fall into the trap of collecting data simply because it is easy to gather, regardless of whether it can reasonably be used to answer the questions they need answered. While the data itself may be clear and objective, it may not be the appropriate data to use. Generally, metrics programs can start with anecdotally gathered or easily available information to provide coarse initial measures and insights into the issues at hand.
One of the most important insights this data can provide is the nature of the more precise data that should be gathered in the future. Bug counts at the highest level, for example, can be an indicator of the overall quality of the product, but depending on the goals of the organization, drilling down to more precise data can take many forms. You might start to use bug counts and closure rates as an indicator of ongoing product stabilization. You might identify specific areas of the application that are more prone to failure and hence need more thorough review and testing. You might perform phase-based measures to determine which area of your development approach could be shored up.
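The drill-downs described above can all come from the same raw bug records, sliced along different fields. The sketch below is a minimal illustration under assumed data; the record fields (`area`, `phase_found`, `status`) and the sample bugs are hypothetical, chosen only to show the three views the text mentions.

```python
# Sketch: drilling down from a raw bug count into more precise views.
# The bug records and field names are hypothetical illustrations.
from collections import Counter

bugs = [
    {"area": "billing", "phase_found": "test",   "status": "open"},
    {"area": "billing", "phase_found": "review", "status": "closed"},
    {"area": "ui",      "phase_found": "test",   "status": "closed"},
    {"area": "billing", "phase_found": "field",  "status": "open"},
]

total = len(bugs)                                    # highest-level indicator
by_area = Counter(b["area"] for b in bugs)           # failure-prone areas
by_phase = Counter(b["phase_found"] for b in bugs)   # where the process leaks
open_vs_closed = Counter(b["status"] for b in bugs)  # stabilization trend input

print(total)
print(by_area.most_common())   # areas needing more review and testing
print(by_phase.most_common())  # phase-based view of the development approach
print(open_vs_closed)          # open vs. closed, for tracking over time
```

Each `Counter` answers a different question from the paragraph above: which areas are fragile, which phases let bugs escape, and whether the product is stabilizing as closures outpace new reports.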
All this assumes, though, that the intent is to come up with an objective, unbiased decision. Unfortunately, data is often gathered with less than objective intent, in order to defend a previously determined position. I worked with one organization that was performing peer reviews and was quite satisfied with its results. I was asked to sit in on one of these reviews, and was appalled to see people showing up late without having reviewed the material, and a generally uncontrolled hour-long meeting in which virtually no bugs were found. Digging deeper, I found that their results (which they believed indicated they had clean code) were being used as a basis for performance and compensation reviews: no bugs, get a raise! Even though they weren’t maliciously abusing the situation, the dynamics had evolved to radically skew the metrics program in the wrong direction.
The quality of the intent and the gathering methods, as well as the quality of the data source, all contribute to the overall quality of the data, which in turn determines the quality of the decisions made as a result. – JB