Validate Your Assumptions
Last week I mentioned a particularly nasty bug from way back that held us up for weeks as we were trying to implement an RS422 protocol between two devices.
This was a typical lab environment, with the two devices sitting on a bench, surrounded by oscilloscopes, logic analyzers, other test devices, and nervous managers. No matter how rigorously we devised our tests, the system would seem to work well one minute and fail the next.
One of our standard sanity checks for the software was to use a loopback plug on one end of the makeshift cable to confirm that a specific set of tests worked. Unfortunately, sometimes the loopback tests still worked even when we removed the loopback connector! We pored over these tests and the operational code with a fine-tooth comb.
Our cable was a bunch of loose wires, perhaps 4 or 5 feet long, so we could connect the two devices together. When it was time to try a loopback, we would pop the cable off the receiving device and attach the loopback to the end of the cable. How could anything possibly change in that situation?
We were making the rash assumption that these wires and connectors would work in a consistent manner for our tests. It turns out that when we carefully ran the cable across the bench, the wires were close enough together that the signal on one wire could be sensed on the wire right next to it, effectively closing the loop. The software had actually been working as expected for quite some time, and when we introduced a properly built cable, our problems were resolved.
What had cost us all that time? We were making an assumption that what worked once would continue to work in the future, and there was no need to worry about it further.
As we work on a product for a period of time, we will make assumptions. Early discoveries or decisions become assumptions, which is a reasonable mechanism for freeing up capacity to deal with new issues. This works for most issues, which makes it a valuable mechanism to use, but it is not 100% effective.
Every time we add to an existing system, there is a real chance that we will perturb some part of the system that is already in place. This is true for new work, and even more so when we are fixing defects in the system (studies have shown that you are 10x as likely to inject a new problem when you are fixing bugs). When this happens, the system has regressed.
For that reason, we can’t assume that tests that have worked in the past will still work. This is one of the key drivers for building tests that can be maintained over time and managed in a regression suite.
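As a rough illustration of the idea, here is a minimal Python sketch of a loopback check that also runs a negative control: the test is expected to fail once the plug is removed, and a pass in that state points at the fixture rather than the software. Everything in it is hypothetical (the `Link` type, the `echo` and `silent` stand-ins, the function names); it simulates the link rather than talking to real RS422 hardware.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for the serial link: takes the bytes sent and
# returns the bytes read back, or None when nothing comes back.
Link = Callable[[bytes], Optional[bytes]]


def loopback_passes(link: Link, payload: bytes = b"\x55\xaa") -> bool:
    """Return True when the payload sent is read back unchanged."""
    return link(payload) == payload


def check_loopback_setup(with_plug: Link, without_plug: Link) -> List[str]:
    """Return a list of problems; an empty list means the setup behaved as assumed."""
    problems = []
    if not loopback_passes(with_plug):
        problems.append("loopback fails with the plug fitted")
    # Negative control: with the plug removed, the test is supposed to fail.
    if loopback_passes(without_plug):
        problems.append("loopback passes with the plug removed")
    return problems


# Simulated links, for illustration only.
def echo(data: bytes) -> Optional[bytes]:
    """Behaves like a fitted plug, or like crosstalk closing the loop."""
    return data


def silent(data: bytes) -> Optional[bytes]:
    """Behaves like an open circuit: nothing is read back."""
    return None


if __name__ == "__main__":
    print(check_loopback_setup(echo, silent))  # setup behaves as assumed
    print(check_loopback_setup(echo, echo))    # loop closed without the plug
```

Keeping the negative control in the regression suite means the assumption "this test can fail" gets rechecked every time the suite runs, instead of being taken on faith.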
Beyond a regression suite for software, though, it is important to capture the assumptions that have been made along the way. To some degree, a well-managed set of requirements and design decisions can serve as a higher-level regression suite. Without this information captured in a reasonable form, assumptions become implicit: possibly internalized by those who were involved at the time, but invisible to everyone else. That is a dangerous way to proceed.
One great way to expose assumptions in your system is to get as many fresh sets of eyes looking at the problem as possible. Often, when I run inspection workshops, people will claim that new employees or people who are more junior will not add a great deal of value in an inspection or peer review, but I have found just the opposite to be true.
Without fresh perspectives into the problem, it can be easy to fall into a group-think trap. With new eyes, we get the naïve questions that will surface the assumptions that have been ingrained into the group. You may be surprised how often there is value in rethinking and correcting those old assumptions. What we get is a great deal more innovation in our solutions, while at the same time distributing the knowledge of the system, reducing the dependency on heroes.
Back to that old bug we had: it was someone just walking through the lab who looked over our shoulders and pointed out our problem. She didn’t have a strong engineering background, but she asked the questions we were all assuming had already been answered.
Are you making any assumptions today that could use a sanity check? – JB