Distance Debugging Logo

When good designers build systems, they take a lot of criteria into account, including common ones such as how it meets the requirements, maintainability, and extensibility. One aspect that is often missed is debuggability, or how easy it will be to fix a problem with the system when it occurs. Like other criteria, debuggability can mean sacrificing potentially beneficial complexity in the early stages, for example, for improved performance.

Consider the problem of using an artificial neural network for classification. They often give excellent results, and can learn basically an arbitrary association between input features and output classification given enough nodes. They can generalize from a set of training inputs to a set of new inputs, and are often an ideal solution for large data set partitioning. They have one big drawback though: if they fail to classify something correctly you have pretty much no idea why. The problem is buried somewhere in the weights and connections of the nodes in the network. There is little debugging that can be done directly, with your only option being more training of the network in the hopes that it will solve whatever issue it is having. At the opposite end of debuggability is rule-based classification. While it may be time-consuming to create an appropriate set of rules to classify all documents correctly, and newly arriving documents might require new rules to be added in the future, it should be perfectly clear how the resulting classification was reached.

If you were building a system with a classification component, you might be inclined to use the ANN solution because of the speed and power, but the possibility looms that you will pay for it with counterintuitive, hard-to-fix bad classifications. If you take debuggability into account, you would likely avoid this type of design solution, opting for either the rule-based approach in the early going, or possibly a hybrid solution that uses the statistical approach for a quick classification, and then a rule-based approach to prevent common mistakes.

The ANN solution is an extreme example of something that is not debuggable, but many sophisticated algorithms suffer from this problem to one degree or another. This isn't to say that these solutions cannot be used, but they must be used with caution and when you have a working fallback. On the other hand, there are other cases where you can build capabilities into your system that help with debugging and don't require a functional tradeoff in general. A summary of some of those capabilities will be the subject of the next post.