
There are a handful of data collection techniques that you will use again and again. Each of these will help you fill certain knowledge gaps, and each has its own set of pitfalls and gotchas that can waste your time and energy. I touched on these in a general sense yesterday, but here is a more detailed list:

  • Runtime Probe - Data collection that involves directly querying or observing the running system. Typical probes include running the system in a debugger to observe control flow or the values of variables, using a more general-purpose tracing tool such as DTrace on Solaris, or using system utilities like strace or netstat on Linux.
  • Human Inquiry - Data collection that involves asking another person for information. This might mean asking a user who discovered a bug for more detail about what they experienced, or asking a system administrator whether they've applied a certain OS patch.
  • Test Case Execution - Data collection that involves crafting a specific test case meant to replicate a bug, or produce a specific error.
  • Artifact Examination - Data collection by looking at some artifact produced by the running (or crashing) system to glean specific information. This includes looking in a log for an error message, or opening a core dump file in a debugger.
  • Differential Computation - Data collection that consists of determining the difference between two or more things in order to learn first whether there is a difference, and second, what the specific difference is. The comparison of dependent library sets from a few days ago is a good example. Other common activities include diffing a configuration file, and diffing a set of data inputs to determine why one version of a system is working and another is not.
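
To make the last technique concrete, here is a minimal sketch of a differential computation in Python using the standard library's difflib module. The file names (working.conf, broken.conf) and the settings in them are hypothetical, made up purely for illustration. The sketch answers both questions in order: is there a difference at all, and if so, what exactly is it?

```python
import difflib


def diff_configs(working: str, broken: str) -> list[str]:
    """Return unified-diff lines between two configuration texts.

    An empty list means the two texts are identical.
    """
    return list(
        difflib.unified_diff(
            working.splitlines(),
            broken.splitlines(),
            fromfile="working.conf",  # hypothetical name, used only as a label
            tofile="broken.conf",     # hypothetical name, used only as a label
            lineterm="",
        )
    )


# Hypothetical configuration contents from a working and a failing system.
working = "host=db1\nport=5432\ntimeout=30\n"
broken = "host=db1\nport=5433\ntimeout=30\n"

changes = diff_configs(working, broken)
if not changes:
    print("No difference found")
else:
    print("\n".join(changes))
```

In practice you would read the two texts from the real files on each system; the point is that the comparison is mechanical, so you are not relying on your eyes to spot a one-character change.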

Tomorrow: The Caveats and Issues with Each Approach