I have always wanted to invent a new, commonly used expression or cliche. Most of the things I come up with refer to events or states that don't happen that much, or are just not catchy enough to endure. Here are some of my creations. I apologize if I actually stole them from somewhere else, but I've seen no evidence on the web:
- Throwing rocks down the well - There is a fable by Aesop called The Crow and the Pitcher. In short, a crow uses stones to slowly raise the level of water in a pitcher until he can drink it. It's supposed to be about ingenuity and perseverance. I always took it to be about recognizing when brute force is your only option. I use this more dramatic sounding variant to describe situations where you have been doing things in a slow, grinding way because there is simply no (known) alternative. For example, if you are trying to build a new line of business for your company, you can't just create it through a flash of insight. You have to find new customers, and convince them of your worth, etc. In short, you can only build a new business by throwing rocks down the well.
- All '5's and 'yes's - I've purchased two cars from two different car companies. In both cases, the sales and service staff admonished me, "<car company> will be calling you for a follow up survey. Please, please, please do not give us a rating other than a 5 or a yes; PLEASE, ALL '5's AND 'YES'S!!!", with the idea being that they need a 5 on a scale from 1 to 5 for the numerical questions, and yes on the yes or no questions. On a side note, this notion of customer satisfaction where anything besides perfection is failure is patently ridiculous, and it leads to exactly the situation described: the company actually gets no feedback at all. But that's a topic for another post. Anyway, I've adapted this expression to describe a situation where you want an honest critique, but you just get superlatives or glad-handing. "I wanted her feedback on the latest design document, but she was all '5's and 'yes's."
So that brings me to my latest creation: the Check Engine Light. My old car had the little orange light that many cars have, generally referred to as the Check Engine light. With my limited understanding of cars, I have only a vague notion of what this light is supposed to tell you, especially with a title like "Check Engine". Basically, I was told, it is supposed to come on when one of the multitude of sensors that track the efficiency and emission level of the engine and exhaust detects an out-of-bounds condition. When that happens, you should dutifully take the car over to the nearest official service shop and have them look at it because, at least on my car, it stores a code that indicates what was wrong.
Here's the problem: if you had the car up at highway speeds, that little light would come on for about 10 minutes and then turn off for 10 minutes, over and over. We'd take it to the dealer, and every time it would read some random code and they couldn't find anything wrong related to that code. One out of the dozens of times we had it in, they found a hose that had a leak and replaced it. Eventually, the theory was that the engine had a timing problem causing it to misfire occasionally at higher speeds, and this would trigger the light. Overall though, this car had very few problems and was an excellent vehicle, so it was just annoying. It got to the point that I began referring to it as the "Everything is Fine" light because it would come on when we were happily cruising along. I would have been worried if it didn't start its off-on cycle.
Ultimately, I understood this light though, because I have built "Check Engine" lights into software many times, and I see them in software that I use. It happens like this: you have a piece of software that is doing a fair number of complex things, and there are hundreds of possible things that can go wrong, so in general, you are better off monitoring a limited set of outputs for problems than you are putting lots of checks in the code itself. The trouble is, you are just not sure what constitutes "normal" values for outputs in every case.
A good example is something like a server thread stuck timer. It's good practice to put in a timer that waits a certain period of time before declaring a thread "stuck". Since threads can become stuck for many reasons, this is much easier than trying to detect every cause: you can just kill the thread and let someone know. The problem is, if you set the threshold too high, then threads will be stuck for a long period of time before being noticed, and if load is heavy, the system might lock up in a cascading effect. If it's too low, then jobs that simply take a long time might be misclassified as "stuck". So it has to be calibrated to the application, but there will always be cases of this "Check Engine" light coming on for no apparent reason.
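To make the threshold trade-off concrete, here is a minimal sketch of a stuck timer in Python. It is an illustration under my own assumptions, not code from any real server: the class name `StuckJobMonitor`, its methods, and the job names are all hypothetical, and it only reports jobs that have run past the threshold rather than killing anything. The clock is passed in so the behavior can be shown without actually waiting.

```python
import time


class StuckJobMonitor:
    """Hypothetical stuck timer: flags jobs running longer than a threshold."""

    def __init__(self, threshold_seconds, clock=time.monotonic):
        self.threshold_seconds = threshold_seconds
        self.clock = clock      # injectable so examples need not sleep
        self._started = {}      # job id -> time the job started

    def job_started(self, job_id):
        self._started[job_id] = self.clock()

    def job_finished(self, job_id):
        # Finished jobs are no longer candidates for "stuck".
        self._started.pop(job_id, None)

    def stuck_jobs(self):
        """Return ids of jobs that have been running past the threshold."""
        now = self.clock()
        return sorted(
            job_id
            for job_id, started in self._started.items()
            if now - started > self.threshold_seconds
        )


def demo(threshold_seconds):
    """Simulate one quick job and one legitimately slow job."""
    fake_now = [0.0]
    monitor = StuckJobMonitor(threshold_seconds, clock=lambda: fake_now[0])
    monitor.job_started("quick-request")
    monitor.job_started("nightly-report")   # slow, but not actually broken
    fake_now[0] = 10.0
    monitor.job_finished("quick-request")
    fake_now[0] = 45.0                      # the report is still working
    return monitor.stuck_jobs()
```

With a 30 second threshold, `demo(30.0)` flags the perfectly healthy `nightly-report` job, which is the light coming on for no apparent reason. With `demo(60.0)` nothing is flagged, but a job that really was stuck would also go unnoticed for a full minute.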
So into your arsenal of debugging terms, I hope you will add "Check Engine Light", defined as errors which indicate non-specific, recurring fault conditions, and which may have a reasonable cause, or may be false positives. Or at least the next time you are poring over a log file with someone and they dismiss a stack trace with a wave of their hand and a reference to "The Check Engine Light", you'll know what they mean.
