I was trying to help someone copy settings (email, web bookmarks, etc.) from one Windows machine to another and they asked "Why is it so complicated to copy my settings?" The true answer, at least in my mind, is some combination of: each program handles its settings information differently, Windows doesn't provide a standard mechanism for transferring settings, and Windows tends to store pieces of a program in a zillion undocumented places, making it difficult to do a post hoc gathering of the data. Each of these pieces would require some back explanation ("applications need a place to store your preferences so that they can remember them between usages"), and so there is a natural tendency to summarize and compress.

This compressed answer, something like "The information is stored all over the place", is basically correct, but it isn't the whole story. It is sufficient in this case, but often a choice has to be made between giving a long, complicated answer and giving a terse but somewhat incorrect answer. I call this line, beneath which any compression would result in incorrect information, the "Irreducible Correctness Boundary". The problem is that the boundary is not fixed; it varies depending on the sophistication and skill of the recipient.

I was reading "Fortune's Formula" recently and the author posed the example of two different people asking if the world was round. The first was someone from the middle ages, when the general conception was that the world was flat (I believe the theory of widespread belief in the earth's flatness has come into dispute recently, but bear with me); the second was someone from modern times studying the shape of the earth. To the first person, an answer of 'yes' is a significant amount of information, and additional content regarding the non-perfectly-spherical nature of the earth is more harmful than helpful. The modern person, who knows that the earth is at least roughly round, might benefit from additional information that the earth is not really spherical, but an unqualified 'no' is a much less correct answer than an unqualified 'yes'. The real problem is that the modern person's irreducible correctness boundary is much higher.

Substitute a non-technical and a somewhat-technical questioner for the two people in the previous example, and you have a situation that technical people are encountering all the time. If you guess too high on someone's level of technical sophistication, you will swamp them with unnecessary or even misleading details. On the other hand, if you guess too low, you can quickly drop below the correctness threshold by omitting important details. Estimating a particular person's level of technical sophistication can take time, but it's critically important to know when passing data back and forth, especially when using someone as a trusted debugging contact.

As those of you who read my earlier posts about Linux and T-Mobile Dash may remember, I had a couple of lingering issues (you can read part I and part II). Well, with a little help from the internet community, my two biggest issues are now solved, and I believe that there is little or nothing that can be done on Windows that can't be done on the Linux platform.

  1. PIM Synchronization - After correctly patching my synce installation, I was able to sync contacts, calendar, tasks, and they've just added file sync support so that you can synchronize your smartphone file system with a directory on disk.  PIM stuff is under heavy development with new features and bug fixes all the time.  If you are interested in testing the bleeding-edge code, subscribe to the SynCE Windows Mobile 2005 mailing list for more info.
  2. After a kind WINE developer read my earlier post about problems installing ActiveSync, he took it upon himself to fix the issues and now ActiveSync installs correctly (see his posts on my earlier T-Mobile Dash posts for more info).  I then was able to use WINE to run one of the standalone program installers (gnuboy, an open source Game Boy emulator), which was "fooled" into thinking that ActiveSync would install it on the next sync.  Instead, I grabbed the .cab file that it unpacked into the ActiveSync directory, and was able to install that directly, so now there shouldn't be any problem installing applications either.

    After discussing this with the Windows Mobile 2005 SynCE list, there is some talk of other ways to accomplish this, such as determining how an application tries to check for ActiveSync, and tricking it so that you don't have to install ActiveSync in the first place, or writing/improving a tool to scan through these special installers looking for the .cab header, so this won't be the last word on this issue for sure.
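For the curious, the "scan for the .cab header" idea might look something like the sketch below. It rests on one fact I'm sure of (a Microsoft cabinet file starts with the four bytes "MSCF", followed by a reserved zero field and a 32-bit little-endian size) and a pile of assumptions: that the installer stores the cabinet uncompressed inside itself, and that nothing else in the file happens to look like a header. The function name `find_cab_headers` is made up for illustration and is not part of SynCE or any existing tool.

```python
import struct

CAB_SIGNATURE = b"MSCF"  # magic bytes that open a Microsoft cabinet file


def find_cab_headers(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, declared_size) for each plausible cabinet header in data."""
    hits = []
    pos = data.find(CAB_SIGNATURE)
    while pos != -1:
        header = data[pos:pos + 12]
        if len(header) == 12:
            # Bytes 4-7 are a reserved field (zero); bytes 8-11 hold the
            # total cabinet size as a little-endian 32-bit integer.
            reserved, size = struct.unpack("<II", header[4:12])
            # Reject hits whose declared size cannot fit in the remaining data.
            if reserved == 0 and 0 < size <= len(data) - pos:
                hits.append((pos, size))
        pos = data.find(CAB_SIGNATURE, pos + 1)
    return hits
```

Given a hit `(offset, size)`, the candidate cabinet would be `data[offset:offset + size]`, which you could write out to a file and try installing. If the installer compresses its payload, this finds nothing, and you are back to the WINE trick.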

Hope this helps those of you concerned about Dash support on Linux.  After another month or so of use, I am extremely pleased with the whole setup.

The site has begun to approximate the layout I had in my head, with the content boxes encircling the center content.  If you have a smaller screen, you may have to scroll the outer window to see the whole page, but I'm hoping for the majority of visitors, the content will appear in its entirety, and you should only need to scroll the interior box to read the posts.  I may shrink the scroll box a bit so that it will fit even on a small screen (like my 12" laptop screen).

I am still somewhat baffled by the notion that DIVs and SPANs are somehow supposed to replace tables, since for the life of me, I cannot get any combination of them to lay out in the simplest possible side-by-side configuration.  For instance, the group of three boxes beneath the post scroller was done with a table.  Why?  Well, you can't just use DIVs, because they are block elements, so they break afterwards and stack vertically.  You can't just use SPANs, which don't break, because, at least as far as I can tell, the SPANs' non-breaking aspect is ignored when I put DIVs inside them to represent the title and content of each box.  I'd switch the titles to be SPANs instead of DIVs, but SPANs don't accept a width or height, so all my title boxes would be different sizes, which is visually unappealing (at least to me).  I'd use DIVs with display: inline, except that has the same problem.  I'd float them all to the left to get them to stack up, except then I'd need to put in some kind of placeholder DIV to bump the footer down, and the float property seems to be the thing I understand least; I guess it pulls those elements out of the static layout and adds them to their own layout, so I'm even worse off when things start to go wrong.

But hey, I create a table with three cells, set the margin information, stick my two DIVs in each cell and voila, a nice, simple side-by-side layout that is easy to understand and debug.  If some HTML guru out there who really understands this stuff can explain to me how to get the effect I'm looking for here, I'd love to have a better understanding. Hours of web research have turned up very little that would work for me here.

I had a Citibank credit card. Technically, I still have it but I will be closing the account shortly. My recent experience has been a not-so-rare glimpse at how strange the business practices of a mega-corporation can be sometimes. This is not a story about fraudulent charges or bad customer service, just a sequence of events that made me doubt their ability to competently run their own business.

1) The story starts a few months ago when we received an ominous letter telling us that our card had been compromised and that we needed to contact their customer service people immediately. With us both in full possession of the cards, our overall rare use of the card (it's mostly backup for the rare occasions when our primary card won't scan or decides to go on fraud-alert freak out, or for when I'm traveling to help keep track of expenses), and my fairly elaborate attempts to keep our personal information protected (crosscut shredder, receive electronic billing statements only, etc) I wondered how that was possible and wondered if it was some kind of Citibank ad campaign.

It turns out that they were the ones who were compromised; Citibank had lost our information, and then sent us a letter making it look like they had ingeniously detected some fraud and proactively addressed it. They insisted on sending us a new card and an affidavit in case we needed to report fraudulent activity. They were not amused by my wife's sardonic questioning regarding their ability to protect this new account number any better than the previous one.

2) When I received the new card, we had to call to activate it as is customary. Normally, you dial a number and get an automated message asking you to enter the number and then it is activated almost immediately. We recently received a new card for our primary credit card account because it had expired and this is exactly what was necessary to activate the card. Citibank on the other hand has "pioneered" (at least I haven't experienced it with anyone else) the use of a live person to activate your card when you call. This provides them with the opportunity to cram some sort of up-sell gimmick down your throat when all you were expecting was an automated message.

This time around it was the new "credit protection" plan that everyone wants to sell you, where for a fee based on your balance, you can skip a payment (or 2 or 3?) if you lose your job or you have a serious medical incident or one of the other horrible qualifying conditions. I tried to explain to him that regardless of the fact that I don't take part in fear-based nonsense, if I instead took the money I would have been paying for the credit protection and stuck it in a savings account, I would easily be able to cover a minimum payment for a few months. Insurance only makes sense if the gap between the cost to insure and the cost to replace/repair is high (home insurance), or qualifying conditions are likely (car insurance). In this case, it wasn't even close. I wasted 10 minutes of my life trying to get off the phone without being rude, and then finally resorted to being rude since I had only called to activate my f&@$ing card.

3) After activating the new card I went to log on to the card site to check that everything was in order. Of course, I was naive to assume that because this card was the logical successor to the previous one, that card would show up. Instead, it showed our old account as closed and nothing else. I was forced to create a whole new username and account for this new card. That was annoying, but not terrible. It will turn out to be important later on though.

4) This is more of an annoyance than a real part of the story: Citibank revamped their credit card website, and now whenever you go there you get a pop-over ad thing that blocks out your access to their actual website until you click a few things. It's like they created a DIV that they position over the site, but instead of just making it the size of the ads, it blocks out the entire site with white. And this is just to show me ads for a product I already own. Keep in mind that I had to click through that thing a zillion times during the course of the ordeal I am about to describe.

5) A few weeks ago, I made a payment on the web site scheduled to take effect a few days later. A day after the day it was to be debited, I got an email from Citibank thanking me for my payment. I went and looked at our bank account and there was no charge. That seemed odd and I chalked it up to a timing glitch with online processing. When I checked the next day, still no charge. Then it hit me: I had another bank account that I had been using previously, which now had like $100 in it and which I was too lazy, er, busy to close. Going to look at that other account, I noticed that sure enough, a charge had been attempted and of course rejected for insufficient funds.

I was really mad for 2 reasons: first, I had no idea that bank account was even on that new card's account. It had been on my old Citibank card, and without telling me, Citibank had brought over that old information to the new account I opened. So it never occurred to me to double check that I had the right bank account selected because I didn't even realize that there was more than one. Why they could link my bank account after the fact, but forced me to create a whole new card account is beyond me. Second, why can't they ask if you have the funds available without actually debiting it? It seems silly for me to pay a bounced check fee when there is no person receiving the check, and no goods changing hands, etc. Why should electronic funds work exactly the same as physical funds? Especially for a huge conglomerate like Citibank, which could save itself and its customers huge hassles with a better system.

6) Now it got fun. My credit card statement showed that it had been paid, despite the fact that there was evidence that it had been rejected by my bank, so I couldn't just make another payment and be done with it because you can't pay more than your remaining balance. I called customer service who explained that they must not have been notified yet (which seems unlikely since it showed up in my online banking statement) but that the problem was that Citibank might keep trying to debit the amount depending on "what response they received back from my bank". So I was supposed to call my bank and find out how they would have marked the rejection, either as insufficient funds, or just denied. If it was insufficient funds, they might just keep retrying, with me incurring a new charge every time! She insisted that this was all done electronically and her hands were tied.

I called my bank and actually got someone who understood what I needed to know. He advised me that given that it wasn't an active bank account and had only a small amount of money, I should just close it and make it impossible for them to retry and incur additional charges, which I did (thanks guy at bank, you are the only person in this entire thing that had any interest in helping me!)

7) Fast-forward to today. Ten days after I originally got the bounced check notification, Citibank finally notified me about it. The best part is that their email says, in a nutshell "since your bank account was closed, we have disabled your on-line payment functionality until you call customer service". Remember that I have a perfectly valid bank account that I have used several times before to pay this card, and it was only because they decided to bring over an older bank account unexpectedly that this happened, and that I had already spoken to their customer service to explain the problem and was given essentially no alternatives. So now I can't pay the bill on line any more and have to call them. I guess I would have needed to call to close the account anyway.

The thing that bothers me most is that Citibank will feign interest in my leaving them when I call to close it, but they ultimately don't care about people like me except in some sort of aggregate statistical way. I feel at every turn their message is "we don't really care about making you angry or costing you time and money if it means we might make a few extra bucks because of sheer volume", which I'm going to call the "spam theory of business". After a decade of card membership with an ever increasing level of disinterest on their part, it's time for me to stop rewarding them.

I was recently reading The Pragmatic Programmer and was thinking about the concept of Good Enough software. Briefly, the idea is that when you learn to build software that meets users' high priority needs rather than some standard of technical perfection, you and your users will be much happier. The critical piece to making it work is engaging them in the dialog about where corners can be cut and making them aware of the resource constraints that exist (time, money, ennui, etc.) so that they can make an informed decision.

What I found interesting though is that even these authors writing from the pragmatic point of view felt the need for the caveat regarding high-performance, zero-fault software such as the stuff running pacemakers or controlling aircraft to say that there are cases where the Good Enough theory and process doesn't apply. Why do we feel the need for a methodology or process concept to span all types of projects when looking at its validity? People will dismiss extreme programming by saying "well that would never work if you needed to build software for the space shuttle". I would probably agree with that, but that's beside the point; in my experience, the level of implied quality and criticality of the system is generally inversely proportional to the scope of the requirements. In other words, if it really has to work all the time perfectly, chances are that you also have been told exactly what it needs to do and how it has to do it. No one ever says, "Can you build me a pacemaker, and while it's being used for humans right now, we might also use it for mice, and it might need to perform dialysis functions also."

We don't need agile techniques and rapid prototyping and usability studies and iterative development to build a pacemaker because from the get-go it's clear exactly what it needs to do, and all the time and energy is spent on making sure it does that one thing perfectly. Extreme Programming, and the idea of Good Enough software which is closely related to the XP planning game, is for what has been in my experience the vast majority of cases where you really don't exactly know what you want, and where the feature set and degree of robustness are up for debate. Building systems that have nebulous requirements but a variable margin for error can be as difficult as, if not more difficult than, building those with strict requirements but zero margin for error. The multitude of software project management theories and the endless stories of projects being killed for failing to deliver the right (or any) functionality attest to that.

In most instances, it should be possible to look at the number and detail of your requirements and figure out where your project is on the scope vs. quality management curve and select your methodology accordingly. Of course, you will likely end up drifting along the curve during your project's lifetime and should adapt accordingly. Project Management theories and companies that adopt one methodology as part of their identity ("We do Scrum here") tend to have trouble with one of the two inversely compatible goals, or at least they end up burning a lot of management overhead to solve the problem they aren't having. Conversely, we should stop making excuses or offering caveats for our management tools that do provide us a process to tackle the problems we are having.

I used to work as a projectionist in a movie theater that showed primarily classic films. This involved significantly more work than is required with modern films because of the need to switch between reels. In a nutshell, films used to be distributed in a stack of canisters each containing about 20 minutes of film, called a reel. Showing a film consisting of reels requires two projectors, and the projectionist has to load up each reel on the currently unused projector and wait for a cue to start up that next reel, which was loaded up such that the actual picture would begin exactly 10 seconds after the projector was started. Then, a second cue tells them exactly when to swap between the two projectors (in my case there was a floor pedal that would instantaneously close one projector and open the other). If done correctly the movie would appear to seamlessly transition from one reel to the next and the audience would never be the wiser.

So what are the cues that the projectionist uses? In every movie you have ever seen (chances are) a small black dot appears in the upper-right hand corner of the film at 10 seconds remaining, and 1/4 second remaining. Now that I know it's there, I can't "unsee" it, so I notice it in pretty much every film I see in the theaters. Despite the fact that most films released nowadays are made of stronger materials and put together into one huge reel, making the projectionist job significantly easier and the dots superfluous, they are still there.

Even if you do know they are there, they can be very hard to notice unless you have spent a fair amount of time looking for them. To me, the most interesting part about the dots is that I've yet to meet someone who when I tell them about the phenomenon for the first time says "oh yeah, I always wondered what those were for!" Most people say, "No way, I would have noticed that" and are somewhat incredulous that this could have been happening in every film without them consciously processing it.

It just goes to show how powerful the brain's capacity to filter and make sense of the world is. Since those dots appear to have no meaning in terms of the film content, they are dismissed at some lower level of processing as perhaps some dirt on the film or a one-time anomaly to be corrected for. The brain is compensating for the eye's blind spot all the time, so throwing out a little black dot every so often is a piece of cake.

The black dot problem, as it relates to debugging, is the problem of failing to perceive useful information because it is grouped in with a larger, consistent data set.  For example, a rare but recurring error buried in hundreds of megabytes of uniform log files could be the black dot that you are missing when searching for the cause of a system crash.  Sometimes, simply being told that the "black dot" is there is enough to allow you to find it.  Other times, it's just a matter of stepping back and repeating to yourself that the information is there and trying to force those subconscious filters off for a while to really see the data in front of you. However it comes about, when you finally notice the dot, you will wonder how you could have missed it for so long.
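One way to force the filters off is to let a script do the noticing. Here is a rough sketch, assuming a line-oriented log where entries differ mostly by numbers (timestamps, ids, counters). It collapses every run of digits so that lines of the same "shape" group together, then reports the shapes that show up only a handful of times. The function name and the `max_count` threshold are my own inventions for illustration, and real logs will likely need smarter normalization than this.

```python
import re
from collections import Counter


def rare_patterns(lines, max_count=3):
    """Group log lines by shape and return (count, example) for the rare shapes."""
    counts = Counter()
    example = {}
    for line in lines:
        text = line.strip()
        # Collapse digit runs so lines that differ only by timestamp or id
        # are counted as the same shape.
        shape = re.sub(r"\d+", "#", text)
        counts[shape] += 1
        # Remember the first real line seen for each shape.
        example.setdefault(shape, text)
    rare = [(n, example[shape]) for shape, n in counts.items() if n <= max_count]
    return sorted(rare)
```

Run over a big uniform log, the thousands of routine lines collapse into a few shapes with huge counts, and whatever is left in the returned list is a candidate black dot worth a second look.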

Remember in math class the teacher admonishing you to go back and "Check Your Work" after completing a problem set? It's a simple request that can have amazing results for your overall score when you go back and notice simple errors, or identify answers that don't pass the smell test. Of course, this never helped me much in math because I am terrible at arithmetic, and no amount of work-checking ever seemed to allow me to notice that I'd added numbers that I meant to multiply or vice-versa. However, checking work is an important part of what I do when debugging.

Basically, if I am executing a set of steps to replicate or gather data about a problem, the first thing I do is to verify, to the extent possible, that I am really doing the steps I think I am. This is harder than it sounds for two main reasons: silent success and silent failure. Silent success means that things work as expected, but produce no output. This is fairly common since we tend not to write out logging information for successful operations. Silent failure can be an attempt at fault-tolerance, or sometimes the message happens in a low-level library and can't be handled correctly, or is just buried or piped to an obscure log file (apparently silent failure).

Both of these problems need to be addressed if you want to check your work properly. In the case that the system can be run in a debugger, work-checking is trivial since you can step through the sequence in question. If not, some judicious print-outs to verify can work instead. Checking your work is more than that, though: it also means looking for the evidence of execution. This can be reviewing log files after startup to see if everything looks kosher. It can be looking at process tables to verify that something is running. It can be pinging ports to verify that something is alive. Anything that gives you some level of confidence that things are doing what they say they are.
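A couple of these checks are easy to script. The sketch below is only an illustration with names I made up: `port_is_open` tries an actual TCP connection (a stand-in for "pinging" a port), and `missing_markers` looks through log text for lines you expect a healthy startup to have written. Which host, port, and marker strings matter is entirely specific to your system.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, and timed-out connections all land here.
        return False


def missing_markers(log_text: str, expected: list[str]) -> list[str]:
    """Return the expected marker strings that never appear in the log text."""
    return [marker for marker in expected if marker not in log_text]
```

Neither check proves the system is correct, but an empty list from `missing_markers` and a `True` from `port_is_open` turn a silent success into something you can actually see.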

Without taking these kinds of actions and checking your work, you run the risk of burning many hours debugging a shadow bug, which hints at the real problems in an indirect way like Plato's shadows on the cave wall.  For example, you build a new version of an application, but the build script outputs a warning that a properties file is missing and so it is skipping part of the build. Since you are not expecting the build script to fail, this warning goes unnoticed. You run the latest version and suddenly weird things are going wrong, and you are convinced that you have broken something in the code. When you finally notice the build script output, you may have spent significant time debugging a non-problem, or at least trying to debug an epiphenomenal problem rather than the real thing. Checking work means more time fixing real problems, and less time chasing after specters of bugs that are faults of the process and not of the application.
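One cheap defense against that particular shadow bug is to stop trusting yourself to read the build output. A minimal sketch, assuming your build tool prints the word "warning" somewhere in the lines that matter (many do, but check yours): wrap the build command and refuse to continue if anything suspicious scrolled by. Both function names here are hypothetical.

```python
import subprocess


def find_warnings(output: str) -> list[str]:
    """Return every output line that mentions a warning, case-insensitively."""
    return [line.strip() for line in output.splitlines() if "warning" in line.lower()]


def run_build(cmd: list[str]) -> None:
    """Run a build command and raise if it fails or prints any warnings."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    warnings = find_warnings(result.stdout + result.stderr)
    if result.returncode != 0 or warnings:
        details = "\n".join(warnings) or f"exit code {result.returncode}"
        raise RuntimeError(f"build needs a second look:\n{details}")
```

This is deliberately paranoid. In a project with many harmless warnings you would want an allow-list, but the point stands: make the missing properties file impossible to overlook.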