
A few weeks ago, I described the probability-testability matrix, and emphasized that you should consider both how likely a theory is, and how difficult it is to test that theory, when choosing how to proceed in a debugging effort. I have always been surprised that people seem to have trouble with this, and choose very low probability things to try first regardless of testability. I haven't had the resources to actually carry out a study on this, but I have a strong feeling that a cognitive bias is to blame, although I don't know if it already has a name in the literature. I've dubbed it the "Holding out Hope" bias. The idea is that people are biased against taking actions that leave them with worse options, even if selecting the better options means that the likelihood of success is higher early on.

To be more concrete, let's say that I have 4 theories for a problem, that I've guessed are 70, 50, 5, and 1 percent likely, respectively (note that it does not sum to 100% since I am simply giving a probability for each to be the outcome, and I might feel strongly about more than 1). If I start with the 70% theory and I'm wrong, then I've got something that's less likely to try. If I test the 50% and I'm still wrong, then I'm looking at a couple of low-probability theories and I'm starting to feel very uneasy. If I instead start with 1% and it's wrong, I can say "well I didn't think that was it, and I've got some much better options". I can continue to "Hold out Hope" that my higher likelihood theories are correct. It results in failure feeling like I'm improving my chances instead of reducing them.

Of course, by trying the things that we think are less likely first, we are greatly increasing the number of theories that have to be tested, and that makes the whole effort take longer, even if it helps with morale. Remember that it's the fixing part that feels good, so avoid the Hold out Hope bias and go for the high probability stuff first.
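To put a rough number on that, here is a minimal sketch in Python. It assumes (my simplification, not something measured) that each theory's percentage is an independent chance that testing it finds the cause, that every test costs the same, and that we stop at the first theory that pans out. The function name `expected_tests` is just for illustration.

```python
def expected_tests(probabilities):
    """Expected number of theories tested when trying them in the given order.

    Assumes each probability is an independent chance that the theory is
    the cause, and that we stop as soon as one theory is confirmed.
    """
    expected = 0.0
    chance_we_get_this_far = 1.0
    for p in probabilities:
        # We only run this test if every earlier theory was wrong.
        expected += chance_we_get_this_far
        chance_we_get_this_far *= (1.0 - p)
    return expected


high_first = expected_tests([0.70, 0.50, 0.05, 0.01])
low_first = expected_tests([0.01, 0.05, 0.50, 0.70])
print(f"high probability first: {high_first:.2f} tests on average")
print(f"low probability first:  {low_first:.2f} tests on average")
```

Under those assumptions, starting with the 70% theory averages about 1.6 tests, while holding out hope and starting with the 1% theory averages about 3.4. If the tests differ in difficulty, the same loop could weight each step by its cost, which is where the testability half of the matrix comes back in.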

With the WGA strike in full swing, I have spent a lot of time looking over the arguments and trying to understand the issues. At first, the whole thing came off like a major league baseball strike: both sides have more money than the Flying Spaghetti Monster, so who really cares? After further review, not only is one side (the writers) clearly "the little guy" here, but I have a very self-serving reason for wanting them to win: I love content. I like new ideas, new plots, new characters; it's the one and only reason I watch TV. I need narrative; everything else is just window dressing. I asked myself, when is the last time I saw a great TV show or movie with tremendous acting, but so-so writing? Probably never; that combination is awful. I've seen a lot of great shows with amazing writing and a so-so cast, because the writing is what carries it through, despite the fact that the actors get the big bucks.

I know from personal experience that good writing is just plain hard, and I admire those who can do it for a living. Giving writers more incentive to create great works, or try out new ideas because they have a little money in the bank to risk, will only produce better quality content for the future. That is especially true since the avenue that is the main point of contention, receiving residual payments for content delivered in formats other than television and DVD sales, in particular the internet, is about to open up dozens of new possibilities. We've seen these ideas scratched at with web puzzles, webisodes, and character blogs, but if we want writers to really take full advantage of these media, we surely want them to get paid to do it. In particular, I liked the idea pitched by one writer, to just simply pack up their tents and head over to Google, and see if they wouldn't want to pay them decent money to put stuff up on YouTube. I agree that this is management's worst nightmare.

I'd like to ask for something in return from the writers though: make it a matter of professional pride to continue to deliver new content instead of going the route of the RIAA and MPAA who have chosen to spend their time finding new ways to profit from existing content.  If you receive the desired 5 cents for a $2 episode of The Office, your incentive becomes greater to assist the network in finding a way to get 5 more cents from me instead of writing another episode.  I don't believe writers will do this, as they profit from their ability to create something new, but it's a concern that has lingered in the back of my mind since I first heard about reasons for the strike. 

Ultimately, I believe that the writers will prevail because they hold all the cards, so it's just a waiting game. In the meantime, I'll guiltily churn through all my backlogged TiVo content. Or maybe I'll just keep watching the same old episode of Flavor of Love over and over again in an act of solidarity.

A well-designed piece of software includes specific places where the designer anticipated the need for alternatives, despite the fact that they may not have had a need for more than one at the time. These are often referred to as change points. This might be allowing for pluggable search or indexing algorithms to be added, alternate representations of a core datatype, or even an entire replacement for some core piece of functionality such as the backend of a system. A system without any change points is certainly bad: it is brittle and inflexible, and the alternatives must be grafted on with lots of if-else statements or #ifdefs. However, too much flexibility can be just as bad.  Furthermore, while many designers are good at putting them in, they tend to be very bad at taking them back out again. We convince ourselves that we Just Might Someday Need That Flexibility, and It Doesn't Hurt To Leave It In. That's just plain wrong. It does hurt, in the form of the cost to maintain the flexibility throughout the design. There is cost in complexity, maintainability, and in many cases, a runtime performance cost.

If you make use of a change point and have implemented several alternatives, then this overhead is usually justifiable. For instance, architecting a system to allow for different databases to be plugged in, when you actually need to be able to run the system on several different types of database, is a big win. Architecting the system to allow for database pluggability when you only ever need to run on MySQL is just a huge waste. Every time you add a new feature that affects that backend, you'll need to consider how that affects your generic database interface instead of just implementing it for the one. I've seen too many projects bog down in this kind of unnecessary flexibility, where everyone spends their time worrying about how to keep the change points tidy instead of writing code that does anything. In describing the emphasis in Extreme Programming on meeting current needs rather than making things unnecessarily general, Kent Beck uses an economic analogy to show that since feature value tends to be highly unknown, and the value of many features holds essentially constant over time, by waiting on all the features that you don't absolutely need, you eliminate the risk of implementing a lot of worthless features. It's not that easy in practice since it is still hard to guess the value of features, especially points of code flexibility, but the point is well-taken.
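As a rough illustration of what a change point looks like in code, here is a small Python sketch. Everything in it is hypothetical: `Storage`, `InMemoryStorage`, and `App` are made-up names, and the "backend" is a dictionary standing in for a real database so the example stays self-contained.

```python
from typing import Dict, Optional, Protocol


class Storage(Protocol):
    """The change point: any backend must provide these operations."""

    def save(self, key: str, value: str) -> None: ...

    def load(self, key: str) -> Optional[str]: ...


class InMemoryStorage:
    """One concrete alternative; a stand-in for a real database backend."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._rows[key] = value

    def load(self, key: str) -> Optional[str]:
        return self._rows.get(key)


class App:
    """Application code depends only on the Storage interface."""

    def __init__(self, storage: Storage) -> None:
        self.storage = storage

    def remember(self, key: str, value: str) -> None:
        self.storage.save(key, value)


app = App(InMemoryStorage())
app.remember("greeting", "hello")
print(app.storage.load("greeting"))
```

The ongoing cost shows up the next time a feature needs something new from the backend, say a search operation: it has to be added to `Storage` and to every implementation, and reasoned about for backends that may never exist. With a single real backend, `App` could simply call it directly.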

While you can't recover the costs of implementing a change point in the first place, you can help reduce their ongoing expense through a flexibility audit. The idea is simple: as you develop a piece of software, keep track of where you made an explicit decision to allow for alternatives, and a note about what you expected those alternatives to be. Every few iterations, go over the list with the team and ask:

  • What is this change point costing us?
  • Do we still need this level of flexibility?
  • Could we get the same kind of flexibility in a less costly way?
  • Has one alternative proven superior to others such that that alternative could become the only implementation?
  • Where do we really need flexibility that we don't currently have?
  • Have any change points become significantly more or less costly than last time we audited?

The outcome should be a kind of score for each change point, which is the cost to keep it versus the value to the application. Then purge the ones that aren't worth the trouble. It can be surprisingly cathartic to eliminate code flexibility, as we spend so much time worrying about how to make things more general, and rarely get to make them concrete again. You can make your team happy by eliminating troublesome and never useful bits of flexibility, and your code will be better off for it.
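The bookkeeping for an audit like this can be very light. The following Python sketch is one way it might look; the 1 to 5 scales, the `ChangePoint` fields, and the sample entries are all invented for illustration, and a real team would pick whatever scoring it trusts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChangePoint:
    name: str
    expected_alternatives: str  # what we thought we might plug in later
    cost: int   # team estimate of ongoing cost, 1 (cheap) to 5 (painful)
    value: int  # team estimate of value to the application, 1 to 5

    @property
    def score(self) -> int:
        # Positive means it earns its keep; negative means it costs more than it gives.
        return self.value - self.cost


def purge_candidates(points: List[ChangePoint]) -> List[ChangePoint]:
    """Return change points whose cost exceeds their value, worst first."""
    return sorted((p for p in points if p.score < 0), key=lambda p: p.score)


# Hypothetical audit list, not taken from any real project.
audit = [
    ChangePoint("database backend", "PostgreSQL, Oracle", cost=5, value=1),
    ChangePoint("search algorithm", "alternate indexers", cost=2, value=4),
    ChangePoint("export format", "XML, CSV", cost=3, value=2),
]

for point in purge_candidates(audit):
    print(f"{point.name}: score {point.score}")
```

Rerunning the same list every few iterations also answers the last question above, since a change in score between audits is easy to spot.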

An interesting article posted on Yahoo discusses the classic dilemma of the employer, best summarized in the title of the Harvard Business Review paper referenced in the article: "Fool vs. Jerk: Whom Would You Hire?". The article highlights recent efforts by many companies, despite a serious need for new hires, to focus on weeding out potential employees that would end up clashing with the corporate culture through, dare I say, authentic assessment practices. My position on the best way to hire candidates is clear, but this is one of the first instances I've seen of companies attempting to insert more of their actual business practices into the hiring process, instead of more nonsense. A few highlights from the article:

The 1,900-person company is divided into 18- to 20-person teams. One
team is so close, the whole group shows up to help when one member
moves house, Napier said. Job interviews at the San Antonio-based company last all day, as interviewers try to rub away fake pleasantness.

"They're here for nine or ten hours," Napier said. "We're very
cordial about it. We're not aggressive, but we haven't met a human
being yet who has the stamina to BS us all day."

If you can pull it off, I love the idea of the full day interview, and I couldn't agree more that you need to give someone a chance to let their guard down so you can see what they will be like on a day to day basis.

In the mating dance of job interviews, employers traditionally put
their best feet forward, too, trumpeting their wonderful benefits
packages while leaving out the bit about working late, eating cold
pizza.

Not Lindblad. It sends job applicants a DVD showing not one, but two
shots of a crew member cleaning toilets. A dishwasher talks about
washing 5,000 dishes in one day. "Be prepared to work your butt off,"
another says.

"It's meant to scare you off," company founder Sven Lindblad said.

It does. After watching the DVD and hearing an unvarnished
description of life onboard a Lindblad ship, the majority of applicants
drop out, Thompson said.

It's hard to give prospective hires a clear sense of what the job is like day to day, and putting out a DVD is a great way to say "we work really hard and aren't afraid to tell you that." I'm sure the people that stick around are really committed.

KaBoom [a non-profit that builds playgrounds] sends prospective project managers to one of its four-day
playground building trips, with the actual build on the last day
involving 200 to 300 volunteers, many of whom have questions for KaBoom
staff.

"If they're not easily approached, or they're easily stressed —
this is the way we find out and they find out if it's not going to
work," he said.

Hammond wouldn't say what percentage of applicants drop out,
but he did say project managers' tenure has increased since they
started sending them on the trips four years ago, from one year's
tenure to between two-and-a-half and three years.

Changing the average employment term from one year to between 2.5 and 3 years saves what, $100K a year in spin-up and administrative costs? Not to mention the effect it has on morale. Employers so often fail to consider the real costs of turnover, and the true sources, and a mismatch between employer and employee expectations is so often the cause. That's why authenticity is so valuable: not only does it allow you to more truly evaluate a prospective hire's capabilities, but it gives them a chance to find out what it is they are signing up for.

Most of us are introduced to the scientific method early on. We learn about making hypotheses, collecting data, and analyzing results. Unfortunately, we are rarely given an opportunity to study the second part of science, which is establishing a theoretical basis for the result that meshes with current understanding. One simple example is the classic "ESP" experiment. Let's say that I could demonstrate successful "mind reading" to a level that is incredibly statistically unlikely, and I could do so repeatably. In order for it to become "scientific", that's simply not sufficient.  I would need to explain the following:

  1. In what form is the information being transmitted from the other person to me?
  2. How might that be observable through other means?
  3. What about me is different such that I have this capability while others do not?

Otherwise, I'm making an even bolder claim: not only can I read minds, but I also rely on some method of data transmission that is undetectable or unknown to science. To avoid that territory, we could certainly invent answers to these questions that made scientific sense. Perhaps I am especially attuned to the magnetic fields created by neuron firings, and I can decode that information. Maybe I have some sort of genetic anomaly that amplified a natural capability, and so on. Many incredibly bizarre and counterintuitive findings have proven to be totally explainable (or have invalidated previously established scientific concepts) once the theoretical basis is established. The need to link new findings in with established scientific precedent strikes many people as a kind of dampening effect on research that challenges the status quo. In some sense, that's true, but it's really about burden of proof. If you find something that is significantly at odds with long-standing research with a significant experimental basis, you had better have both a good explanation and triple-checked work.

The problem isn't that people don't create good theoretical bases for their experiments, it's that they don't consider them enough in the first place. I remember a class in graduate school where the professor recounted the "surprising" finding that deaf children acquire sign language in the same way (progressing from babbling of word elements through individual words through sentences and so on) and at the same rate that hearing children learn spoken language. I immediately thought: "Why is that surprising?" Since we have such a wide basis of theory and experimental knowledge about language and the brain's innate capability to acquire it, wouldn't we be a lot more surprised if deaf children were somehow stunted in their language development? That would be a very interesting finding, suggesting that humans are predisposed to spoken language specifically, which would fly in the face of many other theories, such as the generally held belief that sign language predated spoken language. A finding that spoken language has since become favored would need its own theoretical basis.

What does this have to do with debugging? As I have mentioned in this space many times before, you have to have a theory when working, and that theory had better be plausible, and it should hopefully be probable. However, on the data collection side of things, those same rules must apply. When lacking a solid theory, we often begin running test cases and other perturbations of the system to see if anything unusual turns up. Frequently though, those experiments break down into two outcomes: one that is trivial in that it fails to challenge our current understanding, and the other that is wildly at odds with our current understanding. To avoid these kinds of polarizing tests, it's good practice to look at the expected possible outcomes and say "If the test fails in the expected way, what does that mean? If the test succeeds, what does that mean?". Ideally, both should tell you something interesting, although that's not always possible. Most importantly, if your test does give you back a surprising result, start from the assumption that your test is wrong. You will save a lot of heartache by throwing out bad tests instead of becoming a true believer in a bizarre theory with one measly test case to support it.

So it's probably the collapse of the lending industry, and not my angry post a few months back that has brought them crashing down, but it still feels oddly good to see Citigroup in trouble. From their ridiculous customer service, to their constant upselling pressure, to their trigger-happy bill collection policies, I for one will not be sad to see them have to scale back operations, and I think in many ways, they are reaping what they've sown.

I've been making slow, steady progress towards recapturing in Drupal everything that WordPress provided out of the box. Here are my most recent updates:

  1. Added a recent posts block and archive by month block on the right side using Views. That seems to work very well. This also gave me "posts by month" pages for free.
  2. Enabled the autosave module to store blog entries (and any content, really) that I am writing every 10 seconds or so.
  3. This is more a convenience thing than an actual necessity, but I remapped all URLs of the form ?m=XXXXYY, which were old WordPress style URLs and are what is still in Google, to map to new month/XXXXYY URLs using apache mod_rewrite. If folks are interested, I'm happy to share my RewriteCond and RewriteRule directives.
  4. I forgot to mention this last time, but I have installed the TinyMCE module so that I can get WYSIWYG editing for any content, including blog posts.
  5. Turned on the ping module to ping pingomatic with updates.
  6. Enabled anonymous commenting, and made the comment form show up for each blog entry. I added a simple text CAPTCHA because that seems to be the least offensive of the options. With WordPress, I used Akismet and Bad Behavior and got 10 or so spam comments a week that had to be deleted and the occasional one that slipped through. I think an easy CAPTCHA is a better approach. If it fails or is too onerous, I'll add the Akismet and Bad Behavior modules.
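For anyone curious about the URL remapping in item 3, here is the mapping itself sketched in Python. This is not the mod_rewrite configuration I actually use, just an illustration of the rule; the function name is made up, and it assumes that XXXXYY is a four-digit year followed by a two-digit month.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumption: the old WordPress parameter is exactly six digits (year + month).
MONTH_PARAM = re.compile(r"\d{6}")


def remap_month_url(url: str) -> Optional[str]:
    """Map an old ?m=XXXXYY URL to the new month/XXXXYY path, else None."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("m", [])
    if len(values) == 1 and MONTH_PARAM.fullmatch(values[0]):
        return f"/month/{values[0]}"
    return None  # not an old monthly archive URL; leave it alone


print(remap_month_url("/?m=200711"))
print(remap_month_url("/?p=42"))
```

The first call yields the new monthly archive path and the second yields `None`, which is the same split a rewrite condition on the query string makes: only requests carrying the old `m` parameter get redirected.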

Still todo:

  1. CSS template still horribly screwed up. Sorry it looks so bad. I think I'm going to write a simple new theme from scratch that holds all the necessary info and that I really understand, rather than try to hack an existing one as I did with the current version.
  2. Remap some of the old WordPress URLs that are still in Google to their new equivalents to keep my incoming links correct.
  3. Spell check
  4. ???

It's starting to approximate a halfway decent blogging platform, with surprisingly little work.

Apple recently released an upgrade to OS X, version 10.5 ("Leopard"), to much acclaim. It certainly appears to have some excellent new features, in particular the Time Machine capability that gives you a version control-esque view of your file system, allowing you to see what your system looked like at any previous point in time. As with any major upgrade, there have been a wide variety of reported problems, particularly the "Blue Screen of Death" upgrade issue where the machine refuses to restart until manually rebooted. Whether these problems are more or less than a typical upgrade is for others to decide, but it got me thinking about what happens when I upgrade to a new version of Linux, or try out really any new piece of open-source software: what my expectations are for how smoothly it will go, and what the outcome will be.

To put it bluntly, as a long-time user of Linux on the desktop, I have lost any concept of technical support. I never have a phone number, email address, or in some cases, even a forum to which I can go complain or get assistance with my problems. When I install the latest version of Fedora, or a new version of the kernel, or get a new video driver, I can expect a wireless card not to work, or the system to hang on boot, or my laptop to fail to suspend and resume properly, or worse. While Linux has made huge strides from the early days of endless handholding to even bring up a graphical environment, a few glitches are still par for the course. This isn't a complaint at all; to me, the process of fixing these things is half the fun. I have argued before that users have developed a kind of learned helplessness from Windows (and Mac) hiding information from them, and needing to learn something to get my system fully functional is a great opportunity to acquire new skills.

With Linux, it's not just that there isn't a help line or a dedicated user support forum (and certainly there are many places to get help; learning how to research a technical question is a valuable skill in-and-of itself), it's that there isn't a company that you can ultimately force to do anything about your problem. If my laptop isn't booting, my choices are to try to fix it myself, try to get a more knowledgeable person interested (which usually works!), or give up and switch to something else. I am owed nothing by the Linux community in the way that Microsoft owes me a working machine because I dropped some amount of $ on the license.

I think that notion is what scares most people, and businesses, off of Linux. The arguments that it doesn't run needed applications, or doesn't have the driver support, or is not user-friendly are all slowly eroding, but I think there is still a deep-rooted psychological fear of removing that safety net. Some companies, like Red Hat or Novell, have stepped into that void with support contracts aimed at businesses, but I look at friends and family who all are very smart and capable people, but still pay the "Microsoft tax", and I can understand not wanting to have to give up that feeling of having someone to complain to, or at least blame when something goes wrong. When I take a cross-country car trip, I have a gnawing fear that something strange is going to go wrong and I will be stuck in another state with an inoperable vehicle, and no way to judge the validity of what a mechanic is telling me, nor any faith that their solutions will be the right ones. So I compensate in the same way as people in the computer world: I buy a comprehensive warranty package that makes sure that I can always have my car towed to a dealer who will pick up the tab for the vast majority of problems.

As a "computer mechanic", I feel sorry for computer users who have so little knowledge of what goes on in their system and who are beholden to a faceless software company, in the way that car mechanics probably feel a little sorry for me, but I can still understand it. For developers though, I have more trouble accepting it. The large upside of living without technical support is that it has given me a level of confidence that I will be able to resolve any problem, and this has spilled over into my work as a developer on a variety of projects. This benefit is yet another reason that I encourage young software engineers to begin using and learning an "unsupported" system like Linux as early in their training as they can. The sooner you start living without technical support, the better you will be able to tackle the problems you encounter in your professional career.

Spam, in all its forms

Once upon a time, businesses actually tried to find customers by doing market research, and actively tried to retain them through quality service and attentiveness. I now find it rare to encounter such a company, with most businesses utilizing the physical equivalents of spam to create business. We focus a lot of attention on preventing it in our inboxes, but think little about how pervasive and accepted the practice is in general. The core principles are incredibly simple:

  1. Invest as little as possible in the process of locating customers.
  2. Use the money saved to blanket as large a population as possible with ads and other marketing materials.
  3. Treat customer relationships as commodities, from whose sale a profit can also be made.

Here are a few examples of the kinds of practices that these businesses engage in, besides the obvious spam email messages:

  1. Home renovation and alarm companies staking out the registry of deeds to look for new home sales so that they can begin mailing materials about their services to newly purchased homes.
  2. Business products companies such as credit card processors, branded merchandise sellers, and invoice printers watching for new business registrations to begin frantically mailing their marketing materials.
  3. From my own life recently - a purchase of diapers online leading to the sudden receipt of a blizzard of mailers for formula, picture packages, and other materials for new parents. Unfortunately for them (and for me) my kid is 2.

In each case, the company is relying on the low cost to obtain information and some likely statistical correlation between the activity and their business, rather than having any clear sense of who a likely customer would be. For instance, I just opened this business a few months ago and received a steady stream of materials for credit card processing companies, despite the fact that I am not a retail business (which is clear from my business registration info), have no need to process credit cards, and even if I did I would just use Paypal, so they were just throwing away money.

We are seeing the result in our mailboxes and transactions every day: junk mail from companies that either captured a public event such as a home purchase or bought a name from a partner institution that has little or nothing to do with anything we'd ever be interested in. Unfortunately, consumers are very much to blame for this result, in two ways. First, it must be effective, otherwise companies would not continue to throw money at bulk mailings. Second, we are too willing to part with personal information for little or no cost. Think of how you casually give out your zip code or email address to a retailer when asked. We feel significant social pressure to do so, despite the fact that it is totally unnecessary.

I believe that if consumers wise up to spam marketing practices in all forms, it will no longer be cost effective to do it, which would be great in many ways. Here are a few rules I try to follow:

  1. Never give out any personal information for free. If they want my zip code to complete the purchase, I'll need at least a 10% discount.
  2. Never purchase anything from a company from whom you have received an unsolicited mailing. Not only does it encourage the practice, but chances are, they are going to continue to treat you as a statistic, giving you as much attention as they did in finding you in the first place.
  3. Whenever you buy anything that requires you to hand over personal information, make sure to indicate that you would not like your name or personal information sold or turned over to any other party. In most cases, they have to abide by your request, although sometimes they will simply refuse to complete the transaction.
  4. Terminate business with any company that sells your name. Sometimes, it's impossible to tell who was the source of a sudden torrent of junk, but other times it's all too obvious.

Of course, occasionally it can work in your favor: after the diaper company sold our name to a couple dozen other retailers, we eventually were given a free subscription to "parenting" magazine by one of them. My wife, who is ultra diligent about calling and cancelling unwanted subscriptions and junk mail flows, immediately called the publisher to cancel this one. A few weeks later we got a check for the unused portion, for a little over $15. I tried to figure out exactly where that $15 ultimately originated from, but it just made my head hurt.

I have once again signed up for NaBloPoMo to get myself back in the regular blogging habit. I've joined the DreamInCode and Midwestern bloggers groups, so I'm looking forward to seeing what others are writing about. This year, instead of being hosted on a private website, it is being hosted on Ning, the do-it-yourself social networking platform. That's interesting to me in and of itself as I read and enjoy the blog of one of the founders of Ning (and several other big companies), Marc Andreessen, so I am also curious to see if all the nice things he says about it are true.

Yet another great thing about NaBloPoMo is that it gave me an excuse to finally undertake a project I have been putting off for months: migrate this blog from WordPress to Drupal such that it fits in with the rest of the Distance Software site. Drupal is a general content management system, so I can host my main site and articles, plus have a blog, a file sharing area, project management, and what have you, which I couldn't do (easily) with WordPress. However, I didn't want to just use the default blogging feature, which leaves much to be desired, so I've been frantically installing modules and configuring the system to begin to approximate what you'd get with a normal blog. Here are some of the steps I took:

  1. First, I needed to import all the content from WordPress into Drupal. That proved to be very straightforward using the fabulous converter tool found here. One oddity: when I entered my database settings, it didn't like "localhost" for the hostname, but it happily accepted 127.0.0.1, so a word to the wise.
  2. Next I removed all the posts from the front page (this is the default for some reason), and decided I wanted to have a nice paged series of posts instead of one huge page, so I created a new view using the Views module that filtered to show nodes of type "blog" and status "published". I set it to be at the URL /shoutingdistance so that people looking for the old blog would be redirected to the new page.
  3. I wanted my posts to show up with nice URLs like /this-is-my-latest-post instead of node/124, so I used the ever-popular pathauto module to set that up, and created nice aliases for the existing and future posts.
  4. I needed syndication to be set up correctly. This turned out to be a breeze with Views, because they already offer an RSS argument option. A search for the concept quickly turned up this page, which explains how to set it up. I switched feedburner to use that new URL, and then I installed the feedburner module to redirect all accesses for the existing feed to the FeedBurner feed, which seems to be working, but we shall see.

So now I have something that approximates a blog, but I have quite a few things left to do:

  1. It's using my default page template so it ends up laying out all weird (this is mostly my fault for making a screwed up CSS template in the first place), but I want to override that with a new format.
  2. There aren't any archives or anything. I need to create another view that can be added as a block on the blog page so that people can see older posts in a simpler way.
  3. Turn on goodies like Akismet and Bad Behavior to block spam and spam attempts, so I can turn comments on. Right now folks, you are SOL.

I will continue to post on the various improvements as they happen, but I hope this helps others who might be considering transitioning from traditional blogging software to Drupal. It's definitely doable, but it's definitely not out-of-the-box ready for blogging, in my opinion.
