Distance Debugging

I've been working on a project that ties together a few different systems and technologies. The whole thing is written in C++ and uses a lot of MFC and Windows libraries (don't ask, it's a long story).  I was integrating a third-party application that produces XML documents, which the application under development then parses, when I hit a weird snag.  The third-party application would produce a schema file with a namespace declaration like this:

 

xmlns:xml="http://www.w3.org/XML/1998/namespace"

 

Note that this is legal.  The namespace prefix "xml" is reserved, but if it is declared, it must be bound to the W3C XML namespace URI shown above.  Because the binding is implied, most documents omit it, but making the redundant declaration is not wrong.  However, the older Microsoft XML library I have been working with (MSXML, which predates .NET) rejects it with the message "the namespace prefix 'xml' is reserved and may not be declared".  It never checks whether the prefix has been declared correctly.

So I'm stuck between two pieces of code I don't control: one that is doing something that is technically legal but unnecessary, and the other that is performing an error check that is sort of correct, but not really.  So where is the bug?  It's a little bit on both sides, or what I call a "bug in the cracks": something that emerges from the interaction of two pieces that are doing something fairly reasonable, but in an incompatible way.  My workaround is somewhat unfortunate. I have to do a string-level manipulation of the XML document before it gets passed to the MSXML parser to strip out the offending namespace declaration, since I can't actually parse it to fix it.  Hopefully we'll soon be able to upgrade to a newer version of the XML libraries (or maybe even move the whole thing to C#/.NET) and then this problem and the associated workaround will simply vanish.
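As a rough illustration of that string-level workaround, here is a minimal sketch in Python rather than the project's C++. The function name strip_redundant_xml_ns is made up for this example, and the sketch assumes the declaration appears as an ordinary attribute. It removes the declaration only when it is bound to the correct W3C URI, so a genuinely wrong binding is left in place for the parser to report.

```python
import re

# The only binding the XML Namespaces spec permits for the reserved "xml" prefix.
XML_NS_URI = "http://www.w3.org/XML/1998/namespace"

# Leading whitespace, then xmlns:xml="..." or xmlns:xml='...' bound to exactly
# that URI, allowing optional whitespace around the equals sign.
_REDUNDANT_XML_NS = re.compile(
    r"""\s+xmlns:xml\s*=\s*(["'])""" + re.escape(XML_NS_URI) + r"""\1"""
)


def strip_redundant_xml_ns(document: str) -> str:
    """Remove legal-but-redundant xmlns:xml declarations from raw XML text.

    Hypothetical helper: it works on text, not on a parsed tree, so it would
    also alter a matching string inside a comment or CDATA section.
    """
    return _REDUNDANT_XML_NS.sub("", document)


raw = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'xmlns:xml="http://www.w3.org/XML/1998/namespace"/>'
)
print(strip_redundant_xml_ns(raw))
```

Because the match is purely textual, this is only reasonable as a stopgap in front of a parser that cannot be changed.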

The Memory-Leak Dustup

There was a humorous sort of "argument" that erupted over what was essentially a product placement article. Basically, one of the DARPA driverless car competition teams ran into trouble because their code was not releasing C# object references held by an event listener data structure, so the car would exhaust memory and then go off the road. They used a commercial product to find the problem, and provided a testimonial. Of course, it was linked on Slashdot and people quickly divided into a few camps: the "how could they be so stupid/naive!" camp, the "C#/Microsoft sucks!" camp, and the "Hey, this is just an ad posing as an article" camp. Leaving aside the hardliners for a second, I thought about the underlying problem a bit. I certainly think of this type of memory leak as a "rookie" mistake, and this team was probably not made up of top-notch programmers, but of top-notch vehicle designers and AI people who were good enough coders to get to the finals of a serious competition. There is a big difference between people who write code as their primary objective and those who see it as a means to an end, so I don't fault them for making the kind of mistake they did.

I do, however, take issue with the follow-up comment posted by the software vendor:

There seems to be a lot of criticism of the Princeton team, both here and on Slashdot, for missing such an “obvious” mistake. If I had written code which managed to drive a car 9.8 miles through the desert I would be incredibly proud. I feel that this vitriolic criticism of a group of undergraduates who had the gumption and the vision to go out there and create a software and hardware solution to drive a car across a desert is very disingenuous. They had a hard deadline and did an incredibly good job given the constraints on their time and budget.

They had tried to fix the problem they were facing and had put in a workaround so that they could compete on the day. Unfortunately, they had not used a profiler which could have shown them the underlying problem.

For me the story is a great illustration on why having the correct tools when programming is vital – regardless of if you are developing in C, C++, C#, SQL or any other language. At <company> we are committed to providing these tools and ultimately, due to the attention this article has had, I believe that we were right to publicise it and that hopefully the message that it is easier to succeed if you have the right tools is getting through.

I like and agree with everything up until the "the story is a great illustration on why having the correct tools when programming is vital" part. In fact, I think it's just the opposite. Going back to my "Tools are not Techniques" theme, focusing on the success that this particular tool had in this situation obscures the fact that they didn't apply the proper techniques for finding the problem. The profiler helped them see what was leaking, but I would guess that some regular debugging techniques, like making the destructor or finalize() method print out some info so that you can see which objects are being collected, would have been just as effective, and would have cost nothing. The fact that they lacked the techniques at the time they needed them only reinforces the notion that while tools can give you efficiencies and data that might otherwise be unavailable, they are no substitute for knowing how to investigate a problem.
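To make that technique concrete, here is a minimal sketch in Python rather than the team's C#. The Listener and EventSource classes are hypothetical stand-ins for whatever the car's code actually used, and weakref.finalize plays the role of the destructor or finalize() method that reports when an object is reclaimed. A listener that is never unsubscribed simply never shows up in the log, which points straight at the leak.

```python
import gc
import weakref

collected = []  # names of objects the runtime has actually reclaimed


class Listener:
    """Hypothetical event listener that reports when it is garbage collected."""

    def __init__(self, name):
        self.name = name
        # The callback must not reference self, or it would keep self alive.
        weakref.finalize(self, collected.append, name)

    def on_event(self, event):
        pass


class EventSource:
    """Hypothetical publisher that holds strong references to its listeners."""

    def __init__(self):
        self.listeners = []

    def subscribe(self, listener):
        self.listeners.append(listener)

    def unsubscribe(self, listener):
        self.listeners.remove(listener)


source = EventSource()

forgotten = Listener("forgotten")
source.subscribe(forgotten)
del forgotten   # the source still references it, so it is never reclaimed
gc.collect()

released = Listener("released")
source.subscribe(released)
source.unsubscribe(released)
del released    # no references remain, so the finalizer runs
gc.collect()

print("collected:", collected)
print("still subscribed:", [l.name for l in source.listeners])
```

The exact moment of collection depends on the runtime, so the explicit gc.collect() calls are there only to make the demonstration less timing-sensitive.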

A post a week or so ago on the United Hollywood blog added a link to someone using Twitter to send out updates on the activities on the strike line, noting that it was "Perfect for all you nerds, hipsters and hideous nerd-hipster hybrids." I thought to myself, that pretty much sums me up. I might lean closer to geek-hipster hybrid, or hacker-hipster hybrid, but the general idea is the same: I make a lot of decisions based on both technical merit and hipness factor. Here are a few examples:

  • I prefer to use emusic over iTunes. The technical reason is that they offer DRM-free mp3s, a Linux client for downloads, and you can redownload any track you have paid for an unlimited number of times. The hipster reason is that I can get dozens of rare Jazz tracks, and bands that no one else has ever heard of, like Giant Robot.
  • I am a dedicated Linux user. The technical reason is that I find it a far superior operating system in lots of ways that I care about. The hipster reason is that it's like a band that no one else has ever heard of. I'm actually concerned about its growing popularity, especially on the desktop. If Linux gets cool, what OS am I going to have to resort to in order to maintain my geek-hipster cred? FreeBSD? AmigaOS!?!!?
  • I love old games and hardware, and the world of emulation, and I insist to this day that the old games are far superior to the crap they are putting out these days (except the Wii, which is my guilty hipster pleasure).
  • I will sit and debate the relative merits of hardware and software, and talk endlessly about who pioneered what, and who was influenced by whom (like a long-winded riff on the influence of LISP on Java given that Guy Steele was part of the core specification team) without a hint of irony.

I should just get a plain black t-shirt printed up that says "Your Favorite Chip Architecture Sucks" to clearly define the genre. Hideous Nerd/Geek/Hacker-Hipsters Unite!

I feel like I've gotten into the spirit of Nablopomo this year more than last.  Last year, I had a bunch of ideas about debugging theory that I wanted to get out in some form, and each day was mostly a matter of figuring out what I wanted to write about.  With a year of on-and-off writing under my belt, and ample opportunity to write on a variety of topics, I find myself more in the position of seeing an interesting article, or video, or problem, and just immediately writing up the ideas that occur to me on the subject rather than doing an extensive pre-analysis and structuring for each post.  To me, that's what Nablopomo is really all about: getting out of the mindset that each post has to be the essence of insight and wit, and instead just turning thoughts into material as quickly and as frequently as possible.

I also have had to face more of the prospect of having nothing to write about, and using that as a springboard to come up with new and better ideas.  It reminds me of what I was taught about doing improvisational theater.  When you first start out doing improv, you cling to certain cliches: humorous "inside" references that the audience appreciates, stock characters and goofy voices that you know will get a laugh, oh-so-clever visual gags, silly plot twists, and of course, potty humor.  These things will get you a laugh or three, but they go against the essence of what good improv consists of, which is the bizarre and incredible stories and gags that a group of people who are all committed to going with whatever gets thrown out there will create when they have nothing planned in advance.

This kind of improv has a very different feel to it, both from the actors' standpoint, and the audience's standpoint.  On stage, you are caught between a lingering concern that the whole thing is going to fall apart (which it sometimes does), and the elation that somehow it just seems to be working.  Off stage, it quickly becomes clear that anything could happen at any moment and you watch with greater fascination wondering how it's all going to come together.  At least that's been my experience.  You don't encounter that many groups who can totally avoid preplanning what happens in the scene because of the associated fear.  It's hard to walk out on stage and just start pantomiming and talking and hope for the best.  However, after you do it for a while, and you learn some simple tricks for making it work, it becomes so enjoyable and natural that you can't imagine relying on those old planned crutches anymore.

Ideally, that's how I'd like to write blog posts.  I get a suggestion or two from the audience, in this case the many blogs and other websites I read, or the daily problems I encounter in my work, and I just sit down and try to create something right there.  In fact, that's more or less how this post was written.  The idea that I was changing my approach to writing posts occurred to me as a starting topic, but the link between my new approach and theatrical improvisation didn't cross my mind until I was a paragraph in.  Unfortunately, my posts and my improv scenes suffer from the same problem: I was never good at ending things.

I am an avid Google Reader user, as are many people I know.  When I first started using it, I was adding a new feed every few days, but I started realizing that once I added a feed, I felt obligated to read everything.  With a couple dozen feeds, some with multiple posts a day such as Treehugger, it's a losing battle.  I generally have somewhere between 200 and 1000 unread posts at any given time.  In fact, I use my unread posts count as sort of a barometer of how busy I've been lately.  Weekends are nice because few sites update, and I usually have time to whittle things down a bit, but I rarely get down to zero.  When I do, I wind up in a mental state that I can only describe as hyperinformed, since I've usually crammed hundreds of bits of info from some amalgamation of tech news, pop culture, and sports over a short period.

I've discovered that this secret hope, that I can stay on top of every feed, has made me reluctant to add new feeds, and that's my Google Reader dilemma.  I ask myself, is the information I'm getting enough to justify the additional feeling of inadequacy I get when I look at a high unread post count?  It's silly, because I should use the magic "mark all as read" button to make it all go away, but that somehow feels like cheating.  Instead, I think Google should add a "don't show me any posts older than X days" feature, since then I could at least cap feeds that tend to have news that "ages out of usefulness" like Techmeme.  When I can read it every day, it's great, but three-day-old tech news is usually pretty stale.

Do others out there suffer from this same problem?   Are you reluctant to add new feeds because you want to stay on top of everything?  How many feeds can one person reasonably be expected to stay on top of anyway?

I came across this brief South Park clip parodying the generally acknowledged ridiculousness of people's dedication to Guitar Hero, given that a similar level of dedication to practicing with a real guitar would lead to having an actual skill rather than a made-up skill that only applies to the game. In short, Stan's dad finds the kids playing Guitar Hero, shows them that he can actually play the songs on a real guitar, and they make fun of him and go back to playing the game. Later, Stan's dad sneaks down to try it out, and is terrible. This matches my experience in that real musicians tend to be terrible at first because the timing of the game is just enough off that you have to adjust your internal tempo to make it work, so if anything, it's slightly regressive to real playing. I'm sure it's worse for guitar players who have to fight the urge to go to the correct fret, versus me as a pianist who is only fighting the timing.

While lots of people have questioned the acquisition of this made-up skill, few have asked the question: why is it that people so readily prefer the fake world of Guitar Hero to the real world of Playing Guitar? In short, my answer is that Guitar Hero is a surprisingly good computer tutor, albeit for a made-up skill, that does many of the things a good human tutor does, and it's a lot more fun too.  Good teachers tend to create students that come back for more, and this is what Guitar Hero is doing.

Researchers have been probing the question of why one-on-one tutoring is such an effective learning strategy. The puzzle is that the difference between a highly trained tutor and an unskilled tutor is surprisingly small. Upon investigation, it appears that tutors tend to do a few things that seem to make a huge difference:

  1. They help present and frame problems to provide a clear path for tackling subject matter. Often students have trouble knowing how to even approach a problem and knowing how to get off on the right foot makes all the difference.
  2. They act as a source of general knowledge so that students can feel free to get clarification on issues in a way that they might be too embarrassed or hurried to do in a classroom setting, and which they might not have access to if they were studying alone.
  3. This is the big one: they provide affective reinforcement in the form of praise and accolades for successful achievement, and this reinforcement is very close to the time of execution so that it has a very strong effect (unlike a test that you often take and then wait days for results).

It's a pretty simple set of things that almost any tutor seems to implicitly do, considering the difference it makes. Guitar Hero does a lot of this as well:

  1. The problem of learning to play guitar is framed as the playing of and scoring on individual songs, slowly increasing in difficulty.
  2. Not really much here on general knowledge, but the subject matter of playing a made-up guitar is so contrived as to make this somewhat irrelevant.
  3. It's huge in this area: with crowd reaction, both in-game and often with people around you in the real world, you get instant feedback, not to mention the overall "rock star" experience with star power, on-screen avatars doing all kinds of tricks, and so on.

In addition to this, it offers a few things that real-world guitar instruction can't hope to imitate: the concert experience, and the simplification of the instrument to a few frets while still allowing players to play actual music instead of twinkle twinkle little star.

So here's my modest proposal: someone needs to modify a guitar so that you have to actually play the real notes on the guitar in order for it to count. I did a search on "guitar hero real instrument" to see if this has been done. I've found the inverse and converse of this concept, with people retrofitting real guitars to play Guitar Hero, as well as using Guitar Hero controllers to make real music, but no one is stepping up to give people the ability to use all the support offered by Guitar Hero and still wind up with an actual skill. If anyone knows of an effort in this regard, I'd love to hear about it.

As I have been programming for a fairly long time now, I have noticed a very odd thing happen to me emotionally when I am writing code.  I get very invested in writing and debugging it, almost like I would a novel or movie.  When I put in a fix for a problem and rerun the test case, I feel anxious, as if some sort of gruesome fate awaits me if my assert() fails, and not just an angry red bar telling me the test failed.  On the flip side, when it runs cleanly, I get a rush that I've dubbed the "Coder's High" because it feels like you are the smartest person on the face of the earth for at least 5 seconds.

The effect carries over to the process itself, such as when I'm interrupted by the end of the workday or a phone call when I am in the middle of writing or debugging something.  It's more than the mild discomfort experienced when leaving the flow state associated with programming. It feels like I've been deprived of the last 10 pages of a mystery, or the last 5 minutes of a suspense thriller.  I actually want to return to the keyboard to find out how it ends.

I had one of those moments this morning when I finally conquered the jQuery demons that were circling around my project (in case you were wondering, .children doesn't search the DOM recursively, but .find does. I actually knew that, but for some reason I looked at the function 10 times before I noticed I was calling the wrong one).  The psychological benefit has carried me through most of my day, and I feel like I was a much better programmer because of it.  I have discovered that I like that effect more than the "find out how it ends" annoyance, and I've learned to work on a problem up until I believe I have the solution, then actually wait until the next day to try it out.  That way I have something to work on right away when I get in instead of procrastinating, and I might just get a little shot of adrenaline and endorphins out of it.

I have been fighting with jQuery for the last few hours, and almost forgot to post. I'm still on the fence about whether jQuery is a good thing, or a very evil thing. For now, suffice it to say that it's nearly impossible to step through with the JavaScript debugger, and that in and of itself is really a hassle, despite the ultraslick syntax. I will post more about my experience and complaints tomorrow when I've had some time to cool down.

Debugging for my Life

The title is a bit overdramatic, but I have just moved into a new office space and have been working frantically to resolve lots of small to medium problems.  Basically, the office is totally wireless. No ethernet or phone cables are strung anywhere. I want everything to just work, so I bought a couple of new pieces of equipment to make that possible.

Phones were the simple part.  Between cordless phones and wireless phone jacks, it's a snap to make all the phones talk to one another. The network was not as forgiving.  My general idea is to have one primary access point where the cable service comes in, and then put WDS-enabled access points in every office so that a person's computers can talk to one another over the wired connections, and route all outbound traffic through the main link.  That provides a nice blanket network so that wireless is available throughout the office, along with most of the niceties of wired connections.  I picked up some new hardware to make this possible, including a WRTSL54g, which is a fancier version of the standard WRT54g in that it has a USB port and so can have better storage capacity. I also picked up a new standard WRT54gl to act as a WDS point.

I had this system set up at home, but I used WEP to send the data over WDS, and I wanted to go to WPA for better security and ease of use (WPA2 may be my next project).  I also wanted everything to run the latest OpenWRT release, Kamikaze.  Of course, they went and changed all the WDS stuff since the previous version (White Russian), so I had to repuzzle out a lot of it.  I don't have the energy to link all the posts that I used to finally get it all working.  Suffice it to say, it took a lot longer than I had hoped, but I am currently writing this post while sending my data out over a WPA-encrypted WDS link, so I am feeling pretty good.  I really didn't want to walk into the office tomorrow still staring this same problem in the face.

API Taxes

Taxes serve two purposes: the first is to collect revenue, and the second is to help incentivize or disincentivize certain activities. For instance, we (supposedly) keep raising cigarette taxes to encourage people to quit, and we offer mortgage deductions to encourage home ownership. I got to thinking about taxes when tackling some recurring problems in the development of an API intended for broad consumption including:

  • You have methods that perform low-level operations that have to be made available for certain uses, but should generally be avoided.
  • You have methods that have serious performance implications, or serious performance implications depending on the arguments passed in.
  • You have methods that you want to discourage use of for whatever reason; maybe you are thinking about removing it but haven't yet gone to deprecated; maybe you think it is buggy and unreliable for many inputs; maybe there's a better way of doing it that people often overlook.

All these problems boil down to one thing: there's no consistent way to indicate to API users which methods they ought to be calling under which circumstances. You can add comments all day long and people are unlikely to change their behavior. Once they find something that works, they will stick with it, and then complain when performance is terrible or they corrupt a data structure. So I thought of a simple solution: API taxes. It would be easy enough to implement in Java with annotations and/or aspects. Basically, every method can be assigned a tax annotation (the default being zero). Methods that you want to encourage the use of have negative taxes, and methods that you want to discourage have positive taxes. The annotation could specify a base cost to call, or different costs for different values of inputs, or even different costs for calling in different contexts (for instance, calling a low-level method from a wrapper API is zero-cost, but calling it directly is high). Then you have a "Tax Collector" aspect that is called for every method annotated with a tax, to keep track of what you owe. As the API designer, I can then give out guidelines for what a reasonable cost is for different usage patterns. If you are exceeding that cost, you need to revisit your calling patterns to see where you are incurring that unnecessary cost. Perhaps you are sending in incorrect inputs, or just failing to cache results that should be cached. Overall, besides the added burden of coming up with tax values during API creation, it seems like an easy way to catch problems early on, when you only have small data sets and test cases to work with.
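Here is a rough sketch of how that might look. It uses a Python decorator as a stand-in for both the Java annotation and the "Tax Collector" aspect, so it illustrates the idea rather than the Java implementation described above. Every name in it (tax, TaxCollector, get_cached, rebuild_index) is invented for the example, and the tax values are arbitrary.

```python
import functools
from collections import Counter


class TaxCollector:
    """Hypothetical ledger that records the tax owed for each taxed call."""

    def __init__(self):
        self.owed = Counter()

    def charge(self, method_name, amount):
        self.owed[method_name] += amount

    def total(self):
        return sum(self.owed.values())


collector = TaxCollector()


def tax(base=0, per_call=None):
    """Mark a function as taxed.

    base      flat cost per call (negative values reward encouraged methods)
    per_call  optional function of the call's arguments returning extra cost
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            cost = base
            if per_call is not None:
                cost += per_call(*args, **kwargs)
            collector.charge(func.__name__, cost)
            return func(*args, **kwargs)
        return wrapper
    return decorate


@tax(base=-1)  # encouraged: cheap, cached lookup
def get_cached(key):
    return f"cached:{key}"


@tax(base=5, per_call=lambda rows: len(rows))  # discouraged: cost grows with input
def rebuild_index(rows):
    return sorted(rows)


get_cached("a")
rebuild_index([3, 1, 2])
print(dict(collector.owed))
print("total owed:", collector.total())
```

A test suite could then assert that the total stays under whatever budget the API designer publishes for a given usage pattern. Context-dependent taxes, such as charging nothing when a low-level method is called from a wrapper, are left out of this sketch.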
