More about debugging.

One of the blogs that I have on my feed list recently reposted this tidbit:

There are only two debugging techniques in the universe:

  1. printf.
  2. /* */

Since I recently posted my own little list of some debugging techniques, I can’t resist weighing in on this assertion. While their list is a bit flip, changing behavior and running experiments to see the changes is, of course, a core debugging technique. However, the claim that debuggers are just extensions of this is a bit too reductionist for my tastes – it’s like claiming that a car is just a horse that takes longer to get tired, or that the printing press is just a scribe that works faster.

I also feel that “debugging techniques” shouldn’t be limited to “things you do to the program,” either – thought experiments, code change analyses, and the like are all valid in my book as well. My definition includes any form of investigation that gives you more insight into the problem.

Completely separate from the article linked above, I had a glance at Wikipedia’s article on debugging, and…well, there’s a lot of stuff in there that gets my back up:

  • Citing language choice as having an impact on the debugging process is silly – your choice of language may make it more difficult to write buggy code, but once a bug is in there, I’m hard pressed to think of a reason why language choice would have a substantive impact on the actual debugging process. (Saying that C++ makes debugging easier than C because it has single-line comments is not funny.)
  • “Generally, high-level programming languages, such as Java, make debugging easier, because they have features such as exception handling that make real sources of erratic behaviour easier to spot. In lower-level programming languages such as C or assembly, bugs may cause silent problems such as memory corruption, and it is often difficult to see where the initial problem happened.”

    1) Exception handling is only as useful as the exceptions that are thrown in the code.
    2) The claim that “high-level programming languages” like Java don’t suffer from “memory corruption” is misleading. Any imperative language with side effects is going to be capable of bugs that look like “memory corruption,” and they can effectively be treated the same way with regard to the debugging process. (As an aside, it’s interesting to see how the definition of what qualifies as a “low-level language” has shifted over the years…yikes.)

  • Static code analysis tools are meant to be used to fix code problems before you run into the bugs they cause. Using them as an example of a debugging tool is kind of missing the point of using them entirely. lint isn’t going to help you debug why something is busted – it’s only going to tell you that you’re using an uninitialized variable, and that’s something that shouldn’t have been in your compiled code in the first place.

 

Addendum to the earlier debugging post: One important quality of the printf and commenting techniques that I didn’t mention is that they work even if you can’t actually attach a debugger to the process (or if no debugger exists for the environment in which you’re working). Sometimes this characteristic can really save your bacon – I remember having to change the screen background color register (the poor man’s printf, which doesn’t even need the C standard library) on a certain console in order to debug the DVD boot loading code of a game. Each line of code was prefaced with a call to set the background color to a distinct value, and the color remaining on the screen when the console froze allowed me to determine the location of the crash (and, eventually, the solution).
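For the curious, the trick looks roughly like the sketch below. The register is simulated here by a plain global so the code can run anywhere; on real hardware it would be a write through a volatile pointer to a console-specific address, and all of the names and color assignments are invented for illustration:

```cpp
#include <cstdint>

// Stand-in for a memory-mapped background-color register. On a real
// console this would be: *(volatile std::uint32_t*)BG_REG = rgb;
static std::uint32_t g_bg_color_reg = 0;

inline void debug_color(std::uint32_t rgb) {
    g_bg_color_reg = rgb;
}

// Each stage of the boot path gets a distinct color. Whatever color is
// left on screen when the machine hangs tells you how far it got.
bool load_boot_data(bool read_succeeds) {
    debug_color(0xFF0000u); // red: entered the loader
    debug_color(0x00FF00u); // green: media opened
    if (!read_succeeds)
        return false;       // a hang here leaves the screen green
    debug_color(0x0000FFu); // blue: read complete
    return true;
}
```

No I/O, no libc, no debugger – just one store per milestone, which is why it works in environments where nothing else does.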

Me and Google

Two unrelated stories:

  • We’ve both recently been the target of hacking attempts originating from China. Of course, the attack they suffered was much more serious and alarming than mine, which appeared to be a bot trying to log into my wireless router (running Tomato) and was mindlessly trying every entry in a password dictionary. I turned off remote access in the admin panel, and that was that.
  • I own a Garmin nuvi 200, and got accustomed to using the Communicator plugin with Google Maps to easily input addresses into it without having to use the touchscreen interface. Tonight, though, I tried to use it and discovered that Google Maps no longer shows the Send link off of which the browser integration hangs. Not sure why it’s no longer there, and an update to the plugin didn’t change anything, so it seems like it’s just broken. Argh.

Debugging!

Here’s another Reddit stub dealing with a topic that is near and dear to my heart: debugging! Unfortunately, the comments on that article seem to focus more on the “fuzzy” aspects of debugging – the “go home and mull it over while watching TV” kind of stuff, rather than more concrete debugging techniques.

Whenever I run into a bug whose cause is not immediately obvious, I have a standard bag of tricks that I fall back upon. Every programmer has a toolbox like this – I figured I would write about some of the techniques that I use, and why I use them. Some of them are not applicable to every situation, but there are still many that can be applied to any given bug. These are presented in no particular order.

  1. Change the inputs.

    Many times, by changing the inputs to a function, you can cause a recognizable change to occur in the output. This helps you to envision what’s actually happening in the function, and where things might be going wrong. This can include changing parameter values, input files, textures, etc.

  2. Do things in a different sequence, or with different timing.

    Pretty self-explanatory, and related to the first item. The idea is to observe differences in the behavior of the program in similar circumstances. This is mostly useful for interactive programs.

  3. Run all of your automated test code, even the slow tests, and examine any issues that are reported.

    This is helpful if for no other reason than as a sanity check.

  4. Check the logs.

    This one is pretty standard. Even though most debug logs tend to be overflowing with spam messages, you might still find a smoking gun in there.

  5. Ensure that you’re validating all of the return values of function calls.

    This is also known as the “be paranoid” rule, or perhaps the “re-check all of your assumptions” rule. It’s easy to forget to check return values, but it’s crucial to do so. Code that silently ignores failure can cause problems or symptoms unrelated to the function call that actually failed.

    A related problem is returning pointers to objects on the stack – this will result in havoc, since their storage is reclaimed as soon as the function returns.
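To make the stack-return pitfall concrete, here’s a minimal sketch (the names are made up). Most compilers warn about the buggy form; take that warning seriously:

```cpp
#include <cstring>
#include <string>

// BUG: returns a pointer into the function's own stack frame. The
// buffer's storage is reclaimed on return, so the caller reads garbage.
//
//   const char* make_greeting() {
//       char buf[32];
//       std::strcpy(buf, "hello");
//       return buf; // dangling!
//   }

// Fix: return an object that owns its storage.
std::string make_greeting() {
    return "hello";
}
```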

  6. Ensure that pointer values get cleared out when the struct or object to which they point is freed.

    Using stale pointer values is a surefire way to get in trouble. Clearing them out when the associated object is freed (except in very, very special circumstances) will help keep you sane.
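One way to enforce this in C-style code is to free through a pointer-to-pointer, so the caller’s copy gets nulled in the same place the memory is released (hypothetical names, just a sketch):

```cpp
#include <cstdlib>

struct Node { int value; };

// Nulling the caller's pointer at the point of free means a later
// accidental use fails fast and predictably, instead of silently
// reading whatever now occupies the stale memory.
void destroy_node(Node** node) {
    if (node && *node) {
        std::free(*node);
        *node = nullptr;
    }
}
```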

  7. Check for any masked exceptions in managed code.

    I wrote about this the other day. Ensure that no unexpected exceptions are being silently masked in your code.

  8. Check the data.

    Make sure that the data you’re trying to use is actually valid! The GIGO principle is as true as ever. Check for data out of the expected range, QNaNs being generated (which are infamous for screwing up subsequent floating-point operations), the ordering of data, legacy data, and correct offsets/sizes of data.
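A cheap data check can rule a lot of this out early. Here’s a hypothetical range-and-NaN validator for a buffer of floats – NaN comparisons are always false, which is exactly why they slip through naive range checks:

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Rejects NaNs (which silently poison later arithmetic) and any value
// outside the expected [lo, hi] range.
bool validate_samples(const std::vector<float>& samples,
                      float lo, float hi) {
    for (float s : samples) {
        if (std::isnan(s)) return false;    // QNaN in the data
        if (s < lo || s > hi) return false; // out of expected range
    }
    return true;
}
```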

  9. Put in additional logging or debug visualization code.

    This can help provide additional information, but this strategy can also backfire, as it can significantly change the timing of your code. (Disk, socket, and/or pipe I/O are relatively expensive operations.) Use with caution. If you have general-purpose code for validating the state of the application, sprinkle calls to that code throughout the application – this can be useful in determining when things go off the rails.
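The “sprinkle validation calls” idea can be as simple as this sketch (the state and invariant are invented). The point is that calling it between phases tells you *when* the state went bad, not merely that it did:

```cpp
#include <cstdio>

// Hypothetical application state with one cheaply checkable invariant.
struct World {
    int live_objects = 0;
    int total_objects = 0;
};

// General-purpose state check; call it after each phase ("update",
// "render", ...) to bracket the moment things go off the rails.
bool validate_world(const World& w, const char* where) {
    bool ok = (w.live_objects >= 0) && (w.live_objects <= w.total_objects);
    if (!ok)
        std::fprintf(stderr, "state invalid after %s\n", where);
    return ok;
}
```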

  10. Check the crash dump, if you have one.

    Post-mortem analysis of core dumps is often extremely useful in tracking down bugs that you didn’t personally witness, or for which the steps to reproduce are lengthy or time-consuming.

  11. Step through the code in the debugger.

    It can sometimes be quite slow if you’re processing large data sets, but it’s often the easiest way to monitor the control flow of a function.

  12. Inspect the disassembly.

    This sounds hardcore, but being able to do this is invaluable in some cases. It’s useful not only for checking the compiler’s output, but also for cases where you’re examining a minidump of managed code. Unless you’re working with CLR 4 and Visual Studio 2010, opening these minidumps in WinDbg results in callstacks that don’t include line numbers. You do have the instruction pointer value, though, so you can actually print out the disassembly with the !u SOS command, and compare the disassembly with the original source code of the function to figure out the exact point at which the crash occurred.

  13. Inspect memory.

    If you have pointer problems, look at the contents of memory in the debugger to try and figure out what’s going on. A frequent problem is an invalid offset, which results in struct member values “shifting” forwards or backwards. It helps to be familiar with memory representations of things like floating-point numbers – knowing a couple of common values (0x3f800000 == 1.0f, etc.) can be very handy.
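If you want to check those bit patterns programmatically rather than by eye, the well-defined way to get at a float’s representation is memcpy, not a pointer cast (which is undefined behavior under strict aliasing):

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw IEEE-754 bit pattern of a 32-bit float, so values
// like 0x3f800000 (1.0f) are easy to recognize in a memory view.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```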

  14. Use conditional or memory breakpoints to isolate the bug.

    If you know that a particular object or memory address is related to the bug, you can set up breakpoints to pick out a particular loop iteration or write to a memory location. In cases where you’re interacting with a large body of unknown code, memory breakpoints can be particularly useful for tracking state changes.

  15. Try a different build configuration (or turn on asserts).

    This is intended to provoke behavior changes, add validation, and otherwise provide additional data points for determining exactly what’s going on. Turning on validation such as array bounds checks and heap checking can help find some tough bugs (albeit at a tremendous cost in execution speed).

  16. Run on a different platform, or build with a different compiler.

    There are often significant differences in timing and other behavior when you run software on a different platform. Endianness and word size also often differ between platforms, which can expose problems or bad assumptions about data in code. Like changing the inputs, careful observation of these differences can help you get an understanding of what’s actually happening. Additionally, if you are using a different compiler, you may see different warnings or code behavior due to optimization.

  17. Check the version control history for anything suspicious.

    Examining changes to the source code can give you a good idea of how the behavior of the program has changed (even if you don’t know much about the changed code to begin with), and can shed some light on a bug. Checking all of the cases where a function is called to ensure that they take into account any changed behavior is essential.

  18. If your codebase easily allows you to do so, try running the buggy code synchronously instead of asynchronously, for testing purposes.

    This can help determine if a race condition is what’s causing the bug to occur. (It should be noted that I’m not crazy about just adding random sleeps into an asynchronous function to determine this – it’s too unreliable for my tastes.)
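If the codebase doesn’t already have such a switch, it can be as small as this sketch (names invented): a single flag that routes the work onto the calling thread. If the bug vanishes in synchronous mode, a race condition becomes the prime suspect.

```cpp
#include <functional>
#include <future>

// Dispatch a job either synchronously (deterministic: same thread,
// same ordering) or on a background thread, controlled by one flag.
int run_job(const std::function<int()>& job, bool run_async) {
    if (!run_async)
        return job();
    std::future<int> f = std::async(std::launch::async, job);
    return f.get();
}
```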

  19. Check (and re-check) your data dependencies in asynchronous code.

    Don’t fall into the trap of trying to envision multithreaded code by imagining each possible combination of instruction pointer values. Instead, when trying to prove correctness, focus entirely on data dependencies and ensuring that locks are used correctly and respected. (For deadlock bugs, inspect the order in which locks are taken, and check for proper use of back-off and other algorithms for avoiding deadlock.)

    Note that applying these techniques won’t help you write optimal multithreaded code – this requires much broader insight into the particular algorithm in question, and the overall architecture of the code. However, they will help in tracking down correctness issues.
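As one concrete lock-ordering fix: when two locks must be held at once, acquiring them through std::scoped_lock (which uses a deadlock-avoidance algorithm under the hood) removes the classic deadlock where two call sites nest the same locks in opposite orders. The account example below is hypothetical:

```cpp
#include <mutex>

struct Account {
    std::mutex m;
    int balance = 0;
};

// Two concurrent transfers in opposite directions can't deadlock here,
// unlike two nested lock_guard acquisitions in inconsistent order.
void transfer(Account& from, Account& to, int amount) {
    std::scoped_lock lock(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}
```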

  20. Turn off chunks of the code, or switch to a different implementation of an interface.

    If you have multiple providers of an interface available, try using a different one, and see how the behavior of the program changes. (Using null/echo interfaces is a common debugging technique.) Additionally, you can try disabling features of your application to see if they are somehow related to the bug.
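A null provider costs almost nothing to write. In this made-up sketch, swapping NullLogger in for the real implementation tells you whether the bug lives in the provider or in the code that calls it:

```cpp
#include <string>

// A minimal interface with two providers.
struct Logger {
    virtual ~Logger() = default;
    virtual void write(const std::string& msg) = 0;
};

// Null implementation: deliberately does nothing. If the bug persists
// with this plugged in, the provider is exonerated.
struct NullLogger : Logger {
    void write(const std::string&) override {}
};

// A trivial "real" provider for comparison.
struct CountingLogger : Logger {
    int lines = 0;
    void write(const std::string&) override { ++lines; }
};
```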

  21. Use third-party validation tools and/or debugging information.

    This includes things like the Direct3D debug runtime, FxCop, lint, valgrind, the Visual C++ runtime debug heap functions, the Application Verifier, and the checked build of Windows. The more debugging aids you have active, the more likely that you’ll get a clue upon which you can act.

  22. If the code is unfamiliar, find out who wrote it, and start asking them basic questions about it.

    This is similar to the “be paranoid” rule, except that by asking all of these basic questions of the author, you’re forcing them to re-check all of their assumptions. It’s not uncommon to have a eureka moment while explaining a bit of code to someone else.

  23. Try turning off optimizations for a chunk of code.

    My experience has been that people tend to fall back on the “it must be a compiler bug” explanation way earlier than they really should. Nevertheless, turning off optimizations for a section of code might help you debug a problem that occurs in optimized builds. (Whether it’s a genuine optimizer bug, or, say, a misuse of the C99 restrict type qualifier, is for you to find out. Anyone interested in using the latter, incidentally, should really read this excellent article by Mike Acton on the topic.) Performing a quasi-“binary search” when turning off optimizations can help minimize the time spent searching for the problem code snippet for a genuine optimizer bug.

  24. Try running on a different machine, or piece of hardware.

    Hardware failure is another bug explanation of which programmers tend to be a little too fond. However, it does happen occasionally, so it’s definitely something worth testing if you run out of other ideas.

That’s pretty much all I can think of at this point. There are a few more tips that spring to my mind, but they are pretty specific to Windows or Visual Studio development, so I won’t recount them here.

First Chance .NET Exception Handling

I saw this article posted on Reddit’s programming feed*, which talks about a Visual Studio debugging technique for getting first crack at exceptions, before any upstream handlers run. The Debug->Exceptions dialog can be used to set the debugger to break before any exception handlers fire for a particular exception. This is useful not only for debugging code that interfaces with third-party libraries (where it is often unclear why an exception might be thrown), but also your own code. Why?

Imagine that you have some code that runs in an interactive session, but which has a high-level catch block to catch and report errors. This catch block may attempt to continue execution when an error occurs – for example, writing one file in a batch might fail, but the code should continue writing other files. Unfortunately, the exception log produced by this might not provide sufficient information to debug the issue. For example, many of the I/O exception types fail to include information about which file or directory was being modified when the exception was thrown!

In tracking down cases like these, it’s often easier to set the debugger to catch that particular exception at the first chance it gets, and then examine the state of the calling function where the exception was thrown. Having both a call stack and the specific line of code where the exception was thrown often allows you to see the problem immediately, without further investigation.

Another illuminating debugging exercise is to turn on first chance exception handling for all exceptions, and then see where they are thrown. Code that masks exceptions (by using an untyped, empty catch block) is particularly nasty, as it can leave the program in an ill-defined state, but without any feedback to indicate that something failed. A different problem results from the frequent use of exceptions: they have a significant negative performance impact. Turning on first chance exception handling makes it trivial to find these cases, if you hadn’t already noticed the debug window spam that frequent exceptions tend to create.
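The exception-masking pattern isn’t unique to .NET; here’s the same pathology sketched in C++ terms (all names hypothetical). The commented-out form is exactly the untyped, empty catch block described above – the caller can’t tell anything failed, and the program limps on in a bad state:

```cpp
#include <stdexcept>
#include <string>

int files_written = 0;

// Hypothetical batch step; throws on bad input.
void write_file(const std::string& name) {
    if (name.empty())
        throw std::runtime_error("empty file name");
    ++files_written;
}

// BAD: the failure vanishes without a trace.
//   try { write_file(name); } catch (...) {}
//
// Better: if you must continue the batch, at least surface the failure.
bool try_write_file(const std::string& name) {
    try {
        write_file(name);
        return true;
    } catch (const std::exception&) {
        return false;
    }
}
```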

 

 

* There’s a lot of stuff that gets posted there that gets my (figurative) blood pressure up, so we’ll see how long I stick with it. But I figured it would be a good exercise to read it regularly and write about things I see there.

Windows 7 on my netbook

I decided to try running Windows 7 Ultimate on my netbook (an Asus Eee PC), since I had heard that it was nearly as fast as XP (and much more modern). I installed it by burning the ISO (acquired by virtue of being an MSDN subscriber) and then booting the netbook from an external DVD drive. I am happy to report that Windows 7 works like a charm, with the only major wrinkle being that I had to reinstall the ACPI drivers along with some of the other Asus utilities to get the special function keys working as they did before. (The link there is a decent summary of what is required, although I actually came across this information elsewhere.)

Windows 7 takes longer to boot than XP on the Eee PC, but performance is comparable once the OS is loaded — I have no complaints about the experience so far. The only other strange thing is that the video driver seems to have issues with coming out of sleep mode — the nice thing is that the Vista driver model allows the OS to completely restart the graphics system, so the machine actually recovers gracefully from this. I’m hoping that there will be future driver updates to solve this issue, but it’s not a big deal at this point.

Using GPPG and GPLEX with Visual Studio

Here’s a quick note on a problem I ran into a while back. I was using the GPPG and GPLEX parser tools as part of a Visual Studio project – the input files for these tools generate C# source files which are then compiled into the project. However, I noticed a problem with the recommended project setup (basically, setting up a MSBuild .targets file for GPPG and GPLEX source files, and then including that target file in the project). Changes that I made to the grammar would only take effect the second time I built the project. The samples and documentation for MPPG and MPLEX (earlier versions of GPPG and GPLEX) are silent on this issue. After inserting some debug code into my build targets, I determined that the files being output by the GPPG and GPLEX target handlers were correct, but the compiler was still using the older versions of the files. It seemed like it was using a cached copy of the old version of the grammar.

As it turns out, there is a bit of chicanery going on inside Visual Studio that results in this behavior. Visual Studio actually runs the C# compiler in-process, as an optimization to avoid process start overhead. This in-process compiler gets fouled up, however, when C# source files are generated as part of a build step – it seems to load all of the source files when the build starts, so it winds up using the old version instead of the freshly generated one.

The solution is to add the UseHostCompilerIfAvailable property to the .csproj project file, like this:

<PropertyGroup>
	<UseHostCompilerIfAvailable>False</UseHostCompilerIfAvailable>
</PropertyGroup>

This will force Visual Studio to use the out-of-process compiler, which will cause the correct version of the grammar to be built. Building the project will be a little slower, but it’s better than having to build twice!

Big Trouble

The D-League is undergoing quite a bit of turmoil at the moment. The Arsenal have left town, and the Bakersfield Jam’s owner, in one of the more amusing euphemisms I’ve ever seen, has declared that “I wouldn’t say we’re folding. We’re just not going to operate anymore.” The Albuquerque Thunderbirds have laid off all of their staff, and even the D-League champion Colorado 14ers may be folding up their tent. While the D-League isn’t threatening to turn into the USBL or ABA, this is still cause for concern.

When even NBA teams are having money issues, it’s no surprise that the minor leagues are also suffering. Anecdotally, the crowds have been quite a bit smaller this season — I’m guessing that the discretionary income of people who might be D-League customers in better times is severely restricted or non-existent right now. I’m not really sure what the answer to this problem is, apart from better cost controls and more aggressive promotions and community outreach. The best-attended Arsenal games tended to be ones that had community involvement — large groups performing during halftime or otherwise involved in some sort of festivities.

With the support of the NBA, I don’t think that the D-League is in any danger of going away, but the ownership groups of some teams may have dug themselves holes that will be difficult to escape. An additional complicating factor is that the NBA’s collective bargaining contract restricts the interactions of NBA teams and their D-League affiliates — with some reforms, I feel that NBA teams would have more flexibility to assign players and use the D-League, and build a stronger relationship with their affiliates.

As a final note, I was pleased to see that former Arsenal player Marcin Gortat came through in a big way for the Magic in game 6 of their series against the Sixers, with 11 points and 15 boards.

Anaheim Arsenal: 2006-2009

The Arsenal recently finished out their season with an epic 12-game losing streak. As it turns out, this will be their final season — the team announced that the franchise was being transferred to another organization, which will relocate the team to Springfield, MA for next season. I am, naturally, pretty disappointed, although the move is not entirely unexpected.

It’s sort of strange — I feel that, overall, the team personnel this year was a little bit better than last year. Kedrick Brown, Cedric Bozeman, and James White all had solid years, and I really feel like Brown in particular improved his game quite a bit since last year. However, it seems like the team’s offense relied too much on a philosophy of more shooting from the guards this year — I feel like the team’s ball-handling took a dive this year, and it really hurt them down the stretch in close games. (And this is even after the team parted ways with Tierre Brown, who never met a shot he didn’t like…) There were a number of games where the Arsenal were close or slightly ahead in the final minutes, but had key turnovers on the offensive end that were the difference makers. The lack of solid ball-handlers, and guards who could drive and dish under pressure, seems like the biggest difference between last year’s team (which had a solid second half and very nearly climbed back to .500) and this year’s, which finished tied for the worst record in the league.

The one player who I feel was pretty much a total bust was Malick Badiane, who had “project player” written all over him. He didn’t have much of a shot, particularly once you got past about 5 feet from the basket, didn’t rebound well, didn’t pass well, didn’t have great court awareness (particularly on the offensive end without the ball), was only an okay defender, and was prone to committing poor fouls. In short, not a heck of a lot of upside for a big man.

Sadly, it also felt like the team quit down the stretch — I’m not sure if they knew the news about the move, or if there was contention between the players and the coach, but there was a palpable lack of enthusiasm and team play as the season went on. I don’t have any sort of inside information on this matter, but it was pretty clear to me as someone who was at nearly every home game.

The expanded D-League playoffs are about to begin, but it’s hard for me to muster up a lot of enthusiasm about it. This is the first time that a team I followed and rooted for has simply ceased to exist, and without the prospect of a local replacement (I can’t support the D-Fenders, if only because of the Laker connection), I find myself in the strange position of feeling my fandom being taken away from me, instead of me drifting away from it.

Broken Headset

A few weeks ago, my beloved Plantronics Discovery 925 started acting up — the microphone on it was apparently failing. People that I called using the headset said that they could only hear me very faintly, although I could still hear them just fine. A couple of support e-mails to Plantronics later, and I received a refurb replacement. I’m not thrilled with the fact that it’s a refurb, but it seems to work just fine.

Normally I would be annoyed that something I bought failed so soon after I bought it, but I like the headset so much that I felt more relief than anger when I got my replacement.