Optimization Tradeoffs

This recent post by Raymond Chen highlights some interesting memory optimization scenarios, and the wide-ranging impact of size optimizations. A quick summary:

  • Changing a structure definition so that a group of Win32 BOOLs occupy one bit each saves memory in the struct, because BOOL is typedef’d as int, so each flag otherwise takes a full four bytes.
  • Accessing the bitfield data members requires additional work (a couple of shifts, in the example). That costs a little time, plus some additional memory for the expanded code, so the break-even point for an overall memory win depends on the number of struct instances versus the number of places in the code that touch the data (a before-and-after sketch follows this list).
  • Packing data members in this fashion also prevents you from setting a data breakpoint on a particular packed flag or value.
  • Atomic read/write operations work on whole words, so it’s more cumbersome to manipulate the packed values safely in a multithreaded environment.
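
To make the first two points concrete, here is a minimal sketch (not from Raymond’s post; the struct and member names are invented for illustration) of the before-and-after definitions and the size difference you’d typically see:

    #include <stdio.h>

    typedef int BOOL;                   /* as in the Windows headers             */

    struct Before {                     /* each BOOL occupies sizeof(int) bytes  */
        BOOL visible;
        BOOL enabled;
        BOOL dirty;
        BOOL selected;
    };                                  /* typically 16 bytes                    */

    struct After {                      /* the four flags share one unsigned int */
        unsigned int visible  : 1;
        unsigned int enabled  : 1;
        unsigned int dirty    : 1;
        unsigned int selected : 1;
    };                                  /* typically 4 bytes                     */

    int main(void)
    {
        printf("Before: %zu bytes, After: %zu bytes\n",
               sizeof(struct Before), sizeof(struct After));

        /* The trade: touching a bitfield is no longer a plain 4-byte load or
           store.  The compiler loads the whole word, shifts and masks to read,
           and does a read-modify-write of the word to write.                   */
        struct After a = {0};
        a.dirty = 1;
        return (int)a.dirty;
    }

That read-modify-write of the containing word is also the root of the multithreading point above: you can’t update a single packed flag with one atomic store, so you end up synchronizing (or using an interlocked operation) on the whole word.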

Changing data members to bitfields has other effects, as well:

  • When making this change, you might do a quick grep on your code (as part of your cost-benefit analysis) to see where the data member is touched. “Ah ha,” you say, “it’s only touched in one or two places! This change is a no-brainer!” Of course, if those one or two places are part of inlined functions, the impact of the higher instruction count might be much larger than you think. It’s important to dig a little deeper than a quick grep when making these sorts of changes, to make sure you aren’t psyching yourself out. (And, of course, being able to provide concrete before-and-after measurements is also of paramount importance.)
  • Depending on your struct’s packing and alignment, you might not see any size benefit from the bitfield change at all: the compiler pads the struct out to a multiple of its alignment, so sizeof only shrinks if the savings are enough to drop the struct below the next alignment boundary (see the second sketch below).
  • One potential positive effect is that reducing the size of your structs may help with cache utilization – if you can cram more data into the cache, the added cost of accessing the packed data may be offset.
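
Here is another minimal sketch (again with invented names, assuming a typical 64-bit target) of the alignment point: the flags-to-bitfield change only pays off when the savings actually cross an alignment boundary.

    #include <stdio.h>

    struct ThreeBools {
        void *p;                        /* 8-byte alignment on a 64-bit target   */
        int   flagA;                    /* three ints: 8 + 12 = 20 ...           */
        int   flagB;
        int   flagC;
    };                                  /* ... padded up to 24 bytes             */

    struct ThreeBits {
        void *p;
        unsigned int flagA : 1;         /* the three flags fit in one int:       */
        unsigned int flagB : 1;         /* 8 + 4 = 12 ...                        */
        unsigned int flagC : 1;
    };                                  /* ... padded up to 16 bytes: a real win */

    struct OneBool {
        void *p;
        int   flagA;                    /* 8 + 4 = 12, padded to 16              */
    };

    struct OneBit {
        void *p;
        unsigned int flagA : 1;         /* still 8 + 4 = 12, padded to 16:       */
    };                                  /* no size win at all                    */

    int main(void)
    {
        printf("%zu %zu %zu %zu\n",
               sizeof(struct ThreeBools), sizeof(struct ThreeBits),
               sizeof(struct OneBool),    sizeof(struct OneBit));
        return 0;
    }

The smaller struct is also what drives the cache point: the same number of instances now spans fewer cache lines, which can offset the extra cost of accessing the packed data.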

The take-away from this exercise is that it’s important to weigh many different factors and effects when optimizing. Most of the real-world cases where I made these optimizations involved structs with hundreds or thousands of instances, so the tradeoff was definitely a win, but it’s still worth taking a moment to think through all of the consequences of such a change.

The comments on the post are amusing, running the gamut from “completely missing the point” (the many commenters insisting you should always, or never, use bitfields) to comments raising further valid considerations (such as the fact that bitfield layout is implementation-defined, and that trying to map bitfields to hardware structures is a potential endianness minefield). There’s also one rather insightful observation that made me laugh:

I must say: assembly language has no code of honor. Anything goes so long as it works.

So true.
