Feeling stupid when writing code No Comments
We all make mistakes when coding, and some of them are quite deeply foolish. But this particular snafu I just made takes the cake, in my opinion (dumbed-down code):
// Written as a static inline C function for max // speed; this is called in a very tight loop, usually // at something like 40,000 times per second. static inline uint8_t _readByte(NSUIntger *cursor, const char *bytes, NSUInteger maxlen) { return (*cursor < maxlen) ? bytes[*cursor++] : 0x66; } // Macro for readability, known to be called only from the one function #define readByte() _readByte(&cursor, buffer, bufferlen) // Macro to skip so many bytes #define skipBytes(n) do { \ __typeof__(n) nn = (n); \ for (uint8_t i = 0; i < nn; ++i) \ readByte(); \ } while (0) - (void)processBuffer { // assume the declarations of n, cursor, buffer, bufferlen // ... n = readByte(); // ... else if (n > 0x29) skipBytes((uint8_t[16]){0,0,0,1,1,2,0,0,0,0,2,2,3,3,4,4})(n & 0xF0) >> 4); }
See it? (Hint: Look at the implementation of skipBytes().)
The correct implementation of skipBytes() is:
#define skipBytes(n) cursor += n
You tell me what’s faster for skipping 4 bytes, adding 4 to an unsigned integer, or four repetitions of compare (i < n), branch (for body), compare (cursor < maxlen), branch (compare true), dereference (into bytes), add (cursor++), add (++i), branch (top of for). The compiler might have optimized some or most of this out at -O2 or -O3 in the process of loop unrolling and function inlining, if the optimizer could manage to realize that the value read from bytes was never used, but Clang didn’t catch it, so…
In my defense, this was really an artifact of a previous implementation of the architecture, where the cursor wasn’t directly available to the -processBuffer method. The real moment of stupidity is when I rewrote the macro as this during the refactoring:
#define skipBytes(n) do { \ switch (n) { \ case 4: readInt(); break; \ case 3: readByte(); /* fall through */ \ case 2: readShort(); break; \ case 1: readByte(); break; \ } \ } while (0)
I had just finished writing the while (0) when it hit me how very insane that was. And I wondered why buffer processing was taking user-perceptible amounts of time for small sizes of buffer. Now imagine it the way it looked before…
- (uint8_t)readByte { return (cursor < [buffer length]) ? [buffer bytes][cursor++] : 0x66; }
That’s three calls to objc_msgSend() for every single byte read. Keeping in mind that even readShort() and readInt() were implemented by calling readByte() 2 or 4 times. It was nice only because it meant I didn’t have to endian-swap the resulting integers. This is a 64-bit application which declares itself as requiring Mac OS X 10.6 as a minimum version; why was I writing code reminiscent of parsing data on an 8-bit machine?
Moral of the story: You don’t have to be stupid to do stupid things in code.
January 24, 2010 at 12:03 am