Lua + iPhone = mess

And now one of the rare not-Missions-related posts.

I found myself with the need to run Lua code under iOS. Yes, this is legal according to the current Apple Developer Agreement. Who knew the journey I’d undertake in the process.

Originally, I got it running by just building the Lua source files into a static library and linking it in, then using the C API as usual. Worked like a ruddy charm. But then I decided to get clever. “Wouldn’t it be great,” I thought, “if I could run the bytecode compiler on the source and make them into unreadable bytecode objects?” So originally, I tried piping the source files through luac and running them otherwise as usual. The story of how I got Xcode to automate this process without a damned Run Script build phase is another one entirely.

Bad header in precompiled chunk.

Docs say, “The binary files created by luac are portable only among architectures with the same word size and byte order.” Fine. x86_64 and armv[67] are both little-endian, but sizeof(long) on x86_64 is twice what it is on armv7. Duh! Running Stuff on Apple Systems 101. Universal binary, here I come! I built lua 32-bit and lipo’d myself a nice universal binary of luac, ran it via /usr/bin/arch -arch i386.

Bad header in precompiled chunk.

What? Byte order and word size are correct. So I delved into the Lua source code. lundump.c, lines 214-226. Precompiled chunks are checked for version, format, endianness, size of int, size of size_t, size of Lua opcodes, size of the lua_Number type, and integralness of lua_Number. All of which should have been correct- until I remembered that I’d changed the luaconf.h of the iOS binary’s Lua to make lua_Number a float. It’s a double by default.

Rebuild my universal binary of luac using the same luaconf.h the iOS project uses. Run it through yet again. Lo and behold, it worked that time. It doesn’t really feel right to me running i386-compiled bytecode on armv7, but since Lua doesn’t even remotely have support for cross-compilation and I don’t feel like jumping through the hoops of making a luac utility on the device without jailbreaking, it’s the best I can do.

I would be remiss if I didn’t remark also upon the journey of learning how to make Xcode automate this process for me. There was more than just running luac to be done. I wanted my Lua scripts run through the C preprocessor, to get the benefits of #include and #define. Easier said than done. The C preprocessor doesn’t recognize Lua comments, and while because comment stripping is part of preprocessing I could have used C comments instead, it would have meant changing a goodly bit of code, and messed with the syntax coloring. And more importantly, my sense of code aesthetics. It’s always bothered me that the C preprocessor can be both Turing-complete and damn near impossible to work with (an aptly-named Turing tarpit). So I wrote a pre-preprocessor (also in Lua, naturally) to strip the Lua comments out first. But then I had to parse and handle #include manually. Oh well. The real benefit of C preprocessing is the macros anyway. It was quite an interesting bit of work making Lua talk to clang; the built-in library for doing things is a little bit lacking. Anyway, the upshot was there were three steps to processing Lua files in the iOS project: preprocess, luac, copy to Resources.

I finally caved in to my sense of danger and went looking for the extremely scanty reverse-engineered documentation on Xcode plugins. It was pretty awful. The API looks to be quite a mess of inconsistently named attributes with strange side-effects. It took me two hours to hunt down the cause of a “empty string” exception as being the use of $(OutputPath) inside a cosmetic attribute (Rule name). It was hardly perfect even when I declared it done, and then I realized I had the architecture problem again. I had to run a different luac for x86_64 than for everything else. If i386 was the only other architecture, I could’ve just let it be done by a universal binary, but no, it had to cover armv[67] too. Ultimately it turned out a second plugin was necessary, lest Xcode be sent into a downward spiral of infinite recursion at launch. Ugh. Don’t talk to me about all the horrifying effects the tiniest typo could have on Xcode. I love the one in particular where the application was fully functional, except for builds and the Quit command. Uncaught Objective-C exceptions equals inconsistent program state, people. And it’s not just that the API is entirely undocumented. You get those sorts of weird behaviors even if you never install a single plugin or specification file; the Apple developers don’t always get it right either. The application is a horrid mess, and I find myself desperately hoping that Xcode 4 is a full rewrite. I can’t discuss anything about Xcode 4 due to NDA, of course.

As a side note to all this, compiling extension modules for Lua also turned out to be an unmitigated nusiance. It turns out, you see, that the extension authors out there tend not to run their code on Darwin-based systems, and so all the ludicrous quirks of dyld tend to hit a user of their code smack in the face. Finding out that I needed to pass -bundle -undefined dynamic_lookup to the linker instead of -shared was easy enough from the Lua mailing list posts on the subject. Figuring out why that didn’t work either meant realizing I’d built Lua itself with -fvisibility=hidden for no good reason at all, causing the dlopen() calls to crash and burn. Figuring out why my self-written pcre interface module randomly crashed with a bad free() during Lua garbage collection meant debugging and valgrinding like nuts until I found out you’re not supposed to link liblua.a to the extension module. Static library + linked to both loader and module = two copies of the code. Anyone’s guess which you get at any given time. One could only wish dyld was able to make itself aware of this ugly situation.

If anyone’s interested, give a holler in the comments and I’ll post the pcre extension as an opensource project. It’s not fully featured (in particular it’s missing partial and DFA matching, as well as access to a few of the more esoteric options of the API), but it does work quite nicely otherwise. It’s built versus PCRE 8.10 and Lua 5.1, and is upwardly compatible with the still-in-progress Lua 5.2 as it currently stands.

I promise I’ll get some work on Missions done this week!

The utility of a scripting language.

I feel like quite a geek. I had some text copied from my IRC client that I wanted to transform to XML for my XSLT sheet to display all nicely on the Web interface. Format of a line copied from the client:

altered nickname<tab><tab>message<tab>hh:mm:ss<space><AM or PM><carriage return>

Correctly formatted XML for the XSLT sheet:

<message><time>unix timestamp</time></time><type>2</type><sender>correct nickname</sender><content>message</content></message>

How to transform this? I could’ve done the majority of the work with a PCRE regexp and search/replace, but that wouldn’t have fixed the nicknames (since you can’t make if/else decisions in a replace in most editors) or calculated the correct UNIX timestamps. So I turned to scripting, of course. Some would have chosen to use Ruby, others Python, or Perl, or possibly even bash for some masochistic reason. I chose PHP.

Took five minutes, most of which was spent constructing the regexp. The code:

<?php

$conversation = file_get_contents(__FILE__, false, NULL, __COMPILER_HALT_OFFSET__);
$valid_nicks = "nick1|nick2|nick3|nick4|nick5";
preg_match_all('/^('.$valid_nicks.')(?:\t+)([^\n\t]+)(?:\t+)(\d+):(\d+):(\d+)[ ]([AP]M)$/mSu', $conversation, $matches, PREG_SET_ORDER);
$xml = "";
$time = time();
foreach ($matches as $splitline) {
    $nick = $splitline[1];
    $message = $splitline[2];
    $hour = $splitline[3];
    $minute = $splitline[4];
    $second = $splitline[5];
    $meridian = $splitline[6];
    if ($nick === 'nick1' || $nick === 'nick2') {
        $nick = 'real_nick1and2';
    } else if ($nick === 'nick3' || $nick === 'nick4') {
        $nick = 'real_nick3and4';
    }
    ++$time; //mktime($hour + ($meridian === 'PM' ? 12 : 0), $minute, $second, date('n'), $meridian === 'PM' ? 1 : 2, date('Y')));
    $xml .= "<message><time>{$time}</time><type>2</type><sender>{$nick}</sender><content>{$message}</content></message>\n";
}

echo $xml;

__halt_compiler();
// the conversation was pasted here

I daresay that was a pretty cheaply elegant bit of work, if I may be allowed to pat myself on the back. Entirely trivial stuff, but it shows how useful scripting can be for some tasks. How inane would that conversion have been, replacing the nicks by hand and calculating the timestamps one at a time? The conversation was about 500 lines long. Yay scripting.

Please, don’t comment with a one line Perl script to do the same thing from STDIN, I’m well aware you can use Perl to compress any complexity down to what looks like a couple hundred bps of line noise :-D.

P.S.: I am fully aware that the code has several inefficiencies, odd-seeming decisions, things that could’ve been done better, and so on, and so on. Who cares? It works. It’s not meant to win design awards.