Tag Archives: php

The utility of a scripting language.

I feel like quite a geek. I had some text copied from my IRC client that I wanted to transform to XML for my XSLT sheet to display all nicely on the Web interface. Format of a line copied from the client:

altered nickname<tab><tab>message<tab>hh:mm:ss<space><AM or PM><carriage return>

Correctly formatted XML for the XSLT sheet:

<message><time>unix timestamp</time><type>2</type><sender>correct nickname</sender><content>message</content></message>

How to transform this? I could’ve done the majority of the work with a PCRE regexp and search/replace, but that wouldn’t have fixed the nicknames (since you can’t make if/else decisions in a replace in most editors) or calculated the correct UNIX timestamps. So I turned to scripting, of course. Some would have chosen to use Ruby, others Python, or Perl, or possibly even bash for some masochistic reason. I chose PHP.

Took five minutes, most of which was spent constructing the regexp. The code:


<?php
// Read everything after __halt_compiler(); below, i.e. the pasted conversation.
$conversation = file_get_contents(__FILE__, false, NULL, __COMPILER_HALT_OFFSET__);
$valid_nicks = "nick1|nick2|nick3|nick4|nick5";
// Captures: 1 = nick, 2 = message, 3-5 = hh, mm, ss, 6 = AM or PM.
preg_match_all('/^('.$valid_nicks.')(?:\t+)([^\n\t]+)(?:\t+)(\d+):(\d+):(\d+)[ ]([AP]M)$/mSu', $conversation, $matches, PREG_SET_ORDER);
$xml = "";
$time = time();
foreach ($matches as $splitline) {
    $nick = $splitline[1];
    $message = $splitline[2];
    $hour = $splitline[3];
    $minute = $splitline[4];
    $second = $splitline[5];
    $meridian = $splitline[6];
    // Map the altered nicknames back to the real ones.
    if ($nick === 'nick1' || $nick === 'nick2') {
        $nick = 'real_nick1and2';
    } else if ($nick === 'nick3' || $nick === 'nick4') {
        $nick = 'real_nick3and4';
    }
    // One second per message keeps the order; the real calculation is commented out.
    ++$time; //mktime($hour + ($meridian === 'PM' ? 12 : 0), $minute, $second, date('n'), $meridian === 'PM' ? 1 : 2, date('Y')));
    $xml .= "<message><time>{$time}</time><type>2</type><sender>{$nick}</sender><content>{$message}</content></message>\n";
}

echo $xml;

// __COMPILER_HALT_OFFSET__ is only defined when the file calls __halt_compiler().
__halt_compiler();
// the conversation was pasted here
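
The commented-out mktime() call hints at the real timestamp calculation. Simply adding 12 for PM would turn 12 PM into hour 24 and leave 12 AM at noon, so a conversion needs to wrap the hour first. Below is a minimal sketch of one way to do that. The function name irc_time_to_timestamp and the explicit month/day/year parameters are assumptions made for illustration, as is treating the log times as UTC; none of this is from the original script.

```php
<?php
// Assumption: interpret the log times as UTC so the result is reproducible.
date_default_timezone_set('UTC');

// Hypothetical helper (not part of the original script): converts a 12-hour
// clock reading on a given calendar day to a UNIX timestamp.
function irc_time_to_timestamp(int $hour, int $minute, int $second, string $meridian, int $month, int $day, int $year): int
{
    // 12 AM wraps to hour 0 and 12 PM stays at hour 12;
    // every other PM hour gains 12.
    $hour24 = $hour % 12 + ($meridian === 'PM' ? 12 : 0);
    return mktime($hour24, $minute, $second, $month, $day, $year);
}

// Usage: five past midnight on 1 January 2010.
echo date('H:i:s', irc_time_to_timestamp(12, 5, 0, 'AM', 1, 1, 2010)), "\n"; // prints 00:05:00
```

Inside the loop, the call would receive $hour, $minute, $second and $meridian cast to the right types, plus whatever date the conversation actually took place on.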

I daresay that was a pretty cheaply elegant bit of work, if I may be allowed to pat myself on the back. Entirely trivial stuff, but it shows how useful scripting can be for some tasks. How inane would that conversion have been, replacing the nicks by hand and calculating the timestamps one at a time? The conversation was about 500 lines long. Yay scripting.

Please don’t comment with a one-line Perl script to do the same thing from STDIN; I’m well aware you can use Perl to compress any complexity down to what looks like a couple hundred bps of line noise :-D.

P.S.: I am fully aware that the code has several inefficiencies, odd-seeming decisions, things that could’ve been done better, and so on, and so on. Who cares? It works. It’s not meant to win design awards.
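
As an example of one such rough edge: the script interpolates the raw message text straight into the XML, so a chat line containing < or & would produce malformed output. A minimal sketch of an escaping variant follows. The helper name message_to_xml is hypothetical and not part of the original script; htmlspecialchars() with the ENT_XML1 flag is standard PHP (5.4 and later).

```php
<?php
// Hypothetical helper (an assumption, not the original code): builds one
// <message> element, escaping the text fields so the XML stays well-formed.
function message_to_xml(int $time, string $nick, string $message): string
{
    $flags = ENT_XML1 | ENT_QUOTES;
    $nick = htmlspecialchars($nick, $flags, 'UTF-8');
    $message = htmlspecialchars($message, $flags, 'UTF-8');
    return "<message><time>{$time}</time><type>2</type><sender>{$nick}</sender><content>{$message}</content></message>\n";
}

// Usage: the comparison operator and the ampersand are escaped in the output.
echo message_to_xml(1262304000, 'real_nick1and2', 'a < b & c');
```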

Highlighting source code in WordPress with Pygments

So I was playing around with the rather nice SyntaxHighlighter WordPress plugin. It worked pretty nicely. But it had some issues, IMO:

  1. It repurposes the class attribute to act like a pseudo-CSS ruleset with custom rules. To me that’s just ugly.
  2. It does all its work client-side. For some things that’s a nice touch, but for a blog with mostly static content it’s wasted time.
  3. It does almost everything in JavaScript. I have a personal distaste for working with JavaScript for no especial reason whatsoever.
  4. It’s LGPLv3. I prefer less restrictive licenses for OSS.

On the other hand, I know the Pygments syntax highlighter pretty well, and I had already written a style plugin for it to get the syntax coloring I like. But there’s no mature Pygments plugin for WordPress that I could find. So I did what any good programmer would do and set out to write my own. I have the working knowledge of PHP (I’m a core dev, for mercy’s sake) and Python to do it with, after all. Not so much WordPress’ plugin API, though, so I took the SyntaxHighlighter plugin as a starting point.
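
To make the server-side idea concrete, here is a rough sketch of how PHP could hand a snippet to Pygments and get HTML back. It is not the plugin's actual code, which this excerpt does not show. It assumes the pygmentize command-line tool that ships with Pygments is installed and on the PATH, and both function names are invented for illustration.

```php
<?php
// Sketch only. Assumes `pygmentize` is installed and on the PATH;
// the function names are hypothetical.
function pygmentize_command(string $lexer): string
{
    // -l selects the lexer (input language), -f the output formatter.
    return 'pygmentize -f html -l ' . escapeshellarg($lexer);
}

// Returns highlighted HTML, or null if pygmentize could not be run
// or exited with an error.
function highlight_with_pygments(string $code, string $lexer): ?string
{
    $pipes = [];
    $spec = [0 => ['pipe', 'r'], 1 => ['pipe', 'w'], 2 => ['pipe', 'w']];
    $proc = proc_open(pygmentize_command($lexer), $spec, $pipes);
    if (!is_resource($proc)) {
        return null;
    }
    // Feed the source code on stdin, then read the HTML from stdout.
    fwrite($pipes[0], $code);
    fclose($pipes[0]);
    $html = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    fclose($pipes[2]);
    return proc_close($proc) === 0 ? $html : null;
}
```

Because the work happens once on the server, the result could be cached alongside the post instead of being recomputed in every visitor's browser.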
