Free software by Richard A. O'Keefe

Stupid Shell Tricks

stupid-shell-tricks.tar is a Unix tar(1) file containing a README and 5 little C programs. Compiling and running the program sets up some tiny example directories. The README file gets you to see some of the strange things that can happen with Unix file names. The first example requires a UTF-8 locale; if you don't have en_NZ.UTF-8, any other UTF-8 locale will do.

Stream processing for Erlang

stream.erl is an Erlang module implementing Coutts-style streams in Erlang.

A faster random number generator for Java.

Random48.java uses the same 48-bit linear congruential generator as java.util.Random, but designing it not to be shared between threads (which doesn't really make sense for a random number generator anyway) makes it 5 times faster. It also provides a useful .toString() and a constructor that takes a String() argument, with the text representation looking like a Java identifier, so you can easily save and restore Random48 states using plain text files. RandomTime.java is the program I used to measure the speed of the generator.

Linear congruential generators have known limitations, so if you seriously want the best available generator, this isn't it. This is for you if you were happy with the results from java.util.Random but wished it were not so sluggish.

Lists with O(1) concatenation and reversal

The Reverse-Append-Unit-eMpty data structure is a list-like data structure that has constant time append and reverse, and linear time batch operations. Head and tail are not constant time; they are not even logarithmic time in the worst case. This is not a very sophisticated data structure, but that's the point: to show that quite an unsophisticated data structure can do surprisingly well. This version is in SML and works with SML/NJ 110.70 and MLton June 15, 2009.

ANSI Smalltalk Compiler and library

For some years I have been building a static compiler for ANSI Smalltalk. This compiles a variant of ANSI Smalltalk -- all the ANSI classes and methods in ansi.st, plus a lot more in other files -- to ANSI C89, which can then be compiled by gcc or Sun's C compiler. It is routinely tested on SPARC/Solaris and Intel/OSX; it is also periodically built and tested on Intel/OpenSolaris, Intel/Linux, and Intel/Windows Vista + SUA. Some things are still missing or incomplete:

The main thing that's woefully incomplete is the documentation.

Perhaps the main departure from Smalltalk common practice is that I will not use (though I do provide) #shouldNotImplement. Of course, having a batch compiler instead of an IDE, and allowing embedded C instead of calling primitives, aren't traditional either.

While it's not finished it is already useful and all of the code is completely free to anyone who wants it for any purpose as long as they don't claim credit for it. In particular, Smalltalk code in the libraries may be used freely by the maintainers of any Smalltalk system, including commercial ones.

The code is in astc-1406.tar.

One thing I was particularly interested in was just how well a fairly naïve compilation-via-C strategy would go, especially considering that dynamic dispatch is done by binary search -- an idea swiped from SmallEiffel/SmartEiffel. The answer is that it goes very well, about as well as VisualWorks non-commercial, sometimes better, sometimes worse. The latest test, processing a 180 MB XML file, took 12 times as long as the corresponding C code, but considering all the work it takes to stuff a character into a string, that's going to be hard to improve much without flow sensitive type inference, which currently isn't tried. This was after all started to provide a naïve reference point.

The major practical problem is that the C file it produces tends to be large. The system is no longer a single big blob of code that happens to be spread across multiple files. It is now easy to pull in just the pieces you want. Some pieces are still large -- the ANSI core will never be small -- and some are quite small.

Comparing files against model answers

pcfpcmp was written for Programming Contest judges to use, so that problems with floating point numbers in their output could be used. Details are in the file README.pcfpcmp and source code and an example are in an uncompressed `tar' file pcfpcmp.tar (33kB).

Token styling and extracting for several programming languages

m2h is a program that can tokenize most of the programming languages I use, after a fashion. It can be used for at least five purposes:

The sources are provided as a gzipped tar file, m2h.tgz. Documentation is in the README file, plus in core.c.

Beware: this is not polished code, to put it kindly. There are a couple of features that are not implemented yet. It's fairly easy to add new languages except for a minor issue (which has held up support for Lisp and Haskell) and a major one (which hasn't been a problem for me yet, but will be). The minor issue is that the tokenising framework doesn't handle nesting comments. The major issue is that there is no support at all for wide characters or for encodings other than ISO Latin 1, and as part of this, C/C++/Java Universal Character Names (\uxxxx \Uxxxxxxxx) are not processed correctly. It's free software and worth what you paid for it, but I have found it very useful and you may too.

Beware: this is not a pretty-printer, just a token styler. It does not add or remove line breaks or indentation for any language. Although I've had some ideas about adjusting inter-token spacing, they have NOT been fully implemented. All this does is filter tokens and maybe add some markup.

UnRolled Strict Lists

Unrolled strict lists are spine-strict lists which have been unrolled, in this case by a factor of four, so that each "box" contains 4 elements, not 1. This saves space, and should also save time. I've provided versions for Haskell (Ursl.hs) and Clean (Ursl.dcl and Ursl.icl). The Clean versions were for Clean 1.3 and have not been tried in Clean 2.x. If anyone is interested I'll be happy to whip up an SML version. An Erlang version also exists, but has not yet been tested.

Four XML tools:

The files are