McWalter.org :: Java Hexdump


Writing a hexdump program seems to be just about the first serious program lots of us do when learning a new language. As an experienced programmer in that language, revisiting the problem can bring many of the language features (and misfeatures, often) into focus. Regardless of the language, high or low level, there's always plenty of room for optimisation.

Presented on this page is my own fairly elementary effort at hexdump. The program itself isn't very interesting - a competent java programmer should be able to throw something better together in half an hour. What is interesting, however, is the degree of improvement possible when one really tries. The first version of the program seems fine, and there's no obvious problems in the code - but see the benchmarks section below to discover how much faster it could be made.

Java hexdump software

v1Jun 13th 1999sourceInitial version
v2Feb 15th 2000sourceCleanup, optimised
v3Mar 20th 2003sourceDramatic speedup - make each line a single println call rather than lots of single char prints - now nine times faster than v2.
v4Mar 23rd 2003sourceAdded printing of 8 digit offset. Changed output code to store all output in a giant StringBuffer and then print it all in one go. This makes things a little faster.
v5Mar 28th 2003sourceChanged read code to read in a larger block at one time - a buffer of 1K seems to be the sweet spot. Also made output StringBuffer bounded (periodically flushed) so we can handle huge files (which crashed v4), and also makes us another 10% faster. Compared with v3, this version is 20% faster for small files and twice the speed for very large files. v5 is also speed compatable with od -c for small files, and twice its speed for very large files.
v5multiApr 9th 2003sourceThis is a multithreaded version of v5, with seperate threads for reading data from the source stream and for writing out to the output stream. A queue is used to transmit data from one to the other. This wasn't successful - there wasn't any speed improvement.

Benchmarks

Benchmarks for each version of the java Hexdump program are shown, dumping both a reasonably small file, where startup time should predominate, and for a pretty large one, where only raw IO performance should be important. For comparison, the results of the same benchmarks run by the cygwin version of "od -c" are shown. This is no slight on od, which is a more general purpose program with lots of features that the java hexdump lacks, and which hasn't been so monomanically engineered to a single benchmark suite. The comparison is intended, however, to show how differing versions of the java hexdump compare with a well written, sensible C program running on a fairly efficient platform.

The two benchmarks are:

The results of running these benchmarks against each revision of java hexdump (and the reference cygwin od) are shown in the following table - times are shown in the following format minutes:seconds.tenths.


  program       benchmark1          benchmark2

  od -c         0:00.5              0:48

  hexdump v1    0:06.4              7:20
  hexdump v2    0:06.4              7:20
  hexdump v3    0:00.9              0:42
  hexdump v4    0:00.7              [failed]
  hexdump v5    0:00.5              0:21

Design notes

v4

V4 was something of a gamble - it creates a single string buffer for the entire output file in one in-memory buffer. This is great for small files, but suicidal for large ones (in my defense, the original testfile I used for benchmark 2 was a lot smaller). V4 fails on benchmark2 as it can't allocate the RAM necessary for that large file. Sure, I could run the JVM with a bigger heap, but that's only postponing the problem.

Multithreading

After v5, I produced a multithreaded version of hexdump, hoping that additional speed improvements could be realised by avoiding circumstances where the read and write paths block one another, particularly when reading from one disk and writing to another. At least on Windows XP and Linux 2.4, this wasn't the case.

Possible improvements

Here's some possible improvements, some of which I'll probably do and most of which I won't:

Conclusions

On older Java platforms, JVMs typically featured a sizeable delay between being started and actually starting to run the target program. This initialisation gap was mostly due to classloading and quickening of dozens or hundreds of individual classes needed to get even the most elementary Java environment running. This was one of the major reasons why writing unix-style single-function command line utilities in java was a problem (where each invocation would need around ten seconds to get going). Later JDK versions, particularly JDK1.4, start up very quickly.

This also shows that non-synchronised StringBuffers, and economical use of the synchronised in and out objects, can make for major improvements in I/O performance. Lastly, it shows that, with modern JVM software, there's no reason why Java software has to be slow.