June 18, 2010

Bringing in the experts

There are numerous articles, reviews, and technical reports on the JPEG 2000 format, many free to view online. Despite this, we found it difficult to determine how best to use the format in practice. There are 13 "parts" to JPEG 2000 - from basic image formats to a metadata format, and even a digital cinema format. Most of these parts are extensions to the core specification. Through our own reading, we were able to determine that Parts 1 and 2 were the ones we needed to look at. But which one to use? Part 1 specifies both a compression algorithm and a file format. Part 2 specifies extensions to that algorithm and to the format. We could find little - short of becoming technical experts ourselves - that would allow us to adequately weigh up the pros and cons of the various options, and even less on how others had made their decisions.

In spring 2009 we turned to Simon Tanner, Director of King's Digital Consultancy Services, for advice. Simon agreed to search out the experts and provide us with a report setting out clear recommendations: primarily which format and compression to use for preservation and access, and which features we should implement. We provided him with a brief of our requirements, the background to our intended digitisation activities, and some sample images.

Simon found an expert to work on the report: Robert Buckley, a colour digital imaging expert and member of the JPEG Committee. Rob carried out a number of tests on the images we supplied, looking at the implications of lossless versus lossy compression, how we might get the best out of certain JPEG 2000 features, how we should manage technical metadata, and more. These tests provided the evidence, set out in the report, that backs up his final recommendations.

The key recommendation was that we use Part 1 compression and the JP2 format for our digitisation projects, for both the archival master and the access copy. Also important was the recommendation that we use lossy rather than lossless compression - maintaining a high quality that could be considered "visually lossless". Although this results in an irrecoverable loss of information, the data that is lost is not visible to the human eye, and is therefore unnecessary for our needs. The Wellcome Library intends to follow the recommendations as closely as possible for future digitisation projects, although the exact compression levels will need to be determined on a collection-by-collection basis with further tests.

The report is available to view on our website.
