Wednesday, January 21, 2009

J2ME: Thumbnails Extraction of JPEG (Exif) Images made with Mobile Phone Camera

Mobile phone cameras, being digital, save pictures in JPEG format, specifically the ‘Exif’ flavor of JPEG. Exif has been adopted as a standard for digital cameras. We are not going to discuss how JPEG images are created, i.e. the encoding of a raw image into a compressed JPEG image, which essentially involves three steps in order: Discrete Cosine Transform (DCT), quantization, and entropy (Huffman) encoding. Here we are concerned with how the data is organized, that is, the ‘FORMAT’, once a JPEG image has been created. We do, however, need a JPEG decoder to display the extracted thumbnail; there is one freely available. First, a little about the JPEG format and its Exif flavor.
A Few Words on the JPEG Format
A JPEG (jpg) file is organized as a sequence of markers, each followed by its contents. Each marker itself takes 2 bytes. The very first marker (0xFFD8) stands for Start of Image (SOI) and declares that this is a JPEG file. The second marker is an application marker (APPn), which depends on the application using JPEG; its contents therefore contain an identifier indicating the application. The marker can have any value from APP0 (0xFFE0) to APP15 (0xFFEF), both inclusive.
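As a rough illustration of the marker layout, here is a minimal sketch in plain Java that checks for SOI and then hops from marker to marker until it finds the one it wants. The class name JpegSegments and the method findMarker are made up for this example, and it assumes the whole file is already in a byte array and that every marker before the one we want carries a 2-byte big-endian length right after it (padding bytes between markers are not handled).

```java
final class JpegSegments {
    static final int SOI  = 0xFFD8; // Start of Image
    static final int APP0 = 0xFFE0; // JFIF
    static final int APP1 = 0xFFE1; // Exif
    static final int SOS  = 0xFFDA; // Start of Scan: compressed data follows

    // Reads an unsigned 16-bit big-endian value at d[pos], d[pos + 1].
    static int readU16(byte[] d, int pos) {
        return ((d[pos] & 0xFF) << 8) | (d[pos + 1] & 0xFF);
    }

    // Returns the index of the first byte of the wanted marker, or -1.
    // Hypothetical helper, not part of any J2ME API.
    static int findMarker(byte[] d, int wanted) {
        if (d.length < 4 || readU16(d, 0) != SOI) {
            return -1; // not a JPEG file
        }
        int pos = 2; // first marker after SOI
        while (pos + 4 <= d.length) {
            int marker = readU16(d, pos);
            if ((marker & 0xFF00) != 0xFF00) {
                return -1; // lost sync: this is not a marker
            }
            if (marker == wanted) {
                return pos;
            }
            if (marker == SOS) {
                return -1; // headers are over, stop looking
            }
            int length = readU16(d, pos + 2); // includes the 2 length bytes
            if (length < 2) {
                return -1; // malformed length field
            }
            pos += 2 + length; // skip the marker and its contents
        }
        return -1;
    }
}
```

Calling JpegSegments.findMarker(data, JpegSegments.APP1) on a phone picture would be expected to return a small index, since the Exif marker normally comes right after SOI.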
In the JPEG format, the 2 bytes immediately after a marker contain the length of the marker’s contents, including the length field itself (markers without contents, such as SOI, have no length field). This is also the case with the APPn marker. APP0 (0xFFE0) is the ‘JFIF’ marker, while APP1 (0xFFE1) is the ‘Exif’ marker. We are interested in the APP1 (0xFFE1) marker in particular, because the thumbnail is already embedded in the file within this marker. In APPn, the bytes following the length field contain the ASCII codes of the identifier name (5 bytes for ‘JFIF’ and 6 bytes for ‘Exif’). Please see the NOTE below. The Exif identifier is 45, 78, 69, 66, 00, 00 in hexadecimal (6 bytes). From there on, the Exif format is the same as the TIFF image format. For more detail on Exif (and its embedded TIFF), please see here.
Once the thumbnail offset is reached, the thumbnail can be in one of three formats: JPEG compressed (most commonly used), RGB TIFF, or YCbCr TIFF. (The offset value must be read according to the ‘byte align’ discussed below.) If it is JPEG compressed, it is just like another JPEG image at a reduced scale, which is decoded for display. (I have tried this on a Sony Ericsson K800i and a Nokia 6630; both had the thumbnail in JPEG compressed format, hence our discussion pertains to this only.)
Another important thing about the TIFF header (8 bytes) embedded in the Exif format is that its first 2 bytes tell you the byte align of the TIFF data that follows: little endian (‘II’, 0x4949, used by Intel) or big endian (‘MM’, 0x4D4D, used by Motorola). You have to check this to calculate offsets when reading an Exif file in general and a TIFF file in particular. JPEG itself generally uses big endian; Exif, however, allows both. Moreover, most of the digital cameras using the Exif format follow little endian. Also remember that all offsets in TIFF are calculated from the first byte of the TIFF header.
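To make the Exif walk-through a bit more concrete, here is a hedged sketch that starts at the APP1 marker, checks the 6-byte identifier, reads the byte align from the TIFF header and then looks up where the JPEG compressed thumbnail sits. Some details are not spelled out in this post and are taken from the TIFF/Exif layout as I understand it: the TIFF header holds the offset of the first directory (IFD0), each directory is a 2-byte entry count followed by 12-byte entries and a 4-byte offset to the next directory, and the second directory (IFD1) carries tag 0x0201 (thumbnail offset) and tag 0x0202 (thumbnail length). The class and method names are hypothetical.

```java
final class ExifThumbnailLocator {
    // Unsigned 16-bit read honoring the TIFF byte align.
    static int u16(byte[] d, int p, boolean little) {
        int a = d[p] & 0xFF, b = d[p + 1] & 0xFF;
        return little ? (b << 8) | a : (a << 8) | b;
    }

    // 32-bit read honoring the TIFF byte align.
    static int u32(byte[] d, int p, boolean little) {
        int lo = u16(d, little ? p : p + 2, little);
        int hi = u16(d, little ? p + 2 : p, little);
        return (hi << 16) | lo;
    }

    // app1 is the index of the 0xFF byte of the APP1 marker.
    // Returns {absolute offset, length} of the thumbnail, or null.
    static int[] locate(byte[] d, int app1) {
        int id = app1 + 4; // skip marker (2 bytes) and length (2 bytes)
        if (app1 < 0 || id + 14 > d.length) return null;
        // Identifier must be 45 78 69 66 00 00 ("Exif" plus two zeros).
        if (d[id] != 0x45 || d[id + 1] != 0x78 || d[id + 2] != 0x69
                || d[id + 3] != 0x66 || d[id + 4] != 0 || d[id + 5] != 0) {
            return null;
        }
        int tiff = id + 6; // all TIFF offsets count from here
        boolean little;
        if (d[tiff] == 0x49 && d[tiff + 1] == 0x49) little = true;       // "II"
        else if (d[tiff] == 0x4D && d[tiff + 1] == 0x4D) little = false; // "MM"
        else return null;
        if (u16(d, tiff + 2, little) != 42) return null; // TIFF magic number

        int ifd0 = tiff + u32(d, tiff + 4, little);
        if (ifd0 < tiff || ifd0 + 2 > d.length) return null;
        int nextPtr = ifd0 + 2 + 12 * u16(d, ifd0, little);
        if (nextPtr + 4 > d.length) return null;
        int next = u32(d, nextPtr, little);
        if (next <= 0) return null; // no IFD1, so no thumbnail
        int ifd1 = tiff + next;
        if (ifd1 + 2 > d.length) return null;

        int count = u16(d, ifd1, little);
        int off = -1, len = -1;
        for (int i = 0; i < count; i++) {
            int e = ifd1 + 2 + 12 * i; // tag(2) type(2) count(4) value(4)
            if (e + 12 > d.length) return null;
            int tag = u16(d, e, little);
            if (tag == 0x0201) off = u32(d, e + 8, little);
            else if (tag == 0x0202) len = u32(d, e + 8, little);
        }
        if (off < 0 || len <= 0 || tiff + off + len > d.length) return null;
        return new int[] { tiff + off, len };
    }
}
```

The returned range should itself begin with 0xFFD8, so those bytes can be copied out and handed to whatever JPEG decoder is available. The RGB TIFF and YCbCr TIFF thumbnail formats are not covered by this sketch.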
NOTE: In the case of JFIF, the last of the 5 identifier bytes is zero, while in the case of ‘Exif’ the last two bytes are zero; for example, 4A, 46, 49, 46, 00 (5 bytes) for JFIF. JFIF v1.02 and above have a JFIF extension, under which APP0 has an extension part that again starts with an application marker, hence making two APP markers. The thumbnail may be located under either the first marker or the second marker. For more detail on JFIF, please see here.
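The identifier check described in the note could be sketched as follows. The names AppIdentifier, matches and identify are made up for illustration. The ‘JFXX’ identifier for the extension marker is not mentioned in this post; it is what the JFIF specification uses, as far as I know.

```java
final class AppIdentifier {
    // True when d[pos..] holds the ASCII text followed by `zeros` zero bytes.
    static boolean matches(byte[] d, int pos, String ascii, int zeros) {
        int n = ascii.length();
        if (pos < 0 || pos + n + zeros > d.length) return false;
        for (int i = 0; i < n; i++) {
            if ((d[pos + i] & 0xFF) != ascii.charAt(i)) return false;
        }
        for (int i = 0; i < zeros; i++) {
            if (d[pos + n + i] != 0) return false;
        }
        return true;
    }

    // markerPos is the index of the 0xFF byte of an APPn marker.
    // Returns "JFIF", "JFXX", "Exif", or null when nothing matches.
    static String identify(byte[] d, int markerPos) {
        if (markerPos < 0 || markerPos + 2 > d.length) return null;
        int marker = ((d[markerPos] & 0xFF) << 8) | (d[markerPos + 1] & 0xFF);
        int id = markerPos + 4; // skip marker (2 bytes) and length (2 bytes)
        if (marker == 0xFFE0 && matches(d, id, "JFIF", 1)) return "JFIF";
        if (marker == 0xFFE0 && matches(d, id, "JFXX", 1)) return "JFXX"; // extension
        if (marker == 0xFFE1 && matches(d, id, "Exif", 2)) return "Exif";
        return null;
    }
}
```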

1 comment:

Peter Perhac said...

this looks good. Thanks for sharing this. However, the link at the end no longer works. Still, was a good read.