MpoContainer

From DaphneWiki

MpoContainer Format

This is a generic container format that I came up with, which allows different chunks of data to be stored in a single file. I developed it mainly as the foundation for my LDImage video/audio design.

The container is designed to hold large chunks of data, so sizes, counts and offsets are stored as 64-bit numbers.

mpolib (the cross-platform library that I wrote) contains the code to manage these containers.

All integers are little endian unless otherwise specified.

Container Structure

Container Header
Blob Header
Blob Contents
Blob Header
Blob Contents
...
...
Last Blob Header
Last Blob Contents
Container Table of Contents

Container Header

Number of Bytes | Description
4 | Version. Right now, the only version is 0x4E4F4331, which is stored on disk as the ASCII bytes "1CON" (little endian). The last digit can be incremented to increase the version.
16 | MD5 hash of all bytes that occur after this point (including the remaining header bytes). Hashing from here on lets the integrity check cover as much of the file as possible.
8 | Total number of blobs.
8 | Offset to table of contents (so that any blob can be jumped to instantly).
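To illustrate the header layout above, here is a minimal Python sketch of reading it. It is not taken from mpolib. The names `parse_container_header`, `HEADER_FORMAT` and `VERSION_1` are hypothetical, and the sketch assumes that the counts and offsets are unsigned and that the MD5 is stored as a raw 16-byte digest.

```python
import hashlib
import struct

# Assumed layout: version (4), MD5 (16), blob count (8), TOC offset (8).
# "<" selects little endian with no padding between fields.
HEADER_FORMAT = "<I16sQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 36 bytes (0x24)
VERSION_1 = 0x4E4F4331  # stored on disk as the bytes "1CON"


def parse_container_header(data: bytes) -> dict:
    """Unpack the container header and check its version and MD5."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file is too short to hold a container header")
    version, md5, blob_count, toc_offset = struct.unpack_from(
        HEADER_FORMAT, data, 0
    )
    if version != VERSION_1:
        raise ValueError(f"unknown container version 0x{version:08X}")
    # The hash covers every byte after the hash field itself, which ends
    # at offset 4 + 16 = 20.
    md5_ok = hashlib.md5(data[20:]).digest() == md5
    return {
        "version": version,
        "blob_count": blob_count,
        "toc_offset": toc_offset,
        "md5_ok": md5_ok,
    }
```

The sketch hashes the whole file in memory for brevity; for large containers the same check could be done by feeding the file to `hashlib.md5()` in chunks.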

Blobs

Following the container header are 1 or more blobs. A blob is just a chunk of data. Each blob has its own header.

Blob Header

Number of Bytes | Description
8 | Blob's size in bytes.
16 | MD5 hash of the blob's contents (data).
4 | Reserved (must be all 0's)
4 | Arbitrary blob ID. This can be whatever you want, or you don't have to use it at all. It's to help you identify a blob without having to parse its contents.
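As a rough illustration of the blob header, the following Python sketch packs and unpacks one. The function names `pack_blob` and `parse_blob_header` are hypothetical and do not come from mpolib, and treating the blob ID as an unsigned 32-bit integer is an assumption.

```python
import hashlib
import struct

# Assumed layout: size (8), MD5 (16), reserved (4), blob ID (4), little endian.
BLOB_HEADER_FORMAT = "<Q16sII"
BLOB_HEADER_SIZE = struct.calcsize(BLOB_HEADER_FORMAT)  # 32 bytes


def pack_blob(blob_id: int, payload: bytes) -> bytes:
    """Return a blob header followed by the blob's contents."""
    header = struct.pack(
        BLOB_HEADER_FORMAT,
        len(payload),                   # blob's size in bytes
        hashlib.md5(payload).digest(),  # MD5 of the contents only
        0,                              # reserved, must be all zeros
        blob_id,
    )
    return header + payload


def parse_blob_header(data: bytes, offset: int = 0):
    """Return (size, md5, blob_id) for the blob header at `offset`."""
    size, md5, reserved, blob_id = struct.unpack_from(
        BLOB_HEADER_FORMAT, data, offset
    )
    if reserved != 0:
        raise ValueError("reserved field is not zero")
    return size, md5, blob_id
```

Because the ID sits at a fixed position in the header, a reader can pick out the blobs it cares about without touching their contents.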

Blob Contents

After the blob header comes the blob's contents (i.e., its data).

Blob Footer

There is no blob footer. A new blob header starts right after the previous blob's last byte of data.
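Since blobs are packed back to back, a reader can also walk them in order without consulting the table of contents. The sketch below shows one way this might look in Python. `iter_blobs` is a hypothetical name, and the sketch assumes the first blob starts directly after the 36-byte container header.

```python
import hashlib
import struct

CONTAINER_HEADER_SIZE = 36  # 4 + 16 + 8 + 8
BLOB_HEADER = struct.Struct("<Q16sII")  # size, MD5, reserved, blob ID


def iter_blobs(data: bytes, blob_count: int):
    """Yield (header_offset, blob_id, payload) for each blob in file order."""
    offset = CONTAINER_HEADER_SIZE
    for _ in range(blob_count):
        size, md5, _reserved, blob_id = BLOB_HEADER.unpack_from(data, offset)
        start = offset + BLOB_HEADER.size
        payload = data[start:start + size]
        if len(payload) != size:
            raise ValueError(f"blob at offset {offset} is truncated")
        if hashlib.md5(payload).digest() != md5:
            raise ValueError(f"blob at offset {offset} failed its MD5 check")
        yield offset, blob_id, payload
        # No footer: the next header begins right after the last data byte.
        offset = start + size
```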

Container Table of Contents

The table of contents comes at the very end of the container and begins right after the last blob ends. It is a blob lookup table. Its purpose is to store the offsets of every blob so that any blob can be read as quickly as possible.

The format of the table of contents is as follows:

Byte Offset | Size in Bytes | Description
0 | 8 | Offset from the beginning of the file to where the first blob starts.
8 | 8 | Offset from the beginning of the file to where the second blob starts.
... | ... | ...
(N - 1) * 8 | 8 | Offset from the beginning of the file to where the Nth blob starts.
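The lookup this table enables can be sketched in Python as follows. The function names are hypothetical, blobs are indexed from zero here, and the table-of-contents offset is assumed to be measured from the beginning of the file, like the entries themselves.

```python
import struct


def read_toc(data: bytes, blob_count: int, toc_offset: int) -> list:
    """Return the file offset of every blob header, in blob order."""
    return list(struct.unpack_from(f"<{blob_count}Q", data, toc_offset))


def blob_offset(data: bytes, blob_count: int, toc_offset: int, index: int) -> int:
    """Return the file offset of one blob (zero-based index) in a single read."""
    if not 0 <= index < blob_count:
        raise IndexError("blob index out of range")
    # Each entry is 8 bytes, so entry `index` lives at toc_offset + index * 8.
    (offset,) = struct.unpack_from("<Q", data, toc_offset + index * 8)
    return offset
```

Reading a single entry this way costs one seek plus 8 bytes, no matter how many blobs come before it in the file.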

Container Example

[Image: Simple container.png] The marked regions of the example container are, in file order:

Header version

MD5 hash of all bytes that come after

Blob count

Offset to Table of Contents

Length of blob 0's data

MD5 of blob 0's data

4 bytes of reserved, followed by blob 0's arbitrary ID (7 in this case)

Blob 0's data ("ABC" as a C-style string, so 4 bytes including the terminating NUL)

Length of blob 1's data

MD5 of blob 1's data

4 bytes of reserved, followed by blob 1's arbitrary ID (9 in this case)

Blob 1's data ("DEFG" as a C-style string, so 5 bytes including the terminating NUL)

The Table of Contents (blob 0 starts at offset 0x24, blob 1 starts at offset 0x48)
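The example can be reproduced with a short Python sketch. This is an illustration under the assumptions already noted (unsigned fields, raw MD5 digests, C-style strings that include their terminating NUL), not mpolib's actual writer, and `build_container` is a hypothetical name.

```python
import hashlib
import struct

VERSION_1 = 0x4E4F4331  # stored on disk as the bytes "1CON"
CONTAINER_HEADER_SIZE = 36


def build_container(blobs) -> bytes:
    """Build a container from a list of (blob_id, payload) pairs."""
    offset = CONTAINER_HEADER_SIZE
    body = b""
    toc = b""
    for blob_id, payload in blobs:
        toc += struct.pack("<Q", offset)  # where this blob's header starts
        header = struct.pack(
            "<Q16sII", len(payload), hashlib.md5(payload).digest(), 0, blob_id
        )
        body += header + payload
        offset += len(header) + len(payload)
    toc_offset = offset  # the table begins right after the last blob
    # Everything after the container's MD5 field is covered by that MD5.
    tail = struct.pack("<QQ", len(blobs), toc_offset) + body + toc
    return struct.pack("<I", VERSION_1) + hashlib.md5(tail).digest() + tail


example = build_container([(7, b"ABC\x00"), (9, b"DEFG\x00")])
blob_count, toc_offset = struct.unpack_from("<QQ", example, 20)
first, second = struct.unpack_from("<QQ", example, toc_offset)
assert (first, second) == (0x24, 0x48)
```

The two offsets follow from the field sizes: the container header is 36 bytes (0x24), and blob 0 occupies a 32-byte header plus 4 bytes of data, so blob 1 starts at 36 + 36 = 72 (0x48).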
