MpoContainer
From DaphneWiki
MpoContainer Format
This is a generic container format I came up with that allows one to store different chunks of data in a single file. I developed it mainly as the foundation for my LDImage video/audio design.
The container is designed to hold large chunks of data, so 64-bit numbers are used in most places.
mpolib (the cross-platform library that I wrote) contains the code to manage these containers.
All integers are little endian unless otherwise specified.
Container Structure
Container Header |
Blob Header |
Blob Contents |
Blob Header |
Blob Contents |
... |
... |
Last Blob Header |
Last Blob Contents |
Container Table of Contents |
Container Header
Number of Bytes | Description |
4 | Version. Right now, the only version is 0x4E4F4331, which spells "1CON" in ASCII when stored little endian (bytes 31 43 4F 4E in the file). The last digit can be incremented to increase the version. |
16 | MD5 hash of all bytes that occur after this point (including the remaining header bytes). Hashing from here lets the integrity check cover as much of the file as possible. |
8 | Total number of blobs. |
8 | Offset to table of contents (so that any blob can be jumped to instantly). |
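As a minimal sketch of reading this header, the Python below unpacks the four fields and checks the MD5. It is not taken from mpolib; the function name `parse_container_header` is hypothetical, and treating the integer fields as unsigned is an assumption, since the table does not say.

```python
import hashlib
import struct

# Layout from the table above: 4-byte version, 16-byte MD5, 8-byte blob count,
# 8-byte table-of-contents offset. Unsigned fields are an assumption.
HEADER = struct.Struct("<I16sQQ")
VERSION_1CON = 0x4E4F4331
MD5_END = 20  # the hash covers every byte after the version and hash fields


def parse_container_header(data: bytes):
    """Return (blob_count, toc_offset) after checking version and MD5."""
    if len(data) < HEADER.size:
        raise ValueError("file is shorter than the 36-byte container header")
    version, digest, blob_count, toc_offset = HEADER.unpack_from(data, 0)
    if version != VERSION_1CON:
        raise ValueError(f"unsupported version 0x{version:08X}")
    if hashlib.md5(data[MD5_END:]).digest() != digest:
        raise ValueError("container MD5 mismatch")
    return blob_count, toc_offset
```

This version takes the whole file as a `bytes` object for simplicity; for very large containers the hash would more likely be fed in chunks from an open file.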
Blobs
Following the container header are 1 or more blobs. A blob is just a chunk of data. Each blob has its own header.
Blob Header
Number of Bytes | Description |
8 | Blob's size in bytes. |
16 | MD5 hash of the blob's contents (data). |
4 | Reserved (must be all 0's) |
4 | Arbitrary blob ID. This can be whatever you want, or you don't have to use it at all. It's to help you identify a blob without having to parse its contents. |
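The 32-byte blob header can be sketched with Python's `struct` module as follows. The helper names `pack_blob` and `unpack_blob_header` are made up for illustration, and the blob ID is assumed to be an unsigned 32-bit value.

```python
import hashlib
import struct

# 8-byte size, 16-byte MD5, 4 reserved bytes, 4-byte ID (unsigned by assumption).
BLOB_HEADER = struct.Struct("<Q16sII")


def pack_blob(data: bytes, blob_id: int = 0) -> bytes:
    """Return a blob header followed by the blob's contents."""
    digest = hashlib.md5(data).digest()
    return BLOB_HEADER.pack(len(data), digest, 0, blob_id) + data


def unpack_blob_header(buf: bytes, offset: int = 0):
    """Return (size, md5_digest, blob_id) for the header at `offset`."""
    size, digest, reserved, blob_id = BLOB_HEADER.unpack_from(buf, offset)
    if reserved != 0:
        raise ValueError("reserved field must be zero")
    return size, digest, blob_id
```

Because the ID sits in the header, a reader can pick out a blob by ID without touching its data.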
Blob Contents
After the blob header comes the blob's contents (i.e., its data).
There is no blob footer. A new blob header starts right after the previous blob's last byte of data.
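Since each header is followed directly by its data and then by the next header, blobs can also be walked in order without the table of contents. The generator below is a rough sketch of that idea; `iter_blobs` is a hypothetical name, and it assumes the first blob starts right after the 36-byte container header.

```python
import hashlib
import struct

BLOB_HEADER = struct.Struct("<Q16sII")  # size, MD5, reserved, ID
CONTAINER_HEADER_SIZE = 36  # 4 + 16 + 8 + 8


def iter_blobs(buf: bytes, blob_count: int, start: int = CONTAINER_HEADER_SIZE):
    """Yield (offset, blob_id, data) for each blob, verifying its MD5."""
    offset = start
    for _ in range(blob_count):
        size, digest, _reserved, blob_id = BLOB_HEADER.unpack_from(buf, offset)
        data_start = offset + BLOB_HEADER.size
        data = buf[data_start:data_start + size]
        if len(data) != size or hashlib.md5(data).digest() != digest:
            raise ValueError(f"blob at offset {offset} is truncated or corrupt")
        yield offset, blob_id, data
        offset = data_start + size  # the next header begins immediately
```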
Container Table of Contents
The table of contents comes at the very end of the container and begins right after the last blob ends. It is a blob lookup table. Its purpose is to store the offsets of every blob so that any blob can be read as quickly as possible.
The format of the table of contents is as follows:
Byte Offset | Size in Bytes | Description |
0 | 8 | Offset from the beginning of the file to where the first blob starts. |
8 | 8 | Offset from the beginning of the file to where the second blob starts. |
... | ... | ... |
(N - 1) * 8 | 8 | Offset from the beginning of the file to where the Nth blob starts. |
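A lookup through the table of contents might look like the sketch below. The function names are hypothetical, the index is 0-based (so the first blob is index 0), and offsets are assumed to be unsigned 64-bit values.

```python
import struct


def blob_offset(buf: bytes, toc_offset: int, index: int, blob_count: int) -> int:
    """Return the file offset of blob `index` (0-based) using the TOC."""
    if not 0 <= index < blob_count:
        raise IndexError("blob index out of range")
    # Each entry is one 8-byte little-endian offset, stored in blob order.
    (offset,) = struct.unpack_from("<Q", buf, toc_offset + index * 8)
    return offset


def read_toc(buf: bytes, toc_offset: int, blob_count: int) -> list:
    """Return every blob offset in the table of contents as a list."""
    return list(struct.unpack_from(f"<{blob_count}Q", buf, toc_offset))
```

Here `toc_offset` and `blob_count` are the two values stored in the container header.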
Container Example
Header version
MD5 hash of all bytes that come after
Blob count
Offset to Table of Contents
Length of blob 0's data
MD5 of blob 0's data
4 bytes of reserved, followed by blob 0's arbitrary ID (7 in this case)
Blob 0's data ("ABC" as a C-style string, 4 bytes including the terminating NUL)
Length of blob 1's data
MD5 of blob 1's data
4 bytes of reserved, followed by blob 1's arbitrary ID (9 in this case)
Blob 1's data ("DEFG" as a C-style string, including the terminating NUL)
The Table of Contents (blob 0 starts at offset 0x24, blob 1 starts at offset 0x48)
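To make the example concrete, the sketch below rebuilds a container with the same two blobs from the field layouts described on this page. It is derived from the layout, not copied from an original byte dump, and `build_container` is a hypothetical helper, not part of mpolib. It assumes both strings carry a terminating NUL byte.

```python
import hashlib
import struct


def build_container(blobs):
    """Build container bytes from a list of (blob_id, data) pairs."""
    header_size = 4 + 16 + 8 + 8
    body = bytearray()
    offsets = []
    for blob_id, data in blobs:
        offsets.append(header_size + len(body))
        digest = hashlib.md5(data).digest()
        body += struct.pack("<Q16sII", len(data), digest, 0, blob_id)
        body += data
    toc_offset = header_size + len(body)
    toc = struct.pack(f"<{len(offsets)}Q", *offsets)
    # The container MD5 covers everything after the hash field itself.
    tail = struct.pack("<QQ", len(blobs), toc_offset) + bytes(body) + toc
    header = struct.pack("<I", 0x4E4F4331) + hashlib.md5(tail).digest()
    return header + tail, offsets, toc_offset


container, offsets, toc_offset = build_container([(7, b"ABC\0"), (9, b"DEFG\0")])
print([hex(o) for o in offsets], hex(toc_offset), len(container))
# prints: ['0x24', '0x48'] 0x6d 125
```

The offsets follow from the sizes: the container header is 36 bytes (0x24), and blob 0 takes a 32-byte header plus 4 bytes of data, which puts blob 1 at 0x48.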