LDImage

From DaphneWiki

Revision as of 23:13, 17 August 2010

Laserdisc Image Format

The laserdisc image format will use the container format I've written for mpolib (not discussed here).

Blob Index  Blob ID  Description
0           0xF0     The header, which needs to contain at minimum a version identifier (layout given below the table)
1           0xE0     A common JPEG header (i.e. the 'tables')
2           0x10     VIDEO FIELD: an "abbreviated" JPEG of field 0 of track 0
3           0x20     AUDIO: uncompressed 44100 Hz 16-bit PCM audio spanning the time occupied by field 0 of track 0
4           0x10     VIDEO FIELD: an "abbreviated" JPEG of field 1 of track 0
5           0x20     AUDIO: uncompressed 44100 Hz 16-bit PCM audio spanning the time occupied by field 1 of track 0
6           0x10     VIDEO FIELD: an "abbreviated" JPEG of field 0 of track 1
...         ...      And so on until the final laserdisc track has been stored
Last Blob   0xD0     The VBI data (stored in my VBI format)

At this time, the header will contain the four ASCII bytes '1', 'L', 'D', 'I' in that order, followed by a 4-byte little-endian pixel width integer, a 4-byte little-endian pixel height integer, and a 4-byte little-endian integer which will be 1 if the frame rate is 29.97 fps, for a total of 16 bytes.
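
As an illustration only (this is not code from mpolib or Daphne), the 16-byte header layout described above could be decoded roughly as follows in Python. The function name and the returned dictionary keys are made up for this sketch, and the three integers are assumed to be unsigned, which the description does not state.

```python
import struct

# Assumed layout: 4 ASCII bytes, then three little-endian 32-bit integers.
# Treating the integers as unsigned ("I") is an assumption of this sketch.
HEADER_FORMAT = "<4sIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 + 4 + 4 + 4 = 16 bytes


def parse_ldimage_header(blob: bytes) -> dict:
    """Decode the version header blob (ID 0xF0). Hypothetical helper."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("header blob is shorter than 16 bytes")
    magic, width, height, rate_flag = struct.unpack_from(HEADER_FORMAT, blob)
    if magic != b"1LDI":
        raise ValueError(f"unexpected version identifier: {magic!r}")
    return {
        "width": width,
        "height": height,
        # The flag is 1 when the frame rate is 29.97; other values are
        # not defined in the text above, so they are only reported as False.
        "is_29_97": rate_flag == 1,
    }
```

Checking the version identifier first means a reader that only understands this header version can reject a newer file before trusting the remaining fields.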

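So finding the first blob of a track is a simple calculation:

blob index = (track index * 4) + 2

Each track occupies 4 blobs (two video fields, each followed by its audio), and there are 2 blobs at the beginning (the version header and the JPEG header).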
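
The lookup above, written out as a small Python sketch. The constant and function names are invented for illustration; the second helper extends the same arithmetic to a specific field, following the blob order shown in the table.

```python
LEADING_BLOBS = 2    # version header (0xF0) and JPEG header (0xE0)
BLOBS_PER_TRACK = 4  # video field 0, audio 0, video field 1, audio 1


def track_blob_index(track_index: int) -> int:
    """Blob index of the first video field of the given track."""
    return track_index * BLOBS_PER_TRACK + LEADING_BLOBS


def field_blob_indices(track_index: int, field: int) -> tuple:
    """(video blob index, audio blob index) for field 0 or 1 of a track."""
    if field not in (0, 1):
        raise ValueError("field must be 0 or 1")
    video = track_blob_index(track_index) + field * 2
    return video, video + 1
```

For example, `track_blob_index(1)` gives 6, which matches the table, where blob 6 is field 0 of track 1.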

About Blob IDs

The container API (from mpolib) allows each container to have an arbitrary 32-bit ID. I did this so I could support completely blank frames and completely empty audio to save space. Laserdiscs often have periods of blank video and audio, which show up as noisy black frames and noisy analog audio that we do not want to store! I also want to use these IDs so that I can switch to different compression schemes later; for example, I may want to try compressing the audio.

Therefore, the IDs are as follows:

ID    Description
0     Undefined
0x10  Regular JPEG video field
0x11  Blank video field (blob will contain three bytes that represent the YUV color that the blank frame should be filled with. The first byte will be Y, the second byte will be U, the third byte will be V. YUV was chosen because it is optimized.)
0x20  Regular uncompressed AUDIO (44100 Hz, 16-bit, PCM)
0x21  Blank audio (blob will contain one little-endian, 32-bit unsigned integer which indicates how many blank bytes (44100 Hz, 16-bit) this blank audio represents)
0xD0  VBI blob
0xE0  JPEG header
0xF0  Version header
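
To show how a reader might handle the two "blank" IDs, here is a hedged Python sketch. The names are hypothetical, and expanding blank audio to zero bytes assumes the PCM samples are signed, so that silence is all zeros; the text does not say this explicitly.

```python
import struct

ID_VIDEO_BLANK = 0x11
ID_AUDIO_BLANK = 0x21


def decode_blank_video(payload: bytes) -> tuple:
    """Return the (Y, U, V) fill color stored in a blank video blob."""
    if len(payload) != 3:
        raise ValueError("blank video blob must be exactly 3 bytes")
    y, u, v = payload  # iterating over bytes yields integers 0..255
    return y, u, v


def expand_blank_audio(payload: bytes) -> bytes:
    """Expand a blank audio blob into that many bytes of silence.

    Assumes signed 16-bit PCM, where silence is all-zero bytes.
    """
    if len(payload) != 4:
        raise ValueError("blank audio blob must be exactly 4 bytes")
    (byte_count,) = struct.unpack("<I", payload)
    return bytes(byte_count)
```

A blank audio blob therefore costs 4 bytes in the file no matter how long the silence it stands for.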

Why does the VBI blob need to come last?

This is actually kind of important. The VBI is the data that I feel is most likely to change if anything changes, and putting it at the end ensures that any required changes to the VBI will have a minimal effect on the overall image file. If the VBI were at the beginning and its size needed to change, the change would impact the entire file, which would be costly.

Why does the JPEG data come before the audio data?

This is deliberate. The plan is that the JPEG data will be read first, then handed off to another thread (ideally, another CPU) to be decompressed. This allows the first thread to continue working on audio.
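
The hand-off described above might look something like the following Python sketch. It only illustrates the ordering (video first, then audio while the video is being decoded). `decode_video_field` is a hypothetical stand-in that does no real JPEG work, and Python threads will not necessarily run on a second CPU the way a native implementation could.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_video_field(jpeg_bytes: bytes) -> int:
    """Hypothetical stand-in for JPEG decompression; returns the input size."""
    return len(jpeg_bytes)


def process_field(executor: ThreadPoolExecutor, video_blob: bytes, audio_blob: bytes):
    # The video blob comes first in the file, so its decode can be
    # submitted to a worker thread as soon as it has been read.
    pending_video = executor.submit(decode_video_field, video_blob)
    # Meanwhile the reading thread carries on with the audio blob.
    audio = bytes(audio_blob)
    # Block only at the point where the decoded field is actually needed.
    return pending_video.result(), audio
```

If the audio came first instead, the decoder thread would sit idle until the audio had been read.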

Why 44100 Hz audio instead of 48000 Hz?

Daphne is already built on 44100 Hz audio, and changing Daphne to use 48000 Hz audio would require either changing all existing .OGG audio files that are out there (not worth it), or supporting both 48 kHz and 44.1 kHz (which I don't have a good enough reason to consider at this point). I feel that 48 kHz audio is a good conservative choice for preservation, but for presentation, 44.1 kHz audio should be more than adequate, and it reduces the overall file size (which is important to me).
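
To put a rough number on the size difference, the arithmetic for uncompressed 16-bit PCM can be sketched in Python as below. The channel count is left as a parameter because the text above does not state it; it cancels out of the ratio anyway.

```python
BYTES_PER_SAMPLE = 2  # 16-bit PCM


def pcm_bytes_per_second(sample_rate: int, channels: int = 1) -> int:
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate * BYTES_PER_SAMPLE * channels


rate_44k = pcm_bytes_per_second(44100)  # 88200 bytes/s per channel
rate_48k = pcm_bytes_per_second(48000)  # 96000 bytes/s per channel

# Fraction of audio bytes saved by storing 44.1 kHz instead of 48 kHz.
saving = 1 - rate_44k / rate_48k        # about 0.081, i.e. roughly 8%
```

So the audio portion of an image comes out roughly 8% smaller at 44.1 kHz than at 48 kHz.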

Future Extensions

Someone suggested that I extend this format to include multiple audio tracks. I've thought about how I would do this and my thinking right now is to create a new version header that would also indicate how many audio tracks are present. Then each audio blob would contain every audio track, one appended after another (so if there were two audio tracks, each audio blob would be twice as big). This should be enough info to figure out where the correct audio track data is.
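
Since this extension is only an idea at this point, the following Python sketch is purely speculative. It assumes the tracks are appended whole (not interleaved) and are all the same length, so the track count from the new header is enough to split an audio blob. The function name is made up.

```python
def split_audio_tracks(blob: bytes, track_count: int) -> list:
    """Split an audio blob into equally sized per-track chunks.

    Assumes tracks are stored one after another and are equal in length.
    """
    if track_count <= 0:
        raise ValueError("track_count must be positive")
    if len(blob) % track_count != 0:
        raise ValueError("blob size is not a multiple of the track count")
    size = len(blob) // track_count
    return [blob[i * size:(i + 1) * size] for i in range(track_count)]
```

With two tracks, `split_audio_tracks(blob, 2)[1]` would return the second half of the blob, which is the second audio track under these assumptions.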
