LDImage
From DaphneWiki
The need for a Laserdisc Image Format
Problems to solve:
- File size needs to be as small as possible while still providing a good experience when playing a game
- Audio must be synced up as exactly as possible with video to cater to some game designs (Firefox for example)
- Video frames need to be stored separately as two fields to cater to some game designs
- Any field needs to be able to be immediately decoded without relying on previous fields (a problem with mpeg2)
- Playing backward must be supported and offer equivalent performance to playing forward (a problem with mpeg2)
Laserdisc Image Format
The laserdisc image format will use the container format I've written for mpolib (not discussed here).
Blob Index | Blob ID | Description |
0 | 0xF0 | The header, which needs to contain at minimum a version identifier. The most current version is the four ASCII bytes '2', 'L', 'D', 'I' in that order, followed by a UTF-8 string of JSON text, which is described below. The previous (deprecated) version was the four ASCII bytes '1', 'L', 'D', 'I' in that order, followed by a 4-byte little-endian pixel width integer, a 4-byte little-endian pixel height integer, and a 4-byte little-endian integer which will be 1 if the frame rate is 29.97, for a total of 16 bytes. (The height refers to a frame, not a field, so for NTSC it would be 480.) |
1 | 0xE0 | A common JPEG header (i.e. the 'tables') |
2 | 0x10 | VIDEO FIELD: an "abbreviated" JPEG of field 0 of track 0 |
3 | 0x20 | AUDIO: uncompressed 44100 Hz 16-bit PCM audio spanning the time occupied by field 0 of track 0 |
4 | 0x10 | VIDEO FIELD: an "abbreviated" JPEG of field 1 of track 0 |
5 | 0x20 | AUDIO: uncompressed 44100 Hz 16-bit PCM audio spanning the time occupied by field 1 of track 0. |
6 | 0x10 | VIDEO FIELD: an "abbreviated" JPEG of field 0 of track 1 |
... | ... | And so on until the final laserdisc track has been stored |
Last Blob | 0xD0 | the VBI data (stored in my VBI format) |
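Since each video field is stored as an "abbreviated" JPEG that relies on the common tables in blob 1, a reader has to combine the two before handing the data to an ordinary JPEG decoder. The following is a minimal sketch of one way to do that, assuming blob 1 is a standard tables-only JPEG datastream (SOI marker, table segments, EOI marker) and each field blob starts with its own SOI marker. This page does not confirm that layout, and the name `join_jpeg` is hypothetical.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def join_jpeg(tables: bytes, abbreviated: bytes) -> bytes:
    """Build a self-contained JPEG from a tables-only stream and an abbreviated image.

    Assumption (not confirmed by this page): `tables` is SOI + table segments + EOI,
    and `abbreviated` is SOI + frame and scan data + EOI.
    """
    if not (tables.startswith(SOI) and tables.endswith(EOI)):
        raise ValueError("tables blob is not a tables-only JPEG datastream")
    if not abbreviated.startswith(SOI):
        raise ValueError("field blob does not start with a JPEG SOI marker")
    # Drop the EOI that ends the tables and the SOI that starts the field,
    # so the result reads: SOI, tables, frame data, EOI.
    return tables[:-len(EOI)] + abbreviated[len(SOI):]
```

Because the tables are shared, any single field can be rebuilt this way without touching its neighbours, which is what makes immediate decoding and backward playback cheap.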
So the algorithm to search for a track will be:
blob index = (track index * 4) + 2
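This works because each track occupies 4 blobs (two video fields, each followed by its audio) and 2 blobs (the header and the JPEG tables) come before the first track. The result is the index of the track's first video field.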
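The lookup can be extended to reach the second field or the audio blobs of a track. This is a small sketch based on the blob layout in the table above; the function name and parameters are hypothetical, not part of mpolib.

```python
HEADER_BLOBS = 2     # blob 0 (header) and blob 1 (common JPEG tables)
BLOBS_PER_TRACK = 4  # video field 0, audio, video field 1, audio


def blob_index(track: int, field: int = 0, audio: bool = False) -> int:
    """Return the blob index for a track, a field (0 or 1), and video or audio."""
    if track < 0:
        raise ValueError("track must be non-negative")
    if field not in (0, 1):
        raise ValueError("field must be 0 or 1")
    # Each field contributes a video blob followed by an audio blob.
    return HEADER_BLOBS + track * BLOBS_PER_TRACK + field * 2 + (1 if audio else 0)
```

For example, `blob_index(1)` gives 6, matching the table entry for field 0 of track 1.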
Future Extensions
Someone suggested that I extend this format to include multiple audio tracks. My thinking right now is to create a new version header that also indicates how many audio tracks are present. Each audio blob would then contain the data for every audio track, appended one after another (so if there were two audio tracks, each audio blob would be twice as big). This should be enough information to figure out where the data for a given audio track is.
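To illustrate the idea, here is a sketch of how a reader might pull one track out of such a combined audio blob. It assumes every track contributes the same number of bytes and that tracks are appended in order; the proposal above implies this but does not spell it out, and `split_audio_blob` is a hypothetical name.

```python
def split_audio_blob(blob: bytes, track_count: int) -> list[bytes]:
    """Split a combined audio blob into one byte string per audio track.

    Assumption: tracks are stored back to back, each with an equal share of the blob.
    """
    if track_count < 1:
        raise ValueError("track_count must be at least 1")
    if len(blob) % track_count != 0:
        raise ValueError("blob size is not a multiple of the track count")
    size = len(blob) // track_count
    return [blob[i * size:(i + 1) * size] for i in range(track_count)]
```

With two tracks, `split_audio_blob(blob, 2)[1]` would be the second track's samples for that field.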
JSON header
Blob 0 contains a version ID ("2LDI") followed by JSON text which describes the attributes such as width and height of the video.
A typical sample of what this JSON text may look like is:
{ "id": 1, "name": "Dragon's Lair", "note": "NTSC, captured in 2001", "w": 640, "h": 480, "fps": 29.97, "interlaced": true }
Discs can have multiple audio tracks and can use a different audio frequency. Here is how that may look:
{ "id": 0, "name": "Esh's Aurunmilla", "w": 720, "h": 480, "fps": 29.97, "interlaced": true, "audio": { "freq": 44100, "bits": 16, "channels" : 2, "track": [ "English", "Japanese" ] } }
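Reading blob 0 then comes down to checking the four-byte version ID and decoding what follows. The sketch below handles both the current and the deprecated header as described on this page. It assumes the three integers in the old header are unsigned, and it leaves the frame rate unset when the flag is not 1, since the page does not say what other values mean. `parse_header` is a hypothetical name.

```python
import json
import struct


def parse_header(blob: bytes) -> dict:
    """Decode blob 0 into a dictionary of header attributes."""
    magic = blob[:4]
    if magic == b"2LDI":
        # Current version: the rest of the blob is UTF-8 JSON text.
        return json.loads(blob[4:].decode("utf-8"))
    if magic == b"1LDI":
        # Deprecated version: three 4-byte little-endian integers, 16 bytes in total.
        if len(blob) != 16:
            raise ValueError("version 1 header must be exactly 16 bytes")
        width, height, ntsc_flag = struct.unpack("<III", blob[4:16])
        # Only the meaning of flag == 1 is documented, so anything else maps to None.
        return {"w": width, "h": height, "fps": 29.97 if ntsc_flag == 1 else None}
    raise ValueError("unrecognized header version: %r" % magic)
```

Note that a strict JSON parser such as Python's `json` module rejects trailing commas, so the JSON text in the header must not end a list of members with one.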
Here is a description of each possible element in the JSON:
Name | Description | Default value |
id | A canonical ID arbitrarily assigned to known laserdiscs by myself (Matt O) for the purpose of allowing software to auto-detect a disc type without having to clumsily try to parse an English string. If your disc does not match one of my ID's, you can set the ID to 0 or omit it entirely. | 0 |
name | An arbitrary name for the disc, for the purpose of displaying something interesting to a human. | None, not required. |
w | Width of a video frame. Must be set. | None |
h | Height of a video frame (not field). Must be set. | None |
fps | Frames per second that the video should run at. 29.97 for NTSC, 25.0 for PAL. Anything else is legal but may be unsupported.
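The rules in this table can be applied after the JSON has been parsed. This sketch covers only what the rows above state: `w` and `h` are required and `id` defaults to 0. Other elements are passed through untouched, and `normalize_header` is a hypothetical helper name.

```python
def normalize_header(fields: dict) -> dict:
    """Check required header elements and fill in the documented default for id."""
    for key in ("w", "h"):
        if key not in fields:
            raise ValueError("missing required header element: " + key)
    result = dict(fields)  # copy so the caller's dictionary is left unchanged
    result.setdefault("id", 0)  # 0 means the disc is not one of the known IDs
    return result
```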
A few goals to explicitly specify:
About Blob ID's
The container API (from mpolib) allows each container to have an arbitrary 32-bit ID. I did this so I could support having completely blank frames and completely empty audio to save space. Laserdiscs often have periods of blank video and audio which show up as noisy black frames and noisy analog audio, which we do not want to store! I also want to use these ID's so that I can use different compression schemes in the future; for example, I may want to try compressing the audio in the future. Therefore, the ID's are as follows:
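The blob ID's that appear in the layout table earlier on this page can be collected into an enumeration, as sketched here. Only those five values are listed. Any ID's for blank video or silent audio are not shown on this page, so they are left out rather than guessed, and the names `BlobId` and `blob_kind` are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class BlobId(IntEnum):
    """Blob ID's taken from the layout table on this page."""
    VIDEO_FIELD = 0x10  # an "abbreviated" JPEG of one field
    AUDIO = 0x20        # uncompressed PCM audio for one field
    VBI = 0xD0          # VBI data, stored as the last blob
    JPEG_TABLES = 0xE0  # common JPEG header (the 'tables')
    HEADER = 0xF0       # version ID plus JSON text


def blob_kind(blob_id: int) -> Optional[BlobId]:
    """Return the matching BlobId, or None for an ID this sketch does not know."""
    try:
        return BlobId(blob_id)
    except ValueError:
        return None
```

Returning `None` for unknown ID's lets a reader skip blob types added by later versions instead of failing.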
Why does the VBI blob need to come last?
This is actually kind of important. The VBI is the data that I feel is most likely to change if anything changes, and by putting it at the end, it ensures that any required changes to the VBI will have a minimal effect on the overall image file. If the VBI was at the beginning and its size needed to be changed, this would impact the entire file which would be costly.
Why does the JPEG data come before the audio data?
This is deliberate. The plan is that the JPEG data will be read first, then handed off to another thread (ideally, another CPU) to be decompressed. This allows the first thread to continue working on audio.
Why 44100 Hz audio instead of 48000 Hz?
Daphne is already built on 44100 Hz audio, and changing Daphne to use 48000 Hz audio would require either changing all existing .OGG audio files that are out there (not worth it), or supporting both 48 kHz and 44.1 kHz (which I don't have a good enough reason to consider at this point). I feel that 48 kHz audio is a good conservative choice for preservation, but for presentation, 44.1 kHz audio should be more than adequate, and it reduces the overall file size (which is important to me).
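The file-size effect of the sample rate can be worked out per field. The sketch below assumes stereo audio (2 channels, as in the JSON example on this page) and treats the NTSC rate of 29.97 frames per second as exactly 30000/1001. Both are assumptions, and the function name is hypothetical.

```python
from fractions import Fraction

NTSC_FPS = Fraction(30000, 1001)  # assumed exact value behind "29.97"


def audio_bytes_per_field(freq_hz: int, fps: Fraction = NTSC_FPS,
                          bits: int = 16, channels: int = 2) -> Fraction:
    """Average number of uncompressed PCM bytes needed per video field."""
    fields_per_second = fps * 2  # two fields per frame
    samples_per_field = Fraction(freq_hz) / fields_per_second
    return samples_per_field * channels * bits / 8


at_44k = audio_bytes_per_field(44100)
at_48k = audio_bytes_per_field(48000)
saving = 1 - at_44k / at_48k  # fraction of audio bytes saved by using 44.1 kHz
```

Under these assumptions a field needs about 2942.94 bytes of audio at 44.1 kHz against 3203.2 bytes at 48 kHz, a saving of roughly 8.1%. The averages are not whole numbers, so the number of samples per audio blob cannot be identical for every field; how the format distributes them is not described on this page.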