I'm playing around with the Ogg Vorbis wire formats, and am puzzled by something.
The spec, and as far as I understand it the code of the implementations I've looked at,
indicate that in the Ogg page header there's a page_segments count, followed by a
segment_table, a sequence of that many bytes. So if page_segments is 1, there should be
one byte in the segment_table following it, which contains the length of that single
segment, and so on. The actual payload starts after that.
But that's not what I'm seeing. Here's a piece of a valid file, encoded with
the Xiph reference oggenc, which plays perfectly everywhere I tried it:
00000000: 4f67 6753 0002 0000 0000 0000 0000 4cea OggS..........L.
00000010: 8175 0000 0000 6f9c 089f 011e 0176 6f72 .u....o......vor
00000020: 6269 7300 0000 0002 44ac 0000 0000 0000 bis.....D.......
00000030: 80b5 0100 0000 0000 b801 4f67 6753 0000 ..........OggS..
00000040: 0000 0000 0000 0000 4cea 8175 0100 0000 ........L..u....
00000050: 54f5 6fa8 113d ffff ffff ffff ffff ffff T.o..=..........
00000060: ffff ffff ff07 0376 6f72 6269 732d 0000 .......vorbis-..
00000070: 0058 6970 682e 4f72 6720 6c69 6256 6f72 .Xiph.Org libVor
00000080: 6269 7320 4920 3230 3130 3131 3031 2028 bis I 20101101 (
00000090: 5363 6861 7566 656e 7567 6765 7429 0000 Schaufenugget)..
000000a0: 0000 0105 766f 7262 6973 2542 4356 0100 ....vorbis%BCV..
000000b0: 4000 0024 7318 2a46 a573 1684 101a 4250 @..$s.*F.s....BP
000000c0: 19e3 1c42 ce6b ec19 424c 1182 1c32 4c5b ...B.k..BL...2L[
I see the 01 value at 0x1a, the page_segments, indicating there is one (1) segment. I see
the 0x1e value at the next address 0x1b, the first entry in the segment_table, indicating
that there are 30 bytes in the segment. All good.
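To make my reading of the layout concrete, here's a minimal Python sketch of how I'd expect the first page to parse. The names `HEADER` and `parse_page` are my own and not from any library, and the 27-byte fixed header layout is my understanding of the Ogg spec, so treat it as an illustration rather than a reference parser.

```python
import struct

# Fixed part of an Ogg page header (27 bytes, little-endian), as I read the spec:
# capture pattern, version, header type, granule position,
# stream serial number, page sequence number, CRC, page_segments.
HEADER = struct.Struct("<4sBBqIIIB")

def parse_page(buf, pos=0):
    """Return (segment_table, payload_offset, payload_length) for the page at pos."""
    (magic, version, header_type, granule,
     serial, sequence, crc, page_segments) = HEADER.unpack_from(buf, pos)
    if magic != b"OggS":
        raise ValueError("no capture pattern at offset %#x" % pos)
    table_start = pos + HEADER.size
    # The segment_table is page_segments bytes long; each byte is one segment length.
    segment_table = list(buf[table_start:table_start + page_segments])
    payload_offset = table_start + page_segments
    return segment_table, payload_offset, sum(segment_table)

# First page of the hex dump (offsets 0x00 through 0x39), copied row by row.
first_page = bytes.fromhex(
    "4f67 6753 0002 0000 0000 0000 0000 4cea"
    "8175 0000 0000 6f9c 089f 011e 0176 6f72"
    "6269 7300 0000 0002 44ac 0000 0000 0000"
    "80b5 0100 0000 0000 b801"
)
table, offset, length = parse_page(first_page)
print(table, hex(offset), length)
```

This prints the segment table, the offset at which the payload would begin under that layout, and the total payload length, which can be compared by hand against the dump.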
What's baffling me is the mysterious 01 value at 0x1c in that first page!
What's that doing there? I can't find any documentation of it, and it causes the
Vorbis header to start one byte after the end of the segment values: I see the value
0x76, the ASCII character 'v', starting the Vorbis packet at 0x1d instead of at
0x1c where I'd expect it. There's also an additional trailing 01 at the end of the
Vorbis packet, at 0x39, which confuses me as well.
You'll see the next page follows the same pattern: at 0x54 is page_segments, with
a nice 0x11 indicating 17 segments to follow. I see the segment_table there (a 0x3d,
then all the 0xff's, then the 07 value for the length of the last segment), but
then... there's that spurious extra byte again, this time 0x03, at 0x66, before
the payload (0x76, 'v') starts.
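For the second page's segment_table, here's a small sketch of how I understand the lacing rule in the Ogg framing spec (RFC 3533): segment lengths are summed into a packet until a value below 255 appears, which ends that packet. The function name `packet_lengths` is hypothetical, and the table is typed in by hand from the bytes at 0x55 through 0x65 in the dump.

```python
def packet_lengths(segment_table):
    """Group lacing values into packet lengths.

    Returns (lengths, leftover): the lengths of the packets that end on this
    page, plus the byte count of a trailing packet that continues onto the
    next page (0 if the last packet ends here).
    """
    lengths = []
    current = 0
    for lacing in segment_table:
        current += lacing
        if lacing < 255:
            # A lacing value below 255 terminates the current packet.
            lengths.append(current)
            current = 0
    return lengths, current

# segment_table of the second page: 0x3d, fifteen 0xff bytes, then 0x07.
second_table = [0x3d] + [0xff] * 15 + [0x07]
lengths, leftover = packet_lengths(second_table)
print(len(second_table), lengths, leftover)
```

The table has 17 entries, matching the 0x11 at 0x54, and the grouping gives the packet sizes to expect in the payload that follows the table.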
Not any major big deal, but I'm OCD enough that wondering about stuff like this keeps
me up at night.
-ken