Monday, June 29, 2009

mpegtsmux status report (1)

It's been a while since the last post ;) People may wonder where I've been all this time... Actually I spent some time on a graduation trip to Yunnan province in China, and also preparing for my thesis defense; the defense is in late July, wish me luck :)

My GSoC project on mpegtsmux is under way. Since the kick-off, I've been reading the existing code and the related specifications, and debugging some use cases. As mpegtsmux is already fairly mature, my main task is to ensure robust support for common video and audio codecs and to add some useful features. I don't expect this to take too much time, and after that my focus will turn to the new MPEG PS muxer.

In my GSoC application I proposed adding more features to the TS muxer, but "more" is too vague to implement, so the first thing to do is to make it concrete. After some research, I propose adding the following two major capabilities to mpegtsmux:
1. Multiple program support
By definition (ISO/IEC 13818-1), a transport stream can carry multiple programs in a single stream; however, mpegtsmux does not currently support this. (Neither do ffmpeg and VLC; if we have it then it's a unique feature! XD )
2. Constant bit rate (CBR) output
In some applications, like digital video broadcasting, the data sink needs a constant bit rate stream. This is usually achieved by stuffing: if the current bit rate is lower than the configured value, the muxer inserts null packets into the stream to make up the difference. DVB people may need this feature.
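
To make the stuffing idea concrete, here is a minimal sketch in Python. It is not mpegtsmux or libtsmux code; the function names and the fixed measuring interval are assumptions made up for illustration. The only facts taken from the TS specification are the 188-byte packet size, the 0x47 sync byte and the null-packet PID 0x1FFF.

```python
TS_PACKET_SIZE = 188   # bytes per transport stream packet
NULL_PID = 0x1FFF      # PID reserved for null (stuffing) packets


def null_packets_needed(target_bps, payload_packets, interval_s):
    """How many null packets pad `payload_packets` up to the target rate.

    Hypothetical helper: assumes the muxer counts the packets it wrote
    during a fixed interval of `interval_s` seconds.
    """
    budget = int(target_bps * interval_s) // (TS_PACKET_SIZE * 8)
    # If the payload already exceeds the budget, stuffing cannot help.
    return max(0, budget - payload_packets)


def make_null_packet():
    """Build one null packet: 4-byte header followed by 0xFF filler."""
    # 0x47 sync byte, PID 0x1FFF, payload only, continuity counter 0
    header = bytes([0x47, 0x1F, 0xFF, 0x10])
    return header + b"\xff" * (TS_PACKET_SIZE - len(header))
```

With a target of 1,504,000 bit/s, one second holds 1000 packets, so a second in which only 900 payload packets were written would need 100 null packets. When the payload is above the budget the helper returns 0, which is exactly the overflow case that stuffing alone cannot solve.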

I'm not sure whether this would be interesting to other people. Any ideas? I think the first feature is not hard to implement, as libtsmux already provides the APIs for multiple-program TS and we only need to define an interface and wire the APIs to it. The second feature requires bit rate detection, and can become even more complicated if we want some "auto bandwidth adjustment", that is, the muxer informing other components in the pipeline of a bandwidth overflow, so that elements like encoders can compress harder to fit the stream into the bandwidth. (Thanks to Edmund Humenberger for raising this issue.) For the time being, though, I think such advanced features can be put aside.
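
As a rough idea of what bit rate detection could look like, here is a small sliding-window meter in Python. This is only a sketch under my own assumptions: the class name, the window length and the way time is passed in are all hypothetical and not taken from mpegtsmux.

```python
from collections import deque


class BitrateMeter:
    """Average output bit rate over the last `window_s` seconds."""

    def __init__(self, window_s=1.0):
        self.window_s = window_s
        self.samples = deque()   # (time in seconds, bytes written)
        self.total = 0           # bytes currently inside the window

    def add(self, time_s, nbytes):
        """Record `nbytes` written at `time_s` and drop expired samples."""
        self.samples.append((time_s, nbytes))
        self.total += nbytes
        while self.samples and self.samples[0][0] <= time_s - self.window_s:
            _, old = self.samples.popleft()
            self.total -= old

    def bits_per_second(self):
        return self.total * 8 / self.window_s
```

The muxer could compare `bits_per_second()` with the configured rate: a lower value means stuffing is needed, while a higher value is the overflow that an "auto bandwidth adjustment" would have to report upstream.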

Besides the planning, there is a noteworthy finding for the case where mpegtsmux muxes an H.264 + AAC combination like:
gst-launch \
filesrc location=v.264 ! h264parse ! mux. \
filesrc location=a.aac ! aacparse ! mux. \
mpegtsmux name=mux ! filesink

The output stream of this pipeline has been reported to be problematic (see here). I used to suspect that the bug lay in the PAT / PMT tables, but they turn out to be innocent. (Damn the buggy analyzer I was using!) I was a bit worried about the H.264 handling code, until I noticed the interesting way the packets are arranged in this erroneous stream: all the video packets come first, and all the audio packets follow. It looks like a concatenation of two chunks of data, instead of packets interleaved according to their timestamps.

I then looked into h264parse and found that the parser writes GST_CLOCK_TIME_NONE as the timestamp of every buffer. It's then clear why the video packets all go before the audio part: the scheduling algorithm of mpegtsmux favours buffers with GST_CLOCK_TIME_NONE over buffers with valid timestamps. The rule is meant to drain the GST_CLOCK_TIME_NONE stream until useful timestamps show up, but it fails to detect the case where useful timestamps never come.
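
The effect is easy to reproduce with a toy model. The Python sketch below is not the real mpegtsmux scheduler, just a simplified version of the rule as described above (take an untimestamped buffer first, otherwise the lowest timestamp); `None` stands in for GST_CLOCK_TIME_NONE and all names are made up.

```python
from collections import deque


def pick_stream(queues):
    """Index of the stream to serve next, or None if all are empty."""
    candidates = [(i, q[0]) for i, q in enumerate(queues) if q]
    if not candidates:
        return None
    # Rule 1: a buffer without a timestamp always goes first.
    for i, ts in candidates:
        if ts is None:
            return i
    # Rule 2: otherwise serve the stream with the lowest timestamp.
    return min(candidates, key=lambda c: c[1])[0]


def mux_order(video_ts, audio_ts):
    """Return the output order as a string of 'V' and 'A' packets."""
    queues = [deque(video_ts), deque(audio_ts)]
    labels = "VA"
    out = []
    while True:
        i = pick_stream(queues)
        if i is None:
            break
        queues[i].popleft()
        out.append(labels[i])
    return "".join(out)
```

With valid timestamps, `mux_order([0, 20, 40], [10, 30])` interleaves the two streams. With a video stream that never carries timestamps, `mux_order([None, None, None], [0, 10, 20])` puts every video packet before the first audio packet, which matches the layout of the broken file.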

To verify that this is indeed the cause, I did a simple, crazy test: if the scheduling algorithm is modified to interleave video and audio packets in a naive way, the stream plays in mplayer with both audio and video present (although not synchronized).

I'm going to look into h264parse this week and make it set proper timestamps on its buffers. I'll file a bug and try to come up with a patch that fixes it. On the other hand, it might also be good to enhance the scheduling algorithm of mpegtsmux so that it detects obvious timestamp errors and bit rate overflow. Maybe this can be combined with the bit rate detection.
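
One possible shape for such a check, again as a toy Python model and not as a proposal for the actual code: count how many untimestamped buffers a stream has delivered in a row, and once a limit is reached stop giving it priority and fall back to naive alternation. The limit `max_none_run` and the round-robin fallback are assumptions of mine.

```python
from collections import deque


def mux_order_guarded(video_ts, audio_ts, max_none_run=2):
    """Return (output order, indices of streams flagged as suspect)."""
    queues = [deque(video_ts), deque(audio_ts)]
    labels = "VA"
    none_run = [0, 0]   # consecutive untimestamped buffers per stream
    suspects = set()    # streams that no longer get priority
    last = -1           # stream served in the previous step
    out = []
    while any(queues):
        ready = [i for i, q in enumerate(queues) if q]
        untimed = [i for i in ready
                   if queues[i][0] is None and i not in suspects]
        if untimed:
            i = untimed[0]
        elif suspects:
            # Fallback: naive round robin starting after the last stream.
            i = min(ready, key=lambda s: (s - last - 1) % len(queues))
        else:
            i = min(ready, key=lambda s: queues[s][0])
        if queues[i].popleft() is None:
            none_run[i] += 1
            if none_run[i] >= max_none_run:
                suspects.add(i)
        else:
            none_run[i] = 0
        out.append(labels[i])
        last = i
    return "".join(out), sorted(suspects)
```

In this model a video stream without any timestamps is flagged after two buffers and the rest of the output alternates between video and audio, so the result is still unsynchronized (as in my naive test) but at least interleaved, and the flag could be used to emit a warning.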

Finally, I'm considering implementing different "service modes" for mpegtsmux. A service mode is a configuration of the muxer under which the output multiplex complies with some existing standard. What we have now can be called the basic service mode, in which we can mux video and audio streams together. We could also have a DVB-T mode in which the stream is fully DVB-T compliant, or a PS3 mode in which the stream is playable on a PS3. (Thanks again to Edmund for raising this requirement.) This is a more demanding task, and there is already some discussion about it on Bugzilla. The discussion there is really informative, and I need to think a bit more before spelling out the goal. I'll probably write about this in later posts. Any suggestions are welcome.