Excerpts from Robin Gareus's message of 2011-12-10 15:26:11 +0100:
Hoi Philipp,
On 12/09/2011 09:20 PM, Philipp wrote:
Hi there,
I could use some advice.
You may or may not have heard of replaygain. It's reasonably widely used in
consumer audio, but sometimes I wish it was available for video as well.
By this I mean I wish it was available for the audio part of the video.
Well, I need a programming project for a university course and this is
just one of my ideas that I want to propose to my teacher and
prospective teammates. In order to do this I'd like to narrow it down a
bit further and especially want to find out whether I have the right
idea of how it can be achieved.
Scanning/tagging
Since replaygain works on whole audio files I think I need to extract
the whole audio track from the container. How easily this can be
achieved I don't know.
#ffmpeg -i YOUR_VIDEO_FILE -vn -acodec pcm_f32le -f wav /tmp/file.wav
#sndfile-info /tmp/file.wav # prints max gain.
Actually, you just need to decode the audio-track, not extract it; Alas
sndfile-info only calculates the peak-value for seekable audio files not
pipes.
Thanks, guess I'd need the whole track anyway before making
calculations.
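For the peak value alone, a single forward pass over a pipe is enough.
The sketch below is only an illustration, not part of any existing tool:
it assumes the decoder writes raw little-endian 32-bit float samples
(for example ffmpeg with "-f f32le -" as output), and the function name
stream_peak is made up here. It covers the peak only, not the loudness
analysis that replaygain also needs.

```python
import struct


def stream_peak(stream, chunk_samples=4096):
    """Return the largest absolute sample value found in a binary stream
    of raw little-endian 32-bit float PCM, read in one forward pass."""
    peak = 0.0
    leftover = b""
    while True:
        data = stream.read(chunk_samples * 4)
        if not data:
            break
        data = leftover + data
        # Only unpack whole 4-byte samples; keep a partial one for later.
        usable = len(data) - (len(data) % 4)
        leftover = data[usable:]
        for (sample,) in struct.iter_unpack("<f", data[:usable]):
            peak = max(peak, abs(sample))
    return peak
```

Called as stream_peak(sys.stdin.buffer) at the end of a pipe, this never
seeks, so the audio does not have to be written to /tmp first.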
https://github.com/avuserow/ffmpeg-replaygain seems to do what you want
but requires ffmpeg av* libs <=0.5 and/or a bit of patching.
My ffmpeg is rather recent (20111123) but the program doesn't build, so
I guess it needs patching. Seems like my ffmpeg is too new. The
implementation looks insanely complex to me.
After that, the scanning process should work as with any audio file.
Afterwards the calculated replaygain values have to
be added to the metadata of the file. I have no idea how hard it is to
add new metadata fields to video formats.
Theoretically easy. Practically it depends on the video-codec,
container-format and library support to re-write the meta-data in-place.
Worst case: you'll have to create a new file, copy the codecs, but
rewrite the container.
Note that you may need to calculate and store the gain for _each_
audio-track. Many films include two or more tracks: e.g. stereo and 5.1.
# ffmpeg-options -ac <channel-count> -map <audio-track>
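A hedged sketch of what "one decode per audio track" could look like
when driven from a script. The helper name decode_command is
hypothetical, and the "0:a:N" stream specifier is the syntax of current
ffmpeg releases; an older build may need a different -map argument.

```python
def decode_command(video_path, audio_index, channels=None):
    """Build an ffmpeg argument list that decodes one audio track of
    video_path to raw 32-bit float PCM on standard output."""
    cmd = [
        "ffmpeg", "-i", video_path,
        "-map", "0:a:%d" % audio_index,  # N-th audio stream of input 0
        "-vn",                           # ignore the video stream
    ]
    if channels is not None:
        cmd += ["-ac", str(channels)]    # optional channel-count change
    cmd += ["-f", "f32le", "-"]          # raw float samples to stdout
    return cmd
```

The resulting list can be handed to subprocess.Popen with
stdout=subprocess.PIPE, once per track, so each track gets its own
analysis and its own stored gain.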
I don't know of any command-line apps that modify meta-data on a
per-track basis in-place (maybe ffmpeg can do that?), but the new libav*
meta-data API is pretty nifty.
Do you mean http://www.libav.org/ ?
Playback
Video players need to be aware of those tags, read the metadata and scale
the playback volume accordingly. This is probably not hard per se, but
there are many players out there.
Spot on. Have you checked #replaygain-video on irc.freenode.net ?
Might be dead by now..
http://forum.xbmc.org/showthread.php?t=39320
Yes, the channel doesn't exist.
However, I plan to start with a single player, even with a single file
format, and go from there.
Question 1: Is there anything better than replaygain that should be used
instead?
Yes. The big knob on your PA labeled "Volume" :)
I don't think it'll be particularly useful to have replay gain for movies.
A compressor plugin (for video-players that don't yet have one) would be
much more useful in the real world because
- purists won't care for either replaygain or compression
Agreed, if purists are people who only watch professionally produced
movies or similar. In such a case there'd be no need for it.
- consumers will complain that replay-gain is not sufficient for
long[er] films and won't care about sound-degradation by compression
Why would it not be sufficient for longer films? It can't do anything to
improve bad mixing, that's true, but there's not much one can do about
that anyway (except maybe apply some eq to make dialogs understandable).
Question 2: Which player would be easiest to hack to add such
functionality? Could it be a gstreamer plugin? mplayer?
gstreamer has all the building blocks (audio-gain, meta-data parser)
for a quick/dirty prototype.
But adding a gain-factor is trivial in every player [that already has
meta-data parsing]. The most difficult part will be finding an
entry-point and the place to add the gain scaling code, esp. in larger
projects like mplayer, VLC or totem.
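To show how small the gain part itself is, here is a minimal sketch. It
assumes samples are floats in the range -1.0..1.0 and that the stored
tag is a gain in dB plus an optional track peak; the function names are
made up for this example and do not come from any player's code.

```python
def gain_to_factor(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db, peak=None):
    """Scale float samples by gain_db. If the track peak is known,
    lower the factor so that peak * factor never exceeds 1.0."""
    factor = gain_to_factor(gain_db)
    if peak is not None and peak > 0.0:
        factor = min(factor, 1.0 / peak)  # simple clipping protection
    return [s * factor for s in samples]
```

In a real player the same multiplication would sit in the audio filter
chain and run on each buffer, which is why locating that spot is the
harder part.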
Unless you want this to remain a school project, check for
"easiest player to get it included upstream"; that'll indirectly make it
easy to hack :)
Yes, those are the big questions. Maybe gstreamer is the way to go as I
won't have much time.
Question 3:
How much work would it be?
The project should be done in C++ if possible, otherwise C. Group size
2-4 Students, all rather new at C/C++ and rather inexperienced in
general.
rather inexperienced -> /build one to throw [it] away/.
Re-build it with the then-experienced students :)
On a scale from 1 (trivial) to 10 (very hard), I'd place it at 6.
On one side there's a lot of code that can be re-used (libreplaygain,
ffmpeg,..), yet their documentation is very compact and aimed at
experts. On the other side it requires diligence and some maths
knowledge. Integration in desktop-environments also needs some digging
(e.g. hooks on mime-types for file-manager integration, trigger re-scan
on demand, in the background,..)
It's a great educational project with many problems that one will
encounter in other /real-world/ projects.
Thank you very much for that, I fear that it might be too hard. Most
students have very little coding experience. Most courses require little
coding and that is mostly done in java. The course in question is a
beginner C/C++ course and for many students this was the first experience
with those languages. I think it would be the first real world project
too. Thanks a lot for your input, it gets increasingly clear to me that
it most likely is too hard for this project.
For an experienced coder the analyzer, meta-data re-write, mplayer gain
filter code would be less than a day of work.
Documentation, [social] interaction with upstream, maintenance, etc can
easily take over a year.
Other ideas I have are in short:
- A CLI (using readline) connection manager for jack audio/midi and alsa
midi that can handle large numbers of ports.
Did you try http://www.akjmusic.com/software/jackctl20110317.py ??!
I thought I knew them all...
Thanks, this is great and pretty close to what I/we imagined. I don't
fully understand the alsa midi mode, I think it doesn't use a pager for
many ports and I guess a way to connect a range of ports to another
would be useful as well, but it's really close to what I had in mind.
Julien will be happy to hear about this.
That pretty much takes care of this idea :)
More detailed ideas exist thanks to Julien Claasen.
- A simple but hopefully sane mplayer GUI
RFTL.
I sadly don't know what that means :)
It looks reasonably nice, similar to smplayer, but the devil is in the
details. And this one seems to be for OSX only.
- A new GUI for ecasound
Another problem I might have is that most students in the course are
Windows users, not sure whether I can go solo.
You can x-compile for windows and test with wine. ffmpeg is x-platform,
as is mplayer and gstreamer. ..just be prepared to throw in a day or two
to set up a x-platform build-environment.
That's another possibility, thanks. I guess I'll need to do something
like this no matter what the project will be.
Thanks a lot Robin,
I'm pretty much back at square one now, but that's ok. Maybe I can
develop the ecasound GUI idea, maybe I should find a team first and try
to come up with an idea together with the others.
Regards,
Philipp