[LAU] ABX test (was Re: What is the best MP3 encoder?)

Peder Hedlund peder at musikhuset.org
Tue Apr 2 20:43:47 UTC 2013


Quoting Ralf Mardorf <ralf.mardorf at alice-dsl.net>:
> Compare ABX with ABX box 1 and compare ABX with box 2. Is there a
> difference between X and A and B for one of those boxes. I wouldn't ask
> for the sound quality and I wouldn't ask if they should notice a
> difference, what does sound better. I bet doing the test like that,
> people would hear a cowbell in A of box 1, when there is no cowbell,

I don't know whether you're intentionally trying to misunderstand what
an ABX test is or you're just doing it anyway, but the point of the test
isn't to trick the person taking it. When testing an encoder, you take
two copies of the same file, run one of them through the encoder, and
see whether the listener can tell which one is which. If s/he can, the
encoder isn't transparent to that person. If not, it is.
And in that single test it doesn't matter if the person doing it is an
experienced sound engineer or a bum from behind the trash can. The
purpose is to see if the encoder is transparent for as wide an
audience as possible, and the bum might happen to be sensitive to
pre-echo, smearing and similar artifacts while the engineer might not.
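
To make the procedure concrete, here is a minimal Python sketch of a
single ABX session. The names run_abx and perfect_listener are
hypothetical, and strings stand in for the audio clips; a real test
would play the clips and collect the listener's answers instead.

```python
import random

def run_abx(a, b, identify, trials=16, seed=None):
    """Run `trials` ABX rounds and return the number of correct answers.

    `a` is the original clip and `b` the encoded one (placeholders here).
    `identify(a, b, x)` is the listener: it must return "A" or "B".
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        # X is secretly assigned to A or B; the listener is not told which.
        answer = rng.choice(["A", "B"])
        x = a if answer == "A" else b
        if identify(a, b, x) == answer:
            correct += 1
    return correct

def perfect_listener(a, b, x):
    # Stand-in for someone who always hears the difference.
    return "A" if x == a else "B"

def guessing_listener(a, b, x):
    # Stand-in for someone to whom the encoder is transparent.
    return random.choice(["A", "B"])

if __name__ == "__main__":
    print(run_abx("original", "encoded", perfect_listener))
    print(run_abx("original", "encoded", guessing_listener))
```

Nothing in the loop depends on who the listener is, and nothing tries
to suggest an answer; the only question asked is "is X the same as A
or as B?".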

> The so called blind and double blind tests are already manipulated by
> the question. So if people think most of us won't hear a difference, we
> just guess that we hear the difference, than the question of the test
> would manipulate us too. Tests are good to get a rough impression, but
> they likely say less about real usage and it's similar for statistics.
> They are helpful, but you must be able to understand that a test is a
> test and that a statistic is a statistic.

If you can't tell whether file X is the same as file A or file B when
you get to compare them, how are you going to be able to tell which one
you're listening to if you only get to hear one file, as in a real
usage situation?

The test is a challenge, and from the way you're bashing it, it seems
to be one you for some reason don't want to take yourself. However,
it's not a challenge where everyone is going to laugh at you if you
fail; the people making the encoder will thank you for confirming that
their work is good enough for one more person.

As for the statistical verification of the test, it's obviously not all
black and white.
If you're correct in 50% or less of the test cases, it means you're not
hearing any difference, are basically just guessing, and that the
encoder is transparent to you. If you're correct in 75% of them it's a
bit trickier; you're probably hearing some differences and that means
the encoder isn't entirely transparent. If you're 90% correct, the
encoder isn't transparent to your ears.
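
To put rough numbers on those percentages, here is a small Python
sketch that computes how likely a given score is under pure guessing
(a one-sided binomial test with a 50% chance per trial). The function
name abx_p_value and the choice of 20 trials are assumptions for
illustration only, not part of any particular ABX tool.

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the guess sequences with `correct` or more hits, then divide
    # by the total number of equally likely sequences (2 ** trials).
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials

if __name__ == "__main__":
    trials = 20  # assumed session length
    for correct in (10, 15, 18):  # 50%, 75% and 90% correct
        p = abx_p_value(correct, trials)
        print(f"{correct}/{trials} correct: p = {p:.4f}")
```

The smaller the value, the less plausible it is that the score came
from guessing. The same percentage means much less over a handful of
trials than over many, which is why the number of trials matters as
much as the score.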

But again, it doesn't matter if you're 100, 82, 75, 67 or 50 percent
correct. Given enough test data, the people making the encoder will
have learned something about their work and you will have, to some
extent, learned something about your ability to hear the difference
between the original and their encoded audio.

- Peder

