Quoting Ralf Mardorf <ralf.mardorf(a)alice-dsl.net>:
> Compare ABX with ABX box 1 and compare ABX with box 2. Is there a
> difference between X and A and B for one of those boxes? I wouldn't ask
> for the sound quality and I wouldn't ask if they should notice a
> difference, or what sounds better. I bet doing the test like that,
> people would hear a cowbell in A of box 1, when there is no cowbell.
I don't know if you're intentionally trying to misunderstand what an
ABX test is or if you just do it anyway, but the test isn't there to
trick the person doing it. In the case of testing an encoder it means
taking two otherwise identical files, one of which has been run through
the encoder, and seeing whether the listener can tell which one is
which. If s/he can, then the encoder isn't transparent to that person.
If not, it is.
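The procedure above can be sketched in a few lines of Python. This is
a minimal illustration, not a real listening-test tool: the `listen`
callback and the file names are hypothetical stand-ins for actually
playing audio to a person.

```python
import random

def abx_trial(a_clip, b_clip, listen):
    """One ABX trial: X is secretly either A or B; the listener guesses which.

    `listen(a, b, x)` is a hypothetical callback that presents all three
    clips and returns 'A' or 'B', the listener's guess for what X is.
    Returns True if the guess was correct.
    """
    x_is_a = random.choice([True, False])
    x_clip = a_clip if x_is_a else b_clip
    guess = listen(a_clip, b_clip, x_clip)
    return (guess == 'A') == x_is_a

# A "listener" who hears no difference at all can only guess, so over
# many trials they land near 50% correct -- the transparent case.
random.seed(0)
guesser = lambda a, b, x: random.choice(['A', 'B'])
results = [abx_trial('original.wav', 'encoded.wav', guesser)
           for _ in range(1000)]
print(sum(results) / len(results))  # close to 0.5
```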
And in that single test it doesn't matter whether the person doing it
is an experienced sound engineer or a bum from behind the trash can.
The purpose is to see if the encoder is transparent for as wide an
audience as possible, and the bum might happen to be sensitive to
pre-echo, smearing and similar artifacts while the engineer might not.
> The so called blind and double blind tests are already manipulated by
> the question. So if people think most of us won't hear a difference, we
> just guess that we hear the difference, then the question of the test
> would manipulate us too. Tests are good to get a rough impression, but
> they likely say less about real usage and it's similar for statistics.
> They are helpful, but you must be able to understand that a test is a
> test and that a statistic is a statistic.
If you can't tell whether file X is the same as file A or file B when
you get to compare them, how are you going to tell which one you're
listening to when you only get to hear one file, as in a real usage
situation?
The test is a challenge, and judging by the way you're bashing it, it
seems to be one you for some reason don't want to take yourself.
However, it's not a challenge where everyone is going to laugh at you
if you fail; the people making the encoder will thank you for
confirming that their work is good enough for one more person.
As for the statistical verification of the test, it's obviously not
all black and white. If you're correct in about 50% (or fewer) of the
trials, you're not hearing any difference, you're basically just
guessing, and the encoder is transparent to you. If you're correct in
75% it's a bit trickier; you're probably hearing some differences,
which means the encoder isn't entirely transparent. If you're 90%
correct the encoder clearly isn't transparent to your ears.
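To make "probably hearing some differences" concrete, you can ask how
likely a given score would be if the listener were purely guessing.
Here's a sketch of that one-sided binomial calculation in Python; the
function name is my own, and the 12-of-16 example is just an
illustration of the idea, not a figure from this discussion.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of
    `trials` by pure guessing (one-sided binomial test, p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12 correct out of 16 trials: a pure guesser scores this well only
# ~3.8% of the time, so the listener likely hears a real difference.
print(round(abx_p_value(12, 16), 3))  # 0.038
```

This is why the number of trials matters: 3 correct out of 4 is also
75%, but a guesser manages that far too often for it to mean anything.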
But again: it doesn't matter if you're 100, 82, 75, 67 or 50 percent
correct. Given enough test data the people making the encoder will
have learned something about their work, and you will have, to some
extent, learned something about your ability to hear the difference
between the original and their encoded audio.
- Peder