On Sun, Jan 04, 2009 at 11:28:51AM -0800, Ken Restivo wrote:
What Linux Audio problem does this patch fix? Is it the RT lockup
problem with alsa_seq?
Yup. To be more specific:
I've seen three distinct types of reproducible behaviour from the
kernels I've tested...
Kernels that work:
  2.6.24.7-rt25 (latest 2.6.24 RT patch)
  2.6.25.8-rt7 (latest 2.6.25 RT patch)
  2.6.26.5 (no RT patch)
These kernels work fine on my machine from a MIDI perspective --
although 2.6.24 occasionally locks up while I'm running RT applications,
which is why I was trying 2.6.26 in the first place.
Kernels that are partly broken:
  2.6.26-rt1
  2.6.26.3-rt3 (others reported this one "OK" on l-a-u)
  2.6.26.3-rt7
  2.6.26.5-rt8
These kernels behave very oddly for me: the MIDI sequencer can be opened
and closed successfully, but events are only delivered when I'm typing
on my (PS/2) keyboard! (That is, I can play some notes on my MIDI
keyboard, then go and tap the shift key on the PS/2 keyboard and they'll
all appear in aseqdump -- or I can wedge down a key so it autorepeats,
and then play fluidsynth through MIDI more-or-less happily...)
I tried generating interrupts by fiddling with other devices (e.g. a USB
mouse), but only the keyboard seems to have an effect.
Kernels that are completely broken:
  2.6.26.5-rt9 (others reported this one broken on l-a-u)
  2.6.26.6-rt11
  2.6.26.8-rt12
With these kernels, no events are delivered even if I'm typing, and
closing /dev/snd/seq hangs the closing process; it then can't be opened
again by anything else. (This doesn't seem to have any other adverse
effects on the system, and the kernel otherwise seems solid; I ran
-rt11 on my desktop for a month under fairly heavy load after wedging
the sequencer for the first time.)
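For anyone who wants to check for the wedge without hanging their shell, here is a minimal sketch that opens and closes a device node in a child process and gives up waiting after a timeout. The helper name `probe_close` and its return codes are hypothetical, made up for this illustration. It also assumes that a bare open/close of /dev/snd/seq is enough to trigger the hang, which I have not confirmed (aseqdump also subscribes to a port before closing).

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: open and close `path` in a child process.
 * Returns 0 if the child finished cleanly within timeout_ms,
 * 1 if it was still running after timeout_ms (e.g. stuck in close()),
 * and -1 on any other failure (fork failed, open failed, ...). */
int probe_close(const char *path, int timeout_ms)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the only process that risks getting wedged. */
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        close(fd);
        _exit(0);
    }

    /* Parent: poll for the child's exit in 10 ms steps. */
    const struct timespec step = { 0, 10 * 1000 * 1000 };
    for (int waited = 0; waited <= timeout_ms; waited += 10) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
        if (r < 0 && errno != EINTR)
            return -1;
        nanosleep(&step, NULL);
    }

    /* Still running. A task in D state ignores signals, so there is
     * no point trying to kill it; just report the hang. */
    return 1;
}
```

Calling `probe_close("/dev/snd/seq", 2000)` would then be expected to return 0 on a working kernel and 1 on a wedged one, under the assumptions above. The stuck child stays around until the kernel lets it go.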
Running aseqdump under strace doesn't show anything unusual, aside from
the close() call never returning:
3562 write(1, "Waiting for data. Press Ctrl+C to end.\n", 39) = 39
3562 write(1, "Source  Event                  Ch  Data\n", 40) = 40
3562 rt_sigaction(SIGINT, {0x8048f24, [INT], SA_RESTART}, {SIG_DFL}, 8) = 0
3562 rt_sigaction(SIGTERM, {0x8048f24, [TERM], SA_RESTART}, {SIG_DFL}, 8) = 0
3562 poll([{fd=3, events=POLLIN|POLLERR|POLLNVAL}], 1, -1) = ? ERESTART_RESTARTBLOCK (To be restarted)
3562 --- SIGINT (Interrupt) @ 0 (0) ---
3562 sigreturn() = ? (mask now [])
3562 close(3
It then looks like this in ps:
  PID S WCHAN  CMD
 3562 D msleep aseqdump -p 20:0
(The full strace output for working and non-working invocations of
aseqdump is in the log-* files under the link above.)
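The "D" in the ps output is the uninterruptible-sleep state, and it can also be read straight from /proc. A minimal sketch, assuming the usual Linux /proc/<pid>/stat layout of "pid (comm) state ..."; the function names `stat_state` and `pid_state` are hypothetical, chosen for this example:

```c
#include <stdio.h>
#include <string.h>

/* Return the state letter from one /proc/<pid>/stat line, or 0 if the
 * line cannot be parsed. The command name sits in parentheses and may
 * itself contain spaces or ')', so search for the LAST ')'. */
char stat_state(const char *line)
{
    const char *p = strrchr(line, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return 0;
    return p[2];
}

/* Read /proc/<pid>/stat and return the state letter, or 0 if the file
 * cannot be read (for example, because the process has exited). */
char pid_state(long pid)
{
    char path[64], line[1024];
    snprintf(path, sizeof path, "/proc/%ld/stat", pid);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    return ok ? stat_state(line) : 0;
}
```

For the hung aseqdump above, `pid_state(3562)` would be expected to return 'D'. The WCHAN column has a counterpart in /proc/<pid>/wchan, which this sketch does not read.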
Right, so the last partly-broken version is -rt8, and the first
completely-broken version is -rt9. Fortunately those are both against
the same upstream kernel (which I tested, and works fine without the RT
patch), and the interdiff between -rt8 and -rt9 is pretty small. One
change immediately stuck out as being interrupt-related:
diff -u linux-2.6.26.5-rt8/kernel/softirq.c linux-2.6.26.5-rt9/kernel/softirq.c
--- linux-2.6.26.5-rt8/kernel/softirq.c 2008-09-09 21:50:36.000000000 -0400
+++ linux-2.6.26.5-rt9/kernel/softirq.c 2008-09-11 11:56:06.000000000 -0400
@@ -536,7 +536,7 @@
     unsigned long flags;
     local_irq_save(flags);
-    __tasklet_common_schedule(t, &__get_cpu_var(tasklet_vec), HI_SOFTIRQ);
+    __tasklet_common_schedule(t, &__get_cpu_var(tasklet_hi_vec), HI_SOFTIRQ);
     local_irq_restore(flags);
 }
... so I tried -rt9, but with that change reverted, and it behaves
just like -rt8 -- i.e. that's the change that causes the "partly-broken"
behaviour to become "completely broken".
(... and then while trying to figure out what that change did, I spotted
the typo that the patch fixes, and changing that made 2.6.26's MIDI
work like 2.6.24/25 for me.)
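The removed line in the diff above pairs one list (tasklet_vec) with a different softirq (HI_SOFTIRQ). As a rough illustration of why that kind of mismatch matters, here is a toy userspace model. It is not the kernel code: every name in it (`toy_cpu`, `toy_schedule`, `toy_do_softirq`) is made up, and it only assumes the general scheme that each softirq drains its own pending list.

```c
#include <stddef.h>

enum { TOY_HI = 0, TOY_NORMAL = 1, TOY_NR = 2 };

struct toy_tasklet {
    struct toy_tasklet *next;
    int ran;                         /* set to 1 once a handler runs it */
};

struct toy_cpu {
    struct toy_tasklet *vec[TOY_NR]; /* one pending list per softirq */
    unsigned pending;                /* bitmask of raised softirqs */
};

/* Queue t on list `vec` and raise softirq `nr`. The caller chooses
 * both, so the two can disagree, as in the removed line of the diff. */
void toy_schedule(struct toy_cpu *cpu, struct toy_tasklet *t, int vec, int nr)
{
    t->next = cpu->vec[vec];
    cpu->vec[vec] = t;
    cpu->pending |= 1u << nr;
}

/* Run every raised softirq. Each one drains only its own list.
 * Returns the number of tasklets that were run. */
int toy_do_softirq(struct toy_cpu *cpu)
{
    int count = 0;
    for (int nr = 0; nr < TOY_NR; nr++) {
        if (!(cpu->pending & (1u << nr)))
            continue;
        cpu->pending &= ~(1u << nr);

        struct toy_tasklet *t = cpu->vec[nr];
        cpu->vec[nr] = NULL;
        while (t != NULL) {
            struct toy_tasklet *next = t->next;
            t->next = NULL;
            t->ran = 1;
            count++;
            t = next;
        }
    }
    return count;
}
```

In this model, a tasklet queued with `toy_schedule(&cpu, &t, TOY_NORMAL, TOY_HI)` is not run by the next `toy_do_softirq()`; it sits on the normal list until something else raises TOY_NORMAL. That resembles the "events only arrive while I'm typing" symptom, but whether it is the actual mechanism in -rt8 is a guess on my part.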
Totally freakin' awesome! Fantastic deductive work and troubleshooting.
I've built a 2.6.26.8-rt12 with your patch and will boot into it tonight
or tomorrow morning and try it out!
-ken