I don't have time right now to analyze the results, but the problem
seems to be freqtweak in combination with jackd. When the system freezes
after starting stuff from a text console, <alt>sysrq-p prints
information about the current registers, and repeated dumps show only
the jackd and freqtweak processes, over and over again.
Good news (apparently - most probably the machine I'm testing on will
freeze while I'm typing this message :-)
It eventually did after I sent the message :-(
The problem with 2.4.20 appears to have been ext3. I finally got a trace
of the deadlocked processes through the sysrq key (after retyping lots
of boring numbers from the screen) and ksymoops is pointing to something
stuck in ext3. With that clue I went to the ext3 site:
http://www.zip.com.au/~akpm/linux/ext3/
Well, the machine keeps locking even with the ext3 patches for 2.4.20
from that page applied. A sure way for
me to kill it is to run jack, ardour, freqtweak, qjackconnect and a
process doing "tar cvf usr.tar /usr" (in the home partition). Sometimes
I cannot even start all the stuff and I don't get to the tar process,
sometimes it takes a while to die. Adding some heavy disk activity to
the mix seems to help kill it faster. But I've seen it die with just
jack and ardour running.
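
For anyone who wants to try reproducing this, the recipe above could be
scripted roughly as follows. This is only a sketch: the program names
and the tar command come from the description above, but the exact
binary names on a given system, the "-d alsa" driver argument to jackd,
the pause between launches and the helper names (repro_commands,
start_all) are all assumptions made for illustration.

```python
import subprocess
import time

# Program names as given in the post. The arguments are assumptions:
# jackd needs a driver, and "-d alsa" is only a guess at the setup.
AUDIO_COMMANDS = [
    ["jackd", "-d", "alsa"],
    ["ardour"],
    ["freqtweak"],
    ["qjackconnect"],
]

# Disk load from the post; meant to be started in the home partition.
DISK_LOAD = ["tar", "cvf", "usr.tar", "/usr"]


def repro_commands(with_disk_load=True):
    """Return the argv lists to launch, in launch order."""
    commands = list(AUDIO_COMMANDS)
    if with_disk_load:
        commands.append(DISK_LOAD)
    return commands


def start_all(commands, settle_seconds=5.0):
    """Start each command in order, pausing between launches.

    Returns the Popen handles so the caller can wait on or kill them.
    Nothing is launched until this function is called.
    """
    procs = []
    for argv in commands:
        procs.append(subprocess.Popen(argv))
        time.sleep(settle_seconds)  # hypothetical settle time
    return procs
```

Calling start_all(repro_commands()) from a text console would launch
the whole mix, and repro_commands(with_disk_load=False) leaves out the
tar process for comparison.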
This is what I'm currently testing:
2.4.20 + capabilities + preempt + lowlat +
[from Con Kolivas's page]
Read latency2 disk hack (Andrew Morton) + ACPI + variable HZ (1000) +
[from an older jl patch]
drm low latency +
[from ext3 page]
ext3 patches for 2.4.20
Same results on a quick test with 2.4.21-pre2.
I built the icebox kernel debugger to try to get some info on where
things are hanging. Below is what I get from four instances of breaking
into the debugger with the Sysrq-d key; this is the list of tasks:
1)
__switch_to +3e
schedule +269
sleep_on +45
__mark_inode_dirty +d9
pipe_write +1b9
sys_write +9f
system_call +33
2)
sleep_on +6f
bread +20
__mark_inode_dirty +d9
pipe_write +1b9
poll_freewait +44
sys_write +9f
system_call +33
3)
__wake_up +55
bread +20
__mark_inode_dirty +d9
pipe_write +1b9
poll_freewait +44
sys_write +9f
system_call +33
4)
schedule +1ab
sleep_on +45
bread +20
__mark_inode_dirty +d9
pipe_write +1b9
poll_freewait +44
sys_write +9f
system_call +33
So the system seems to be stuck in __mark_inode_dirty, whatever that is.
Each time I break into the debugger I see one of the jack-related
processes as the current process. No other processes show up, so I
assume the SCHED_FIFO ring is still running but everything else is being
blocked by the __mark_inode_dirty call.
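
One way to check that reading of the four dumps is to look for the
frames they all share. The sketch below does that in Python; the frame
names are copied from the traces above with the offsets dropped, and
the helper name common_frames is made up for illustration.

```python
# Frames from the four icebox dumps above, innermost frame first.
TRACES = [
    ["__switch_to", "schedule", "sleep_on", "__mark_inode_dirty",
     "pipe_write", "sys_write", "system_call"],
    ["sleep_on", "bread", "__mark_inode_dirty", "pipe_write",
     "poll_freewait", "sys_write", "system_call"],
    ["__wake_up", "bread", "__mark_inode_dirty", "pipe_write",
     "poll_freewait", "sys_write", "system_call"],
    ["schedule", "sleep_on", "bread", "__mark_inode_dirty",
     "pipe_write", "poll_freewait", "sys_write", "system_call"],
]


def common_frames(traces):
    """Return the frames present in every trace, in first-trace order."""
    shared = set(traces[0]).intersection(*map(set, traces[1:]))
    return [frame for frame in traces[0] if frame in shared]


shared = common_frames(TRACES)
print("frames common to all dumps:", shared)
print("innermost common frame:", shared[0])
```

With the traces as listed, the innermost frame shared by all four dumps
is __mark_inode_dirty, reached from pipe_write via sys_write, which
matches the observation above.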
Any interested kernel hackers out there to help with this?
Awaiting instructions... :-)
-- Fernando