Lee Revell wrote:
On Mon, 2006-06-26 at 21:44 +0200, Pieter Palmers wrote:
Lee Revell wrote:
On Mon, 2006-06-26 at 21:05 +0200, Pieter Palmers wrote:
Lee Revell wrote:
> On Mon, 2006-06-26 at 16:51 +0200, Pieter Palmers wrote:
>>
>> Of course. My Monday-morning bad temper is over by now, and I hope I
>> didn't transfer it to any of you. I'll provide the panic, one way or
>> another.
>>
> Can you reproduce the problem on a non-RT kernel?
>
No, it only occurs with RT kernels, and only with those configured for
PREEMPT_RT. If I use PREEMPT_DESKTOP, there is no problem (still with
threaded IRQs etc.; I only switched the preemption level in the
kernel config).
I've uploaded the photos of the panic here:
http://freebob.sourceforge.net/old/img_3378.jpg (without flash)
http://freebob.sourceforge.net/old/img_3377.jpg (with flash)
Both are of suboptimal quality, unfortunately, but all the info is
readable on one or the other.
Can you add debug printks before and after tasklet_kill() in
ohci1394_unregister_iso_tasklet to see where it locks up?
That's the first thing I did: the printk before tasklet_kill succeeds,
the one right after the tasklet_kill doesn't.
OK, that's what I suspected.
It seems that the -rt patch changes tasklet_kill:
Unpatched 2.6.17:
void tasklet_kill(struct tasklet_struct *t)
{
        if (in_interrupt())
                printk("Attempt to kill tasklet from interrupt\n");
        while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
                do
                        yield();
                while (test_bit(TASKLET_STATE_SCHED, &t->state));
        }
        tasklet_unlock_wait(t);
        clear_bit(TASKLET_STATE_SCHED, &t->state);
}
2.6.17-rt:
void tasklet_kill(struct tasklet_struct *t)
{
        if (in_interrupt())
                printk("Attempt to kill tasklet from interrupt\n");
        while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
                do
                        msleep(1);
                while (test_bit(TASKLET_STATE_SCHED, &t->state));
        }
        tasklet_unlock_wait(t);
        clear_bit(TASKLET_STATE_SCHED, &t->state);
}
You should ask Ingo & the other -rt developers what the intent of this
change was. Obviously it loops forever waiting for the state bit to
change.
Because you are not allowed to yield() in an RT context?
I wish I had been a little more elaborate in my initial mail, as it
would have saved us some time and communication trouble (on my part,
that is). I had already spotted the msleep() change in the patch and
tried reverting it. That gives you a nice new panic message,
something like 'BUG: yield()'ing in ...'.
I'm wondering why a kernel that is patched, but not configured for
'complete preemption', works fine. The change is present there too, so
the difference probably has something to do with the msleep()
implementation.
Another strange thing is: why doesn't the tasklet finish, so that it can
be 'unscheduled'? I have my IRQ priorities set higher than those of any
other RT threads, so I would expect that the tasklet can finish. Or is
tasklet_kill non-preemptible? That would be very strange, as I would
expect that busy waiting on something in a non-preemptible code path on
a single-CPU system always deadlocks.
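
The busy-wait concern can be illustrated with a small user-space model.
This is only a sketch, not kernel code: model_tasklet,
model_test_and_set_sched(), model_run_pending() and model_tasklet_kill()
are hypothetical names, the softirq side is reduced to a plain function
call, and the waiter_gives_up_cpu flag stands in for whether the waiting
loop ever lets the tasklet run. It assumes that a pending tasklet's SCHED
bit is cleared only when the tasklet is actually picked up and run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; bit 0 stands in for TASKLET_STATE_SCHED. */
#define MODEL_STATE_SCHED 1u

struct model_tasklet {
        unsigned int state;     /* models t->state */
        int runs;               /* how many times the handler ran */
};

/* Models test_and_set_bit(): set SCHED, report whether it was already set. */
bool model_test_and_set_sched(struct model_tasklet *t)
{
        bool was_set = (t->state & MODEL_STATE_SCHED) != 0;

        t->state |= MODEL_STATE_SCHED;
        return was_set;
}

/* Models the softirq side: a pending tasklet has SCHED cleared and is run. */
void model_run_pending(struct model_tasklet *t)
{
        if (t->state & MODEL_STATE_SCHED) {
                t->state &= ~MODEL_STATE_SCHED;
                t->runs++;
        }
}

/*
 * Models the wait loop in tasklet_kill() on a single CPU.
 * waiter_gives_up_cpu says whether each wait step lets the softirq side
 * run (as a working yield()/msleep() would). Returns the number of wait
 * steps needed, or -1 if the bit never cleared within max_polls steps.
 */
int model_tasklet_kill(struct model_tasklet *t, bool waiter_gives_up_cpu,
                       int max_polls)
{
        int polls = 0;

        assert(max_polls >= 0);
        while (model_test_and_set_sched(t)) {
                do {
                        if (++polls > max_polls)
                                return -1;      /* would spin forever */
                        if (waiter_gives_up_cpu)
                                model_run_pending(t);
                } while (t->state & MODEL_STATE_SCHED);
        }
        /* We own SCHED now, so nobody can reschedule; drop it again. */
        t->state &= ~MODEL_STATE_SCHED;
        return polls;
}
```

With a pending tasklet, model_tasklet_kill(&t, true, 1000) returns after
one wait step, while model_tasklet_kill(&t, false, 1000) gives up with -1
because nothing ever clears the bit. Whether the real msleep(1) path under
PREEMPT_RT ends up in the second situation is exactly the open question.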
Greets,
Pieter