* Rui Nuno Capela <rncbc(a)rncbc.org> wrote:
> I've been testing 2.6.9-rc1-mm1-S3 SMP on a P4 HT SUSE 9.1 Pro box. In
> fact I've been struggling with the VP's ever since O3, while on
> SMP/HT. It hasn't been a pleasant experience, to say the least.
>
> My latest findings, S3 inclusive, go something like this:
>
> - can only have it stable or even bootable if CONFIG_SCHED_SMT is not
>   set.
> - systematic hardlocks when jackd -R is started while on
>   softirq-preempt=1 and/or hardirq-preempt=1.
> - can't get to burn CD/DVDs at all, even with that scsi_ioctl.c
>   command verification bypass.
> - can only pray while having the boot/init process reach the xdm login
>   prompt successfully.
>
> More times than I find fair and enough, it simply dies on the beach,
> so to speak. And to make things worse, right after this happens,
> some hardware components start behaving erratically. Only a (very)
> cold boot (i.e. extreme power down) seems to be the solution to this
> strange side-effect.
>
> OTOH just rebooting on a stable vanilla kernel ends all these crappy
> annoyances ;)
I tried to reproduce your problems before with no luck. I tried your
.config on my dual P4/HT box. I am also testing the VP patchset on a
dual Celeron, an 8-way Xeon, and on an AMD64 box. So it is something
special to your setup.
the initial question is whether the VP patchset is stable if you apply
-S3 but disable all the following options:

  CONFIG_PREEMPT_VOLUNTARY
  CONFIG_PREEMPT_SOFTIRQS
  CONFIG_PREEMPT_HARDIRQS
  CONFIG_PREEMPT_BKL
  CONFIG_PREEMPT_TIMING
  CONFIG_LATENCY_TRACE
this should give you a close-to-vanilla kernel. Then once you have
established a 'known point of stability', could you enable the above
options one by one (while keeping the previous options enabled) and see
which one introduces the instability? My guess would be
PREEMPT_HARDIRQS.
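
in case it helps to keep track of the steps, here is a rough sketch of that test plan in Python. It only prints the .config lines for each step: step 0 has all six options off (the baseline), and each later step turns on one more option in the order listed above. The helper names (config_fragment, test_plan) are made up for illustration and are not part of any kernel tooling; it also assumes the options can be toggled independently, which Kconfig dependencies may not allow, so run 'make oldconfig' after editing and check what actually ended up set.

```python
# Options named in this mail, in the order they were listed.
VP_OPTIONS = [
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT_SOFTIRQS",
    "CONFIG_PREEMPT_HARDIRQS",
    "CONFIG_PREEMPT_BKL",
    "CONFIG_PREEMPT_TIMING",
    "CONFIG_LATENCY_TRACE",
]


def config_fragment(enabled_count):
    """Return .config lines with the first `enabled_count` options enabled.

    Hypothetical helper: disabled options use the usual
    '# CONFIG_FOO is not set' form found in kernel .config files.
    """
    lines = []
    for index, option in enumerate(VP_OPTIONS):
        if index < enabled_count:
            lines.append(f"{option}=y")
        else:
            lines.append(f"# {option} is not set")
    return lines


def test_plan():
    """Step 0 is the all-off baseline; step k also enables option k."""
    return [config_fragment(k) for k in range(len(VP_OPTIONS) + 1)]


if __name__ == "__main__":
    for step, fragment in enumerate(test_plan()):
        print(f"--- step {step} ---")
        print("\n".join(fragment))
```

build and boot each step in turn; the first step that hangs points at the option that was added in that step.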
to debug the hard-lockups, the most reliable way is to use
nmi_watchdog=1 and a serial console to another box. It's quite painful
to debug lockups without a serial console.
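
a small sanity check can confirm that the boot parameters really reached the kernel before you start a test run. The Python sketch below reads /proc/cmdline and looks for nmi_watchdog=1 and a serial console entry. The function name check_cmdline is made up for illustration, and the console=ttyS0,115200 value in the example is only an assumption; use whatever port and speed match your serial cable setup.

```python
def check_cmdline(cmdline):
    """Report whether the NMI watchdog and a serial console were requested.

    Hypothetical helper: `cmdline` is the kernel command line as a single
    string, e.g. the contents of /proc/cmdline.
    Returns a tuple (has_nmi_watchdog, has_serial_console).
    """
    params = cmdline.split()
    has_nmi_watchdog = "nmi_watchdog=1" in params
    # Any console= entry pointing at a ttyS device counts as a serial console.
    has_serial_console = any(p.startswith("console=ttyS") for p in params)
    return has_nmi_watchdog, has_serial_console


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    with open("/proc/cmdline") as f:
        nmi, serial = check_cmdline(f.read())
    print("nmi_watchdog=1 present:", nmi)
    print("serial console present:", serial)
```

for example, a command line containing 'nmi_watchdog=1 console=ttyS0,115200 console=tty0' would pass both checks.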
Ingo