On Mon, 13 Feb 2006, Florian Schmidt wrote:
On Sun, 12 Feb 2006 20:46:08 -0800 (PST)
"Kjetil S. Matheussen" <kjetil(a)ccrma.stanford.edu> wrote:
Das_Watchdog
============
ABOUT
-----
Das_Watchdog is a program heavily and shamefully inspired by the
rt_watchdog program made by Florian Schmidt:
http://tapas.affenbande.org/?page_id=38
Hehe, why shamefully? This is open source, baby. So i'm glad there's
some alternative to my messy code ;) And btw: the two programs are still
a bit different. rt_watchdog is a daemon. I have wondered about how to
make it known to the user that it has kicked in. The only solution i
found was to write into the logs. Opening an xwindow is an interesting
solution. Does linux maybe even have a standardized way for this kinda
stuff?
Don't know. It should. Actually, I did not test the program very hard
before releasing it, so starting das_watchdog from outside X doesn't
work: the window can't be opened, because X just refuses the connection...
I eventually found a work-around though, but it involves setting up a
password-less ssh connection for root (secure, but it's a bit of work to
set up), and letting an X program run "xhost local:root" after X has
started. Not very nice, but it works.
However, this one has some improvements:
1. It works with 2.4 kernels as well as 2.6. (well, at least I think it
works with 2.6...)
2. Instead of permanently setting all realtime processes to run
non-realtime, das_watchdog only sets them temporarily.
3. When the watchdog kicks in, an X window should pop up that tells you
what's happening. (Just close it after reading the message.)
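
To illustrate point 2, here is a minimal sketch of how a process could be
demoted to non-realtime scheduling and put back later. It is not taken from
the das_watchdog source: the names saved_sched, demote_temporarily and
restore_sched are made up for this example, and it only assumes the standard
sched_getscheduler(), sched_getparam() and sched_setscheduler() calls.

```c
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper type: remembers how a process was scheduled. */
struct saved_sched {
    pid_t pid;
    int policy;
    struct sched_param param;
};

/* Save the current policy and priority of pid, then switch it to
   SCHED_OTHER. Returns 0 on success, -1 on error (errno is set by
   the call that failed). */
int demote_temporarily(pid_t pid, struct saved_sched *saved)
{
    struct sched_param normal = { .sched_priority = 0 };

    saved->pid = pid;
    saved->policy = sched_getscheduler(pid);
    if (saved->policy == -1)
        return -1;
    if (sched_getparam(pid, &saved->param) == -1)
        return -1;
    return sched_setscheduler(pid, SCHED_OTHER, &normal);
}

/* Put the process back on the policy and priority saved above. */
int restore_sched(const struct saved_sched *saved)
{
    return sched_setscheduler(saved->pid, saved->policy, &saved->param);
}
```

Restoring a realtime policy needs the right privileges (typically root),
which a watchdog of this kind would presumably have anyway.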
INSTALLING
----------
make
cp das_watchdog /usr/local/sbin/
echo '/usr/local/sbin/das_watchdog >/dev/null &' >>/etc/rc.local
This assumes an initscript style that's not used on all linux systems.
Well, this was just an example. /etc/rc.sysinit can also be used.
reboot
Also i wonder: Is it safe to simply use a static int as "event counter"?
Yes.
Might this not fail on SMP boxes?
Nope, it's safe. One thread increases the variable, and another checks
that it has been increased. If that fails, something is wrong with the
machine.
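
For reference, the "event counter" idea can be sketched roughly like this.
It is only an illustration, not the actual das_watchdog code: the names
heartbeat and starved_since_last_check are hypothetical, and the sketch uses
a C11 atomic_int rather than the plain static int discussed above, simply so
that the cross-thread access is well-defined on any machine.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names; not taken from the das_watchdog source. */
static atomic_int event_counter;  /* bumped by the low-priority thread */
static int last_seen;             /* only touched by the checker thread */

/* Called periodically from a normal-priority (SCHED_OTHER) thread. */
void heartbeat(void)
{
    atomic_fetch_add(&event_counter, 1);
}

/* Called periodically from a high-priority thread. Returns true when
   the counter has not moved since the previous check, i.e. the
   low-priority thread got no CPU time in between. */
bool starved_since_last_check(void)
{
    int now = atomic_load(&event_counter);
    bool starved = (now == last_seen);

    last_seen = now;
    return starved;
}
```

There is a single writer and a single reader, and the reader only cares
whether the value changed, so a late or missed increment just delays the
verdict by one check interval.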