On Mon, 13 Feb 2006, Florian Schmidt wrote:
  On Sun, 12 Feb 2006 20:46:08 -0800 (PST)
 "Kjetil S. Matheussen" <kjetil(a)ccrma.stanford.edu> wrote:
  Das_Watchdog
 ============
 ABOUT
 -----
 Das_Watchdog is a program heavily and shamefully inspired by the
 rt_watchdog program made by Florian Schmidt:
 
http://tapas.affenbande.org/?page_id=38 
 Hehe, why shamefully? This is open source, baby. So i'm glad there's
 some alternative to my messy code ;) And btw: the two programs are still
 a bit different. rt_watchdog is a daemon. I have wondered about how to
 make it known to the user that it has kicked in. The only solution i
 found was to write into the logs. Opening an xwindow is an interesting
 solution. Does linux maybe even have a standardized way for this kinda
 stuff?
 
Don't know. It should. Actually, I did not test the program very hard
before releasing it, so it won't start when launched outside X: X just
refuses the connection.
I eventually found a workaround, but it involves setting up a
password-less ssh connection for root (secure, but a bit of work to set
up) and letting an X program run "xhost local:root" after X has started. Not
very nice, but it works.
 However, this one has some improvements:
 1. It works with 2.4 kernels as well as 2.6. (well, at least I think it
     works with 2.6...)
 2. Instead of permanently setting all realtime processes to run
     non-realtime, das_watchdog only sets them temporarily.
 3. When the watchdog kicks in, an X window should pop up that tells you
     what's happening. (Just close it after reading the message.)
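
 Point 2 above can be sketched roughly as follows. This is only an
 illustration under assumptions, not code taken from das_watchdog: the
 names saved_sched, demote_temporarily and restore_sched are made up here,
 and only the standard sched_getscheduler(), sched_getparam() and
 sched_setscheduler() calls are used. The idea is to remember the policy
 and priority of a process before switching it to SCHED_OTHER, so they can
 be put back later.

```c
#include <sched.h>
#include <sys/types.h>

/* Hypothetical record of one process's scheduling state. */
struct saved_sched {
    int policy;
    struct sched_param param;
};

/* Remember the current policy and priority of pid, then switch it to the
   ordinary time-sharing policy. Returns 0 on success, -1 on error. */
int demote_temporarily(pid_t pid, struct saved_sched *saved)
{
    saved->policy = sched_getscheduler(pid);
    if (saved->policy == -1)
        return -1;
    if (sched_getparam(pid, &saved->param) == -1)
        return -1;

    /* SCHED_OTHER requires a static priority of 0. */
    struct sched_param normal = { .sched_priority = 0 };
    return sched_setscheduler(pid, SCHED_OTHER, &normal);
}

/* Put the remembered policy and priority back. */
int restore_sched(pid_t pid, const struct saved_sched *saved)
{
    return sched_setscheduler(pid, saved->policy, &saved->param);
}
```

 Restoring a realtime policy normally needs root (or CAP_SYS_NICE), which a
 watchdog started from an init script would have.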
 INSTALLING
 ----------
 make
 cp das_watchdog /usr/local/sbin/
 echo '/usr/local/sbin/das_watchdog >/dev/null &' >>/etc/rc.local
 This assumes an initscript style that's not used on all linux systems.
 
 
Well, this was just an example. /etc/rc.sysinit can also be used.
   reboot 
 Also i wonder: Is it safe to simply use a static int as "event counter"? 
 
Yes.
  Might this not fail on SMP boxes?
 
Nope, it's safe. One thread increases the variable, and another checks that
it has been increased. If that fails, something is wrong with the
machine.
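
For illustration, the counter scheme looks roughly like the sketch below.
It is not the actual das_watchdog source: the names event_counter, ticker
and counter_advanced are made up, and it uses a C11 atomic_int in place of
the plain static int discussed above, purely so the sketch is well-defined
on any compiler and machine. Thread creation and priority setup are left
out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Shared event counter (hypothetical name). */
atomic_int event_counter;

/* Low-priority side, with a signature suitable for pthread_create():
   bumps the counter once a second to prove it still gets CPU time. */
void *ticker(void *arg)
{
    (void)arg;
    for (;;) {
        atomic_fetch_add(&event_counter, 1);
        sleep(1);
    }
    return NULL;
}

/* High-priority side: returns true if the counter has moved since the
   previous check, and remembers the value it saw. */
bool counter_advanced(int *last_seen)
{
    int now = atomic_load(&event_counter);
    bool advanced = (now != *last_seen);
    *last_seen = now;
    return advanced;
}
```

If counter_advanced() keeps returning false over a period much longer than
the ticker's sleep, the low-priority thread is being starved and the
watchdog should kick in.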