ArsDigita Keepalive

for AOLserver by Ben Adida and Philip Greenspun, part of ArsDigita Free Tools
ArsDigita Keepalive is a system that monitors your web services at regular, short intervals and takes action to resolve the problems it finds. If Keepalive fails to reach a page, it chooses an action based on how many consecutive failures it has already seen and on its configuration parameters.

Keepalive is built using AOLserver (free) and takes advantage of AOLserver's built-in scheduler (like Unix cron but lighter weight) and Tcl API (which includes a call to HTTP GET a page from another server). However, unlike most of our AOLserver products, Keepalive does not require you to install an RDBMS. Web servers generally get stuck because of problems with the RDBMS, so a monitor that depended on an RDBMS would be self-defeating.

Although we generally use Keepalive to monitor AOLserver-based Web services, it will work fine to monitor any HTTP service on a Unix machine.
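
The check-and-escalate cycle described above can be illustrated with a small sketch. This is not Keepalive's own code (Keepalive is written in Tcl for AOLserver); it is a Python illustration under stated assumptions. The names page_is_reachable, choose_action, and Monitor, the action labels "notify" and "restart", and the thresholds notify_after and restart_after are all hypothetical and do not correspond to Keepalive's real configuration parameters.

```python
import http.client
import urllib.error
import urllib.request


def page_is_reachable(url, timeout=10.0):
    """Return True if an HTTP GET of url completes with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Connection refused, timeout, HTTP error status, malformed URL...
        return False


def choose_action(consecutive_failures, notify_after=1, restart_after=3):
    """Map a failure streak to an action label (hypothetical policy)."""
    if consecutive_failures >= restart_after:
        return "restart"
    if consecutive_failures >= notify_after:
        return "notify"
    return "none"


class Monitor:
    """Tracks consecutive failures for one monitored URL."""

    def __init__(self, url, fetch=page_is_reachable):
        self.url = url
        self.fetch = fetch  # injectable so the policy can be tried offline
        self.consecutive_failures = 0

    def poll_once(self):
        """Check the page once and return the action label to take."""
        if self.fetch(self.url):
            self.consecutive_failures = 0  # any success resets the streak
        else:
            self.consecutive_failures += 1
        return choose_action(self.consecutive_failures)
```

A scheduler (cron, or AOLserver's built-in one in the real system) would call poll_once at each interval and dispatch on the returned label. Keeping the counter in memory rather than in a database matches the point above that the monitor should not depend on an RDBMS.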

Installation

Which Shell Command?

You might well ask yourself which shell command will restart a Web server. It depends. In the case of AOLserver, we run the server by inserting a line in /etc/inittab:
nsjw:34:respawn:/home/nsadmin/bin/nsd -i -c /home/nsadmin/nsd.ini
which tells Unix to restart nsd if it should die for any reason. Thus Keepalive just needs to kill the existing nsd process. The problem is that Web servers must be owned by root if they are to grab port 80, and Keepalive can't kill a root-owned Web server unless it runs as root itself (a security risk). The solution at ArsDigita is to build a setuid Perl script, restart-aolserver, that Keepalive can call:
#!/usr/local/bin/perl

## Restarts an AOLserver. Takes as its only argument the name of the server to kill.

## This is a perl script because it needs to run setuid root, 
## and perl has fewer security gotchas than most shells.


$ENV{'PATH'} = '/sbin:/bin';

# uncomment this stuff if you're at an installation where a server 
# takes a long time to restart or keeps important state

# if (scalar(@ARGV) == 0) {
#     die "Don't run this without any arguments!";
# }

$server = shift;

$< = $>; # set realuid to effective uid (root)

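sub getpids {
    ## get the PIDs of all nsd processes started with $server.ini
    my $ps_output = `/usr/bin/ps -ef`;
    my @pids;
    foreach (split(/\n/, $ps_output)) {
        # \Q...\E quotes regex metacharacters in the server name,
        # and \. matches a literal dot before "ini"
        next unless /^\s*\S+\s+(\d+).*nsd.*\Q$server\E\.ini/;
        push(@pids, $1);
    }
    @pids;
}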

@pids = &getpids;
print "Killing ", join(" ", @pids), "\n";
kill 'KILL', @pids;
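
The PID-matching step in the script above can be tried without root and without killing anything. The following Python sketch mirrors the script's regular expression against captured `ps -ef` text. The function name matching_pids and the sample process listing are made up for illustration. The server name is escaped so that it is matched literally.

```python
import re


def matching_pids(ps_output, server=""):
    """Return the PIDs of nsd processes whose command line names <server>.ini.

    With an empty server name, every nsd process started with some .ini file
    matches, which is what the Perl script does when run without arguments.
    """
    pattern = re.compile(
        r"^\s*\S+\s+(\d+).*nsd.*" + re.escape(server) + r"\.ini"
    )
    pids = []
    for line in ps_output.splitlines():
        match = pattern.search(line)
        if match:
            pids.append(int(match.group(1)))  # second column of ps -ef
    return pids


# Hypothetical ps -ef output used only to exercise the pattern.
SAMPLE_PS = (
    "     UID   PID  PPID  C    STIME TTY      TIME CMD\n"
    " nsadmin  1234     1  0 10:00:00 ?        0:05 "
    "/home/nsadmin/bin/nsd -i -c /home/nsadmin/nsd.ini\n"
    " nsadmin  2345     1  0 10:00:00 ?        0:05 "
    "/home/nsadmin/bin/nsd -i -c /home/nsadmin/other.ini\n"
    "    root  3456     1  0 10:00:00 ?        0:00 /usr/sbin/cron\n"
)
```

Calling matching_pids(SAMPLE_PS, "other") selects only the second nsd process, while matching_pids(SAMPLE_PS) selects both. Checking the selection this way before wiring a restart command into a monitor reduces the chance of killing the wrong server.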

License

This is open-source software, copyright 1998 ArsDigita, LLC and licensed under the GNU General Public License.

Support and Customization

If you want an extended version of Keepalive or support, you can hire the programmer of your choice to install, maintain, and customize Keepalive. ArsDigita offers support as well, but probably not at a price that you'd be happy to pay.
ben@arsdigita.com

Reader's Comments

I think using aolserver to keep another aolserver alive is a bit risky. Even without the RDBMS - if both your aolservers hang for the same reason then what? I would feel much more comfortable using cron and a shell script.

-- David Cotter, September 21, 2001
