[OPEN-ILS-DEV] Fix for shutdown problems

Scott McKellar mck9 at swbell.net
Tue May 4 09:59:37 EDT 2010


I just committed a fix to OSRF trunk for a shutdown bug.

Every once in a while when I shut down OSRF (using the command
"osrf_ctl.sh -l -a stop_all"), the script would fail when shutting down
the C services, and I would have to kill the remaining processes by
other means.

After some head-scratching I came up with a theory for what was going
wrong, and a fix for it.  It's hard to be sure whether it will really
fix the problem because the problem was intermittent and unpredictable.
However, the new design at least seems to be no worse than the old.

------------------

If you don't care about the technical details you can skip this section.

The osrf_ctl.sh script had been using ps + grep to capture
the process ID (PID) of the opensrf-c daemon so that it could
send a SIGINT signal to it later to shut it down.  However, the
script was also capturing the PIDs of the daemon's child processes
(i.e. the listener processes), which hadn't yet changed to
application-specific names.

[Explanation: the ps command reports the command line used to invoke
each process.  Both the opensrf-c daemon and the listeners that it
spawns change this command line by overwriting their argv arrays,
in order to change what ps reports.]
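
To make the over-capture concrete, here is a minimal sketch.  It is not
the code that was in osrf_ctl.sh (the exact ps + grep command is not
reproduced in this message).  The helper name match_pids, the PIDs, and
the command strings in the sample listing are all made up for
illustration.  It filters a canned "PID COMMAND" listing instead of
live ps output, so the result is repeatable.

```shell
#!/bin/sh
# Hypothetical helper: read "PID COMMAND..." lines on stdin and print the
# PID (first field) of every line that contains the given string.
match_pids() {
    pattern=$1
    awk -v pat="$pattern" 'index($0, pat) > 0 { print $1 }'
}

# Made-up listing: one daemon (1201), two children that still carry the
# daemon's command line (1202, 1203), and one child that has already
# overwritten its argv with an application-specific name (1204).
sample_ps='1201 opensrf-c
1202 opensrf-c
1203 opensrf-c
1204 listener-for-some-app'

# Matching on the name alone returns the daemon AND the two children
# that have not renamed themselves yet.
printf '%s\n' "$sample_ps" | match_pids "opensrf-c"
# prints 1201, 1202 and 1203, one per line
```

Which children show up depends on timing: a child that has already
renamed itself drops out of the match, which fits the intermittent
nature of the failure.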

As a result, when shutting down, the listener processes would
receive signals from two different sources: from the opensrf-c
daemon and from the surrounding shell script.  If the signal
from opensrf-c got there first, the kill from the script would
fail, and the script would abort, even though the process had
already been successfully killed.
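
The failure mode can be reproduced in miniature.  The sketch below is
only an illustration under stated assumptions: "sleep" stands in for a
listener process, and SIGTERM is used instead of SIGINT because a
non-interactive shell starts background jobs with SIGINT ignored.

```shell
#!/bin/sh
# Placeholder for a listener process.
sleep 30 &
pid=$!

# First signal: plays the role of the daemon shutting down its child.
kill -TERM "$pid"
# Reap the child so that the PID no longer refers to a live process.
wait "$pid" 2>/dev/null || true

# Second signal: plays the role of the surrounding script.  The process
# is already gone, so kill reports failure even though nothing is wrong.
if kill -TERM "$pid" 2>/dev/null; then
    second=delivered
else
    second=failed
fi
echo "second signal: $second"
```

A script that treats a failed kill as fatal would stop here, which
matches the behavior described above.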

The solution is for opensrf-c to write the daemon's PID directly
to a file, instead of relying on ps + grep to capture it.  The
file name is specified by an additional command line parameter,
which (for upward compatibility) is currently optional.
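
As a rough sketch of how a control script might use such a file (this
is not the committed osrf_ctl.sh code): the function name
stop_from_pidfile, the temporary file path, and the choice to delete
the file afterwards are all assumptions made for illustration.  The
name of the new opensrf-c parameter is not given in this message, so it
is not shown.

```shell
#!/bin/sh
# Hypothetical helper: signal the single PID recorded in a file.
# Returns 0 if the signal was sent or the process was already gone,
# 1 if the file is unreadable or does not contain a plain number.
stop_from_pidfile() {
    pidfile=$1
    signal=${2:-INT}          # the daemon is shut down with SIGINT

    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # kill -0 sends nothing; it only checks that the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        kill -s "$signal" "$pid" 2>/dev/null || true
    fi
    rm -f "$pidfile"
    return 0
}

# Demonstration with a placeholder process.  TERM is used here because a
# non-interactive shell starts background jobs with SIGINT ignored.
demo_file=${TMPDIR:-/tmp}/pidfile_demo.$$
sleep 30 &
echo "$!" > "$demo_file"
stop_from_pidfile "$demo_file" TERM && echo "stopped"
wait 2>/dev/null || true
```

Because exactly one PID is recorded, only the daemon is signalled by
the script, and the daemon remains the only source of signals for its
children.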

----------------

If you install or reinstall OSRF from scratch, this change will be
included automatically.  If you are updating an existing
instance of OSRF, then pay attention to the following paragraphs:

Because this change involves a change to the osrf_ctl.sh
script, it will be necessary to run configure before the
usual make and make install.  If you are using the usual
configuration, run the following from within the OSRF
trunk directory:

./configure --prefix=/openils --sysconfdir=/openils/conf

Of course, if you are using some custom configuration, change
the above command accordingly.

If you don't run configure, the old osrf_ctl.sh script will
continue to work as it has in the past, and you won't get
the benefit of the change.

Scott McKellar


