[OPEN-ILS-DEV] Crashing problem with 1.2.3.1

Garry Dunn garry at trellisconsulting.ca
Fri Nov 21 08:27:22 EST 2008


To all,

We (the Innisfil Public Library) are in need of some help.  We've been 
running Evergreen (for real!) for about 40 days and continue to have 
unpredictable crashes (on average once or twice a day).  It's done it 
from day one (and did it once on the test system before going live), but 
it's difficult to reproduce.  I run a third test system (much smaller in 
terms of memory/CPU), but I can't make it crash.  I only have 1 or 2 
staff clients connected to it at a time, so it sees a much lower volume 
of activity.  All three systems (1 live and 2 test) are loaded with the 
same data.  Here are some details:

1) to fix the problem, we simply issue the stop_all command, then run 
the 3 start commands (start_router/start_perl/start_c) and everything is 
fine for a while.  We never have to touch PostgreSQL or Apache.
2) the system is running on 1 server 
(memcached/ejabberd/apache/opensrf/postgres/...).  Debian Etch OS with 
Evergreen 1.2.3.1.  PostgreSQL 8.1.  4 GB of RAM on a new Dell PowerEdge 
server (lots of hard drive space on RAID).  The server is not running 
anything else--just Evergreen.
3) It tends to happen when staff is dealing with a patron who has a lot 
of holds/books out/fines, although that's not a guarantee.  Once the 
system is restarted, staff can go back to the problem patron and do what 
they'd like and it will be fine.
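
For reference, the restart sequence from item 1 can be sketched as a small Python helper. This is only a sketch under assumptions: it assumes the four actions are passed to OpenSRF's control script as `osrf_ctl.sh -l -a <action>`, and the script location (`OSRF_CTL`) and the helper names are hypothetical placeholders to adjust for a given install.

```python
import subprocess

# Assumption: each action is invoked as `osrf_ctl.sh -l -a <action>`.
# OSRF_CTL is a placeholder; point it at the control script on your server.
OSRF_CTL = "osrf_ctl.sh"

# Order taken from the description above: stop everything, then
# bring up the router, the Perl services, and the C services.
RESTART_ACTIONS = ["stop_all", "start_router", "start_perl", "start_c"]


def restart_commands(ctl=OSRF_CTL):
    """Return the command lines for a full restart, in order."""
    return [[ctl, "-l", "-a", action] for action in RESTART_ACTIONS]


def restart(run=subprocess.check_call):
    """Run each command in order; stop at the first one that fails."""
    for cmd in restart_commands():
        run(cmd)
```

Passing a different `run` callable (for example `print`) shows the commands without executing them.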

We've got lots of logs captured from when it happens and I can provide 
snippets of those if interested.  The only thing I see in the logs is a 
message similar to this, approaching each failure:

In osrfsys.log:

[2008-11-12 12:15:21] open-ils.circ [ERR :17229:CStoreEditor.pm:86:12265090521716244] editor[1|11434] error starting database transaction
[2008-11-12 12:15:21] open-ils.circ [ERR :17229:CStoreEditor.pm:269:12265090521716244] CStoreEditor lost it's connection!!

The logs show quite a few error messages like this leading up to the 
complete crash (sometimes hours before the actual crash).
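
To see how these errors build up before a crash, they can be counted per hour. Below is a minimal Python sketch, assuming each entry in osrfsys.log sits on a single line in the shape of the excerpt above (the mail client wrapped the lines quoted here). The regular expression and the function name `count_cstore_errors` are illustrative, not part of Evergreen.

```python
import re
from collections import Counter

# Assumed entry shape, taken from the excerpt above:
# [2008-11-12 12:15:21] open-ils.circ [ERR :17229:CStoreEditor.pm:86:...] message
LINE_RE = re.compile(
    r"^\[(?P<date>\d{4}-\d{2}-\d{2}) (?P<hour>\d{2}):\d{2}:\d{2}\]\s+"
    r"(?P<service>\S+)\s+\[ERR\s*:(?P<pid>\d+):(?P<file>[^:]+):(?P<line>\d+):"
)


def count_cstore_errors(lines):
    """Count ERR entries raised from CStoreEditor.pm per (date, hour)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("file") == "CStoreEditor.pm":
            counts[(m.group("date"), m.group("hour"))] += 1
    return counts


# The two entries quoted above, unwrapped onto single lines.
sample = [
    "[2008-11-12 12:15:21] open-ils.circ [ERR :17229:CStoreEditor.pm:86:"
    "12265090521716244] editor[1|11434] error starting database transaction",
    "[2008-11-12 12:15:21] open-ils.circ [ERR :17229:CStoreEditor.pm:269:"
    "12265090521716244] CStoreEditor lost it's connection!!",
]

for (date, hour), n in sorted(count_cstore_errors(sample).items()):
    print(f"{date} {hour}:00  {n}")
```

To run it against a real log, pass an open file object instead of `sample`; comparing the hourly counts with the restart times would show whether the errors ramp up steadily or arrive in bursts.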

We've been thinking it's a performance issue, so we've experimented a 
bit with Dan's tuning tweaks for PostgreSQL (found here: 
http://www.coffeecode.net/archives/156-Tuning-PostgreSQL-for-Evergreen-on-a-test-server.html). 
They don't seem to make a difference (but we've only tried a couple of 
different combinations).

If anyone can provide some guidance about how to further examine/resolve 
the problem, we'd greatly appreciate it.

Thanks,

Garry

