[OPEN-ILS-GENERAL] Evergreen 2.9 and PostgreSQL 9.4 - A Cautionary Tale

Chris Sharp csharp at georgialibraries.org
Wed Feb 10 14:41:45 EST 2016


Hello all,

As watchers of the Evergreen IRC channel, know, the PINES upgrade from Evergreen 2.7.2/PostgreSQL 9.3 to Evergreen 2.9.1/PostgreSQL 9.4 was a bit bumpy.  Our system slowed to a crawl on go-live day, and we were quick to determine that the culprit was the database.  After consulting with Mike Rylander of Equinox Software, we implemented several tweaks to our system that I'll share here in hopes that they may help others in the future:

The slowness appears to have been due to two problem areas:

Bad Query Plans:

- we upped the join_collapse_limit from the default value of 8 to 12 - Dan Wells discusses this possibility in his bug report about slow copy load in the OPAC (https://bugs.launchpad.net/evergreen/+bug/1527731).
- we altered the actor.org_unit_descendants, adding a realistic estimate of the number of rows to be expected:
     ALTER FUNCTION actor.org_unit_descendants(int) ROWS 300;
     ALTER FUNCTION actor.org_unit_descendants(int,int) ROWS 300;
- we dropped a duplicate index on the action.hold_request table that was apparently added to address an issue in the past (ahr_current_copy_open_idx)
- we added an index to action.circulation to address an issue with slow collections queries: 
     CREATE INDEX CONCURRENTLY circ_circ_lib_open_idx ON action.circulation
     (usr,circ_lib) WHERE xact_finish IS NULL;

PostgreSQL Memory Usage:

- we reduced our shared_buffers size to 32GB from 128GB - we had calculated the higher value from the PostgreSQL documentation (http://www.postgresql.org/docs/9.4/static/runtime-config-resource.html#GUC-SHARED-BUFFERS) recommendation of 1/4 of installed RAM (512GB in our case).  We immediately saw a reduction of resource consumption on the server.
- on the underlying OS, we turned off transparent huge pages (THP - see http://unix.stackexchange.com/questions/99154/disable-transparent-hugepages for details on how one might do this).  Further research revealed that turning off THP is generally recommended by Oracle and PostgreSQL.

We also followed Mike's recommendation to run the hold targeter cron job nightly rather than every 15 minutes, but that probably didn't have much to do with the root cause (aside from adding queries to the pile-up when slowness struck).

Since we made the above changes, our database servers have increased speed and are performing very happily.  I just wanted to share our experience with the Evergreen Community in hopes that others can benefit.

-- 
Chris Sharp
PINES System Administrator
Georgia Public Library Service
1800 Century Place, Suite 150
Atlanta, Georgia 30345
(404) 235-7147
csharp at georgialibraries.org
http://pines.georgialibraries.org/


More information about the Open-ils-general mailing list