[OPEN-ILS-GENERAL] Simple database replication for PostgreSQL 9.0

Mike Rylander mrylander at gmail.com
Mon Aug 1 10:56:49 EDT 2011


On Thu, Jul 28, 2011 at 3:23 PM, Dan Scott <dan at coffeecode.net> wrote:
> If you are running PostgreSQL 9.0 (and if you're on Evergreen 2.0 or
> above, you probably should be - even though Evergreen 2.0 complains
> about not finding a 9.0-specific schema, it just works)
>
> ... and you want a simple recipe for a replicated hot standby database
> server (so that you can point your reporter at it and run big scary
> queries that would otherwise bog down your production database server)
>
> ... then the following instructions for setting up PostgreSQL 9.0's
> built-in binary replication might be for you:
> http://bzr.coffeecode.net/replication9/postgresql-9-replication.html
> (the article is CC-BY-SA 3.0 Unported, hi DIG!)
>
> I found most of the tutorials for setting up replication with
> PostgreSQL 9.0 include WAL archiving / shipping as well. While you
> definitely want to set up WAL archiving / shipping to support
> point-in-time-recovery (a hot standby server isn't going to help you
> very much if someone issues a DROP TABLE biblio.record_entry CASCADE
> on your production server and you don't have logs that you can use to
> recover to just before that statement was issued), I thought it would
> be helpful to distil the replication instructions down to the bare
> minimum. As far as I can tell, PostgreSQL 9.0's streaming replication
> avoids any of the weird obstacles that Slony throws in your path - so
> for many Evergreen sites, it should be a big win.
>
> Of course, if other sites have already experimented with PostgreSQL
> 9.0 replication and have any further advice (or warnings!) to offer,
> it would be great to hear from you.
>
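For reference, the bare-minimum recipe in the linked article boils down to roughly
the following (a sketch only — the host, user, password, and segment counts here
are placeholders, and Dan's write-up is the authoritative version):

```
# primary: postgresql.conf
wal_level = hot_standby
max_wal_senders = 3
wal_keep_segments = 32   # retain enough WAL for the standby to catch up

# primary: pg_hba.conf -- let the standby connect for replication, e.g.:
# host  replication  replicator  192.168.1.10/32  md5

# standby: postgresql.conf
hot_standby = on

# standby: recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=primary.example.org user=replicator password=secret'
```

After a base backup of the primary is copied to the standby, starting the
standby with that recovery.conf begins streaming replication.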

Hot Standby in 9.0+ is indeed an option for a reporting server.  One
thing to keep in mind, however, is that certain actions on the primary
database can cause queries on the secondary to fail.  In particular, a
vacuum on the primary that prunes deleted tuples still in use by a
query on the secondary will cause that query to be canceled after a
configurable timeout (max_standby_streaming_delay and
max_standby_archive_delay on the standby).  The canceled reports can,
of course, be re-run, and may very well succeed the second time.  This
is not a problem with Slony, because its replication mechanism can
block behind queries on the secondary server indefinitely.  So there
are trade-offs.  But if your reporting queries are generally
short-lived, or you don't mind cleaning up after a forced cancellation
(or you want to fund or develop functionality to automatically retry
in the specific case of a replication-canceled query), then the upside
may be much greater than the drawbacks.
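On the retry idea: PostgreSQL reports these cancellations with SQLSTATE 40001
and a message mentioning a "conflict with recovery", so a wrapper can detect
that case and re-run the query.  A minimal sketch (the query_fn callable and
the string-matching detection are illustrative assumptions, not Evergreen or
psycopg2 code):

```python
import time

# Hot-standby cancellations carry a message like:
# "canceling statement due to conflict with recovery"
CONFLICT_MARKER = "conflict with recovery"

def is_recovery_conflict(exc):
    """Heuristic: treat an error as a replication-induced cancellation
    if its message mentions a recovery conflict."""
    return CONFLICT_MARKER in str(exc)

def run_with_retry(query_fn, attempts=3, delay=1.0):
    """Call query_fn(); if it fails because the standby canceled the
    query to keep replication moving, wait and re-run it.  Any other
    error, or exhausting the attempts, re-raises."""
    for attempt in range(attempts):
        try:
            return query_fn()
        except Exception as exc:
            if not is_recovery_conflict(exc) or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

With a real driver, query_fn would open a fresh transaction against the
standby and run the report, so each retry gets a new snapshot.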

Thanks, Dan, for pointing out the option!

-- 
Mike Rylander
 | Director of Research and Development
 | Equinox Software, Inc. / Your Library's Guide to Open Source
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker at esilibrary.com
 | web:  http://www.esilibrary.com
