[OPEN-ILS-DEV] Large Bibliographic Imports

Mike Rylander mrylander at gmail.com
Wed Aug 6 17:02:16 EDT 2008


On Wed, Aug 6, 2008 at 4:30 PM, Brandon W. Uhlman
<brandon.uhlman at bclibrary.ca> wrote:
> I have about 960 000 bibliographic records I need to import into an
> Evergreen system. The database server is dual quad-core Xeons with 24GB of
> RAM.
>
> Currently, I've split the bibliographic records into 8 batches of ~120K
> records each, did the marc_bre/direct_ingest/parellel_pg_loader dance, but
> one of those files has been chugging along in psql now for more than 16
> hours. How long should I expect these files to take? Would more smaller
> files load more quickly in terms of total time for the same full recordset?
>
> I notice that the insert into metabib.full_rec seems to be taking by far the
> longest. It does have more records than any of the other pieces to import,
> but the time taken still seems disproportionate.
>
> I notice that metabib.full_rec has this trigger --
> zzz_update_materialized_simple_record_tgr AFTER INSERT OR DELETE OR UPDATE
> ON metabib.full_rec FOR EACH ROW EXECUTE PROCEDURE
> reporter.simple_rec_sync().
> Is the COPY INTO calling this trigger every time I copy in a new record? If
> so, can I remove to trigger to defer this update, and do it en masse
> afterward? Would it be quicker?
>
> Just looking for any tips I can use to increase the loading speed of
> huge-ish datasets.

There is a new materialized view (a of 1.2.2) based on the trigger you
noted.  However, as of 1.2.2.1 there are management functions to
dis/enable that when needed.  So, inside the transaction started in
the main pg_loader sql output file and before the first COPY
statement, you'll want to add:

SELECT reporter.disable_materialized_simple_record_trigger();

and then at the end, before you commit, add:

SELECT reporter.enable_materialized_simple_record_trigger();

The latter will take a few minutes (perhaps 30) but not the 16 hours
you're suffering through right now.

-- 
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker at esilibrary.com
 | web:  http://www.esilibrary.com


More information about the Open-ils-dev mailing list