[OPEN-ILS-DEV] Large Bibliographic Imports

Dan Scott denials at gmail.com
Wed Aug 6 16:47:54 EDT 2008


Hey Brandon:

The full text indexes are absolutely the key - check out this thread
from July 2nd: http://list.georgialibraries.org/pipermail/open-ils-dev/2008-July/003265.html
- I think it addresses your questions for the most part.

And yeah, as Mike notes, we really should document that in the
appropriate section of the wiki. Especially as I'm about to embark on
a refresh of our several-million records :0
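
On the trigger question specifically: PostgreSQL's COPY does fire row-level triggers, so a FOR EACH ROW trigger on metabib.full_rec would run once per inserted row. A minimal sketch of deferring it, assuming ALTER TABLE ... DISABLE/ENABLE TRIGGER is acceptable on your system, is below. The helper name wrap_bulk_load and the COPY placeholder are hypothetical, not part of Evergreen, and whatever reporter.simple_rec_sync() maintains would still have to be rebuilt afterward (not shown here).

```python
def wrap_bulk_load(table, trigger, load_statements):
    """Return a list of SQL statements that disables one trigger on
    `table`, runs the given load statements, then re-enables it.

    Hypothetical helper for illustration; it only builds SQL text and
    does not connect to a database.
    """
    return (
        ["BEGIN;", f"ALTER TABLE {table} DISABLE TRIGGER {trigger};"]
        + list(load_statements)
        + [f"ALTER TABLE {table} ENABLE TRIGGER {trigger};", "COMMIT;"]
    )


if __name__ == "__main__":
    statements = wrap_bulk_load(
        "metabib.full_rec",
        "zzz_update_materialized_simple_record_tgr",
        # Placeholder: substitute the statements produced by your loader.
        ["-- COPY metabib.full_rec (...) FROM STDIN;"],
    )
    print("\n".join(statements))
```

The printed script could be fed to psql. Wrapping everything in one transaction means the trigger state is rolled back if the load fails, at the cost of holding a lock on the table for the duration of the load.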

Dan

2008/8/6 Brandon W. Uhlman <brandon.uhlman at bclibrary.ca>:
> I have about 960 000 bibliographic records I need to import into an
> Evergreen system. The database server is dual quad-core Xeons with 24GB of
> RAM.
>
> Currently, I've split the bibliographic records into 8 batches of ~120K
> records each and done the marc_bre/direct_ingest/parallel_pg_loader dance, but
> one of those files has been chugging along in psql now for more than 16
> hours. How long should I expect these files to take? Would a larger number of
> smaller files load more quickly in terms of total time for the same full recordset?
>
> I notice that the insert into metabib.full_rec seems to be taking by far the
> longest. It does have more records than any of the other pieces to import,
> but the time taken still seems disproportionate.
>
> I notice that metabib.full_rec has this trigger --
> zzz_update_materialized_simple_record_tgr AFTER INSERT OR DELETE OR UPDATE
> ON metabib.full_rec FOR EACH ROW EXECUTE PROCEDURE
> reporter.simple_rec_sync().
> Is the COPY INTO calling this trigger every time I copy in a new record? If
> so, can I remove the trigger to defer this update, and do it en masse
> afterward? Would it be quicker?
>
> Just looking for any tips I can use to increase the loading speed of
> huge-ish datasets.
>
> Cheers,
>
> Brandon
>
> ======================================
> Brandon W. Uhlman, Systems Consultant
> Public Library Services Branch
> Ministry of Education
> Government of British Columbia
> 850-605 Robson Street
> Vancouver, BC  V6B 5J3
>
> Phone: (604) 660-2972
> E-mail: brandon.uhlman at gov.bc.ca
>        brandon.uhlman at bclibrary.ca
>
>



-- 
Dan Scott
Laurentian University

