[OPEN-ILS-DEV] import data from voyager

Jason Zou qzou at lakeheadu.ca
Fri May 25 11:34:51 EDT 2007


Dan Scott wrote:
> On 24/05/07, Jason Zou <qzou at lakeheadu.ca> wrote:
>> Don Hamilton wrote:
>> > Congratulations, Jason... You beat me by a long shot.... But now
>> > that you have, (and now that Rene has our system starting with NO
>> > logged errors!) can you give us some details of what exactly you did?
>> > Once I build on what you've done and loaded some records, I'll
>> > volunteer for a wiki account and add a 'voyager data load' section
>> > once we're done.
>> >
>> > Also, could you define "very slow"? As I mentioned in a query a week
>> > or so ago, I'd like to iterate through loading some 5 million bibs,
>> > and if 10,000 are very slow, I'm afraid to think what 5m would be. I
>> > have some experience with bulk loading to fresh data bases, and will
>> > take a stab at that if the 'regular' load is too slow.
>> >
>> > don
>> >
>> > >>> qzou at lakeheadu.ca 5/24/2007 10:44 AM >>>
>> > Hi everyone,
>> >
>> > By using scripts in the Open-ILS/src/extra/import/, I imported about
>> > 11,000 MARC records. Although the process was very slow, it seems that
>> > it is working. And records have been added into the following tables:
>> >           biblio.record_entry
>> >           metabib.rec_descriptor
>> >           metabib.full_rec
>> >           metabib.title_field_entry
>> >           metabib.author_field_entry
>> >           metabib.subject_field_entry
>> >           metabib.keyword_field_entry
>> >           metabib.series_field_entry
>> >
>> > But when I tried to use the OPAC to find some records, I always got nothing.
>> > I am wondering whether there is something that I missed.
>> >
>> > Any suggestions are highly appreciated.
>> >
>> > Jason
>> >
>> > Lakehead University
>> >
>> Hi Don,
>>
>> Thanks. I just want to see how fast the importing will be. I exported
>> 12,343 records from our Voyager database. It took me about 1.5 hours to
>> load them into Evergreen.
>> Compared with the conversion process, storing the records in the Pg
>> database was surprisingly quick: it took one minute or less.
>>
>> By the way, my server is a P4 2.4 GHz machine with 1.5 GB of RAM running FC5.
>>
>> Jason
>>
>> Lakehead University
>>
>>
>
> Strange. Here are the `time` results of running an import of the 14,449
> Gutenberg records (including all of the steps described in my previous
> email in this thread) on the Gentoo VMWare image with 512MB of RAM:
>
> real    51m7.623s
> user    39m24.824s
> sys     2m58.639s
>
> I'm not sure how my virtual machine with 1/3 of your physical
> machine's RAM could possibly outperform your physical machine.
> Something seems weird there. But I concur that the bulk of the time is
> spent in the marc2bre.pl / direct_ingest.pl / pg_loader.pl processes.
>
Hi Dan,

More RAM does not guarantee a faster load. I probably have a slower
hard disk than yours. Those scripts do not seem to use much memory
(only about 50 MB).

Although the two may not be directly comparable, MarcEdit converts my
test MARC file to MARCXML in 3 seconds, while marc2bre.pl takes about
10 minutes.
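
To put the two runs in this thread side by side, here is a rough
back-of-the-envelope sketch in Python. It only uses figures already
quoted above (12,343 records in about 1.5 hours, 14,449 records in
51m7.623s, and the 5 million bibs mentioned earlier). The function names
are made up for illustration, and the projection assumes throughput
scales linearly with record count, which may well not hold for a real
load.

```python
import re


def parse_time(value: str) -> float:
    """Convert a shell `time` value such as '51m7.623s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def records_per_second(records: int, seconds: float) -> float:
    """Average throughput over a whole import run."""
    return records / seconds


def projected_hours(total_records: int, rate: float) -> float:
    """Hours needed for total_records at a constant rate (an assumption)."""
    return total_records / rate / 3600


# Figures quoted in this thread; the 1.5 hours is an approximate value.
runs = {
    "Voyager export (approx. 1.5 hours)": (12_343, 1.5 * 3600),
    "Gutenberg records (real 51m7.623s)": (14_449, parse_time("51m7.623s")),
}

for label, (records, seconds) in runs.items():
    rate = records_per_second(records, seconds)
    hours = projected_hours(5_000_000, rate)
    print(f"{label}: {rate:.2f} records/s, "
          f"5 million records in roughly {hours:.0f} hours")
```

The numbers are only as good as the linear-scaling assumption, but they
make it easy to swap in new timings as the import scripts are tuned.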

Jason

Lakehead University


