[OPEN-ILS-DEV] Re: script to break up big marc load to avoid cataloging conflict

Melissa Belvadi mbelvadi at upei.ca
Thu Apr 8 16:07:14 EDT 2010


Heehee, I'm not literally loading the *same* set of records every time!
I'm re-downloading the complete set of a vendor's records, which may
include additions, changes to existing records (if they fixed some bad
MARC data, for instance), and even deletions.
It seems to me much more efficient to batch remove and reload such a
set than to try to cope with multiple separate "records added",
"records changed", and "records deleted" sets every month, even from
the vendors who supply such "change" lists, and not all of them do.
"Just replace the biblio.record_entry.marc" is easy to say, but getting
from that binary MARC set to a 100% perfect mapping onto the existing
MARCXML records is much more complicated than writing the script I want
to write, and it would itself be a sequence of steps I'd want to
automate for repeat use, so I'd be back to scripting work anyway.

And it's not just ebooks, actually, and not just reloads - I was just
using that as a quick example to provide context for the question.
I'm trying to work out a generalized solution that will also work the
next time we get an entirely new huge set of MARC records from
somewhere, without shutting down cataloging and reserves' work.


So, back to my original question: any idea whether such a batch-script
sequence of steps would be feasible? Or will Vandelay actually solve my
main problem for me?
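For what it's worth, the splitting step of such a batch script is easy to sketch. Below is a minimal, standard-library-only Python example that splits a concatenated binary MARC21 file into batches by reading each record's 5-digit leader length. The batch size and function names are illustrative assumptions, not anything Evergreen-specific:

```python
def split_marc(data: bytes):
    """Yield individual binary MARC records from a concatenated stream.

    Each MARC21 record begins with a 5-digit ASCII record length
    (leader bytes 0-4) and ends with the record terminator 0x1D,
    which is included in the stated length.
    """
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 5])
        yield data[pos:pos + length]
        pos += length


def chunk_records(records, size):
    """Group records into batches of `size` bytes-blobs, each of which
    could be written out as its own smaller load file."""
    batch = []
    for rec in records:
        batch.append(rec)
        if len(batch) == size:
            yield b"".join(batch)
            batch = []
    if batch:
        yield b"".join(batch)
```

Each yielded chunk is itself valid binary MARC, so the chunks could be loaded one at a time with pauses in between, letting cataloging work proceed between batches.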

Melissa



---
Melissa Belvadi
Emerging Technologies & Metadata Librarian
University of Prince Edward Island
mbelvadi at upei.ca
902-566-0581 



>>> On 4/7/2010 at 02:08 PM, Dan Scott <dan at coffeecode.net> wrote: 

> On Wed, 2010-04-07 at 13:23 -0300, Melissa Belvadi wrote:
>> Hi,
>> 
>> We're faced with wanting to reload a set of about 55,000 ebook
>> records on a probably quarterly basis. I've got the whole source
>> thing figured out to handle the problems of adding the new/removing
>> the old records, but am trying to avoid the ritual of having to ask
>> cataloging and reserves to stop adding bib records for multi-hour
>> stretches while this loads (because of the bib record id conflict
>> problem), and we can't run it overnight because of conflicts with
>> other EG things that run overnight.
> 
> Can you back up a step or two and explain why you're reloading these
> records on a quarterly basis?
> 
> My initial thought is that you should just replace the
> biblio.record_entry.marc field in each existing record, then
> reingest them - that way you wouldn't be creating new bib record
> IDs and wouldn't have to stop cataloguing and reserves.
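For readers following the thread: the in-place replacement suggested in the quote above amounts to one UPDATE per record. A minimal sketch, assuming incoming records have already been matched to existing bib IDs upstream (the hard mapping step Melissa describes); the table and column names come from the thread, but the helper itself is hypothetical:

```python
def build_replace_statements(records):
    """Yield (sql, params) pairs that overwrite the stored MARC for
    existing bib records instead of creating new ones.

    `records` is an iterable of (record_id, marcxml) tuples. The
    parameterized SQL targets the Evergreen column discussed in the
    thread, biblio.record_entry.marc.
    """
    sql = "UPDATE biblio.record_entry SET marc = %s WHERE id = %s"
    for rec_id, marcxml in records:
        yield sql, (marcxml, rec_id)
```

The pairs could be fed to any DB-API driver's execute(). The subsequent reingest step varies by Evergreen version, so it is deliberately left out of this sketch.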



More information about the Open-ils-dev mailing list