[OPEN-ILS-DEV] direct_ingest.pl, biblio_fingerprint.js and Unicode chars

Dan Scott dan at coffeecode.net
Sun Dec 13 16:12:23 EST 2009


On Fri, 2009-12-11 at 11:18 -0500, Warren Layton wrote:
> On Tue, Dec 1, 2009 at 10:45 AM, Dan Scott <dan at coffeecode.net> wrote:
> >> If not, and if rolling the try/catch blocks of the script into the
> >> Perl Ingest function is fine, I can go ahead with that and post a
> >> patch here soon.
> >
> > I'd love to see that patch.
> 
> Hello again (and sorry for the delay).
> 
> I'm attaching my patch to include the biblio_fingerprint code directly
> in Ingest.pm. As suggested by Dan, I've included a
> "legacy_script_support" option in opensrf.xml that lets you run the
> old biblio_fingerprint.js script instead.
> 
> A few notes:
> * The Perl code is structured very similarly to the JavaScript code
> (lots of try/catch blocks). There may be a better way to write it...
> * For the quality value, I left most of the increments in (length of
> datafields, matching certain values of the 039 field, etc). The
> exception is the quality bump for language. For some reason that I
> didn't understand, the old script incremented the quality value for
> English records only. If that's still needed, it can be added to the
> Perl code.
> * The other script called by Ingest.pm, biblio_descriptor.js, will
> still be called, regardless of the legacy_script_support setting.
> 
> This is my first stab at this. It has solved the problems I was having
> with direct_ingest.pl for our records but comments and feedback are
> definitely welcome.

Thanks for this, Warren. My plan is to run the records in
Open-ILS/tests/datasets through marc2bre before and after your patch to
compare the results.

I'll time both runs, too, just out of interest.


