[OPEN-ILS-DEV] Issues with direct_ingest.pl

Warren Layton warren.layton at gmail.com
Wed Nov 19 21:15:50 EST 2008


On Mon, Nov 17, 2008 at 9:46 PM, Warren Layton <warren.layton at gmail.com> wrote:
> What I suspect is happening is that direct_ingest.pl rejects records
> that have an accented character between square brackets ("[" and "]")
> in a field. For example, a record with the following 260 subfield will
> be rejected:
>
>  <subfield code=\"b\">[Bibliothe&#x300;que nationale du Canada],</subfield>
>
> However, if I remove _either_ the square brackets _or_ the "&#x300;",
> the record will be successfully processed.


Just a quick follow-up to this problem. It also occurs in two other scenarios:
1) The opening and closing square brackets can be spread over
multiple subfields.
2) The problem also occurs if the accented character/diacritic is
placed between two escaped double-quotes (\"). For example, a record
containing the following subfield will produce the same error:
   <subfield code=\"b\">\"Syste&#x300;mes solaires\", </subfield>
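To find affected records before running an import, something like the following rough Python sketch could flag subfield text that matches the failing pattern. It only detects the pattern described above (a combining diacritic such as &#x300; inside square brackets or escaped double-quotes within a single subfield); it does not cover scenario 1, and it says nothing about the underlying cause. The function names are made up for illustration and are not part of Evergreen.

```python
import html
import re
import unicodedata

# Text delimited by [ ... ] or by escaped double-quotes \" ... \"
DELIMITED = re.compile(r'\[([^\]]*)\]|\\"(.*?)\\"')


def has_combining_mark(text):
    """True if text contains a combining character such as U+0300."""
    return any(unicodedata.combining(ch) for ch in text)


def suspicious_spans(subfield_text):
    """Return delimited spans that contain a combining diacritic.

    Hypothetical helper: takes the raw text of one subfield, with
    entities like &#x300; still encoded, and decodes them first.
    """
    decoded = html.unescape(subfield_text)
    spans = []
    for match in DELIMITED.finditer(decoded):
        inner = match.group(1) if match.group(1) is not None else match.group(2)
        if has_combining_mark(inner):
            spans.append(match.group(0))
    return spans


if __name__ == "__main__":
    samples = [
        "[Bibliothe&#x300;que nationale du Canada],",
        r"\"Syste&#x300;mes solaires\", ",
        "Bibliothe&#x300;que nationale du Canada,",
        "[Bibliotheque nationale du Canada],",
    ]
    for sample in samples:
        print(bool(suspicious_spans(sample)), sample)
```

The first two samples are the failing subfields from this thread; the last two are the same text with either the brackets or the diacritic removed, which is the combination that gets processed successfully.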

I have traced the execution of the script through
/openils/lib/perl5/OpenILS/Application/Ingest.pm
and the script is dying in "sub biblio_fingerprint" (API name:
open-ils.ingest.fingerprint.xml). Specifically, the problem seems to
occur in "biblio_fingerprint.js". I suspect that a regular expression
there is getting tripped up.

Cheers,
  Warren

