[OPEN-ILS-DEV] Bib that blows up the import
Frances Dean McNamara
fdmcnama at uchicago.edu
Thu Jul 3 14:21:45 EDT 2008
We are using yaz to convert (we already have a setup that uses it for our weekly AquaBrowser dumps of the db, so they used that). So this happened when the import was running with the xml parameter on a yaz-converted file; I then reproduced the problem with a straight MARC file using the Perl conversion.
I'll ask Dale to look at your yaz command line as opposed to the one we have been using. Thanks.
I guess what we have discovered is that we may have to spend some time on a custom bib conversion program if we go with this, as all sorts of interesting issues may show up in such a big file. It turns out the process skips that record and goes on, but I don't think it writes an error, which we would need.
That was LC cataloging, so apparently they do sometimes add a 500 with no subfield code. The problem looks like it happens when the subfield delimiter and code are missing AND the text starts with a quotation mark. We won't try to fix it right now, just note it as an issue.
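Since the load skips such a record without logging anything, a pre-import check could list the suspect fields first. The following is only a minimal sketch in Python, not the Perl importer itself. It assumes standard ISO 2709 framing (24-byte leader, 12-byte directory entries, 0x1F subfield delimiter, 0x1D record terminator), and the function names are hypothetical, not part of Evergreen, MARC::Record, or yaz. It flags a data field when the indicators are not followed by a subfield delimiter, or when the character after the delimiter is not a letter or digit.

```python
# Sketch only: these helper names are hypothetical and are not part of
# Evergreen, MARC::Record, or yaz. Assumes standard ISO 2709 framing.

SUBFIELD_DELIM = b"\x1f"  # precedes every subfield code
RECORD_TERM = b"\x1d"     # ends every record


def split_records(blob):
    """Yield each raw record from a file of concatenated MARC records."""
    start = 0
    while start < len(blob):
        end = blob.find(RECORD_TERM, start)
        if end == -1:
            break  # trailing bytes without a record terminator are ignored
        yield blob[start:end + 1]
        start = end + 1


def suspect_fields(record):
    """Return (tag, reason) pairs for data fields with odd subfield coding."""
    base = int(record[12:17])        # leader bytes 12-16: base address of data
    directory = record[24:base - 1]  # 12-byte entries, minus the terminator
    problems = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        offset = int(entry[7:12])
        if tag < "010":
            continue  # control fields (00X) carry no indicators or subfields
        field = record[base + offset:base + offset + length]
        body = field[2:]  # skip the two indicator bytes
        if not body.startswith(SUBFIELD_DELIM):
            problems.append((tag, "no subfield delimiter after indicators"))
        elif not body[1:2].isalnum():
            problems.append((tag, "subfield code is not a letter or digit"))
    return problems


def report(blob):
    """Return one log line per suspect field, numbered by record position."""
    lines = []
    for number, record in enumerate(split_records(blob), start=1):
        for tag, reason in suspect_fields(record):
            lines.append("record %d, field %s: %s" % (number, tag, reason))
    return lines
```

To try it, read the export in binary mode (for example `open(path, "rb").read()`) and print whatever `report()` returns; an empty list means neither check found anything.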
Frances
-----Original Message-----
From: open-ils-dev-bounces at list.georgialibraries.org [mailto:open-ils-dev-bounces at list.georgialibraries.org] On Behalf Of Dan Scott
Sent: Thursday, July 03, 2008 1:12 PM
To: Evergreen Development Discussion List
Subject: Re: [OPEN-ILS-DEV] Bib that blows up the import
Hi Frances:
If you import straight MARC21 format records, then the importer relies
on Perl's MARC::Record and MARC::File::XML modules to convert the
record to MARC21XML format. You might have turned up a bug (or perhaps
we could call it a strictness feature) in one of those modules.
However, I would suggest using yaz-marcdump to convert the files to
MARC21XML (and from MARC8 to UTF8 encoding) first, before importing.
yaz-marcdump will be much faster and generally more capable of
handling less-than-savoury MARC records like your example.
In the case of your example record, I ran the following command with
yaz-marcdump version 2.1.56 (a relatively ancient version) to convert
it to the attached MARC21XML record. Note that the conversion
automatically supplied the subfield 'a' for that problem 500 field:
bash$ yaz-marcdump -f marc8 -t utf8 -i marc -o marcxml -l9=97 \
    ~/Documents/Downloads/1567233.mrc > 1567233.xml
Dan
2008/7/3 Frances Dean McNamara <fdmcnama at uchicago.edu>:
> Testing importing bibs we ran into a snag. The attached bib record failed.
>
> It has a 500 field that lacks a subfield a, and that field also starts with
> a quote:
>
> :    01721nam 2200505 a 4500
> 005: 19940513000000.0
> 008: 940208s1993 onca b f000 0 eng d
> 009: b81
> 010: $a cn 93099386
> 015: $a C93-99386-1
> 020: $a 0660142570
> 035: $a (ICU)BID18160662
> 035: $a (OCoLC)28672884
> 040: $a CaOOS $b eng $c NLC $d ICU
> 041: 0 $a eng
> 043: $a n-cn---
> 055: 0 $a COP.C.CS92-311E
> 055: 2 $a HA741.5*
> 082: 0 $a 304.6/0971 $2 20
> 086: 1 $a DSS Cat. no. CS92-311E
> 245: 00 $a 1991 census geography : $b a historical comparison / $c Statistics Canada.
> 260: $a Ottawa : $b Statistics Canada, $c 1993.
> 300: $a ii, 51 p. : $b ill. ; $c 28 cm.
> 490: 1 $a Geographic reference
> 500: $a Issued also in French under title: Géographie du recensement de 1991, comparaison historique.
> 500: $a "91 census"--Cover.
> 500: $" August 1993"
> 500: $a "Catalogue No. 92-311 E".
> 504: $a Includes bibliographical references: p. 40-44.
> 650: 0 $a Census districts $z Canada.
> 650: 0 $a Metropolitan areas $z Canada.
> 650: 0 $a Election districts $z Canada.
> 650: 0 $a Population density $z Canada.
> 650: 6 $a Districts de recensement $z Canada.
> 650: 6 $a Agglomérations urbaines $z Canada.
> 650: 6 $a Circonscriptions électorales $z Canada.
> 651: 0 $a Canada $x Census, 1991.
> 651: 6 $a Canada $x Population $x Densité.
> 651: 6 $a Canada $x Recensement, 1991.
> 710: 20 $a Statistics Canada
> 830: 0 $a Geographic reference (Canada. Statistics Canada)
> 900: $a ICU:94219272 $b OST:70 $c HST:500 $d Copy:1
> 920: $a 19940512 $b mak/ub
> 923: $a 061694 $b OCLC
>
> I exported the bib and stripped the 500s out and then it would load.
>
> Exactly how fussy is the load program about coding? We have a lot of
> records from a lot of sources and some of them may have errors like this.
>
> Frances McNamara
>
> University of Chicago
--
Dan Scott
Laurentian University