[OPEN-ILS-GENERAL] Import issues

Dan Scott denials at gmail.com
Fri Aug 7 00:00:03 EDT 2009

Hi Dibyendra:

First, I think you'll want to tell marc2bre.pl that you're working
with records encoded in UTF8, or it will try to convert them from
MARC8 to UTF8 (and that will really mess up your records). Just add
the "--encoding UTF8" flag to your marc2bre.pl options to do that. You
may also want to add "--idfield 852 --idsubfield x" if you want to
force your records in Evergreen to have record IDs that match your
accession numbers in the old system - just a friendly suggestion.
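
If you want to confirm what encoding your records declare before adding
the flag, position 09 of the MARC 21 leader is "a" for UCS/Unicode and
blank for MARC-8. Here is a minimal Python sketch that tallies that byte
for every record in a file. It is not part of Evergreen; the function name
and the "UTF8" / "MARC8" / "unknown" labels are my own assumptions for
illustration, and it only reports what each record claims, not whether the
data really is valid UTF8.

```python
from collections import Counter

RECORD_TERMINATOR = b"\x1d"  # MARC 21 end-of-record byte
LEADER_LENGTH = 24           # every MARC 21 record starts with a 24-byte leader


def declared_encodings(data: bytes) -> Counter:
    """Tally leader/09 across the raw MARC records in `data`."""
    counts = Counter()
    for raw in data.split(RECORD_TERMINATOR):
        raw = raw.lstrip(b"\r\n")      # tolerate line breaks between records
        if len(raw) < LEADER_LENGTH:   # empty trailing chunk or truncated record
            continue
        flag = chr(raw[9])
        if flag == "a":
            counts["UTF8"] += 1        # record declares UCS/Unicode
        elif flag == " ":
            counts["MARC8"] += 1       # record declares MARC-8
        else:
            counts["unknown"] += 1
    return counts


# Hypothetical usage against a file such as 8000.mrc:
#   with open("8000.mrc", "rb") as fh:
#       print(declared_encodings(fh.read()))
```

If every record shows up under "UTF8", passing "--encoding UTF8" is the
right call; a mix would suggest the file needs cleaning up first.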

When I ran marc2bre.pl / direct_ingest.pl against your 8000.mrc file,
I only had 40 records out of the original 100 make it through the
ingest stage. It looks like you're running into a limitation of the
fingerprint algorithm in /openils/var/catalog/biblio_fingerprint.js
(which, as a gross simplification, largely tries to concatenate author
+ title together to identify works) -- the fingerprint algorithm
doesn't understand anything except for ASCII; all non-ASCII characters
end up being thrown away for the purposes of the fingerprint. If the
first character of the author or title subfield that the algorithm
finds is not plain ASCII, it generates no fingerprint, and the
record can't be ingested.
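
To make that failure mode concrete, here is a rough Python sketch of the
behaviour described above. It is not the code in biblio_fingerprint.js:
the function name, the lower-casing, the punctuation stripping, and the
way author and title are joined are all assumptions for illustration. It
only mimics the two properties that matter here, namely that non-ASCII
characters are discarded and that a field starting with a non-ASCII
character yields no fingerprint at all.

```python
import re
from typing import Optional


def sketch_fingerprint(author: str, title: str) -> Optional[str]:
    """Illustrative ASCII-only fingerprint; returns None when none can be built."""
    parts = []
    for field in (author, title):
        # Assumed rule: a non-ASCII first character means "no fingerprint".
        if field and ord(field[0]) > 127:
            return None
        # Everything outside ASCII is silently dropped.
        ascii_only = field.encode("ascii", "ignore").decode("ascii")
        # Assumed normalization: lower-case, keep only letters and digits.
        parts.append(re.sub(r"[^a-z0-9]", "", ascii_only.lower()))
    return "".join(parts)


# sketch_fingerprint("Twain, Mark", "Huckleberry Finn") -> "twainmarkhuckleberryfinn"
# sketch_fingerprint("Bhārgav", "Title")                -> "bhrgavtitle"
# sketch_fingerprint("नेपाल", "Title")                    -> None
```

The second example shows why transliterated records can still get through
(with a degraded fingerprint), while records that lead with local script
would be rejected outright.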

I don't have a quick fix for you, unfortunately. It's a significant problem.


2009/8/6 Dibyendra Hyoju <dibyendra at gmail.com>:
> Hello everyone,
> Today, I again tried to import a new set of parallel records, which I
> thought would have no errors. But I am getting messages like 'Use of
> uninitialized value $value in substitution (s///) at marc2bre.pl line 412.'
> repeatedly while executing 'perl marc2bre.pl --db_user evergreen --db_host
> localhost --db_pw evergreen --db_name evergreen 5300.mrc > ~/5300.bre'.
> I continued converting the bre into ingest file ignoring the message. But I
> got error like 'Couldn't process record...'. I've attached the error message
> herewith as the error message was very long, and a record which I had tried
> is also attached herewith. If anyone has faced similar problems and had
> already found the solution, please suggest. I would be very much thankful.
> Any help will be appreciated.
> Thank you.
> With kind regards,
> Dibyendra
> On Wed, Aug 5, 2009 at 5:06 PM, Dibyendra Hyoju <dibyendra at gmail.com> wrote:
>> Hi all,
>> I have finished installing Evergreen 1.4 on Debian-lenny, and trying to
>> import MARC records. I have successfully imported the gutenberg records as
>> shown in the Importing bibliographic records. But, I couldn't do the same
>> for our library records which have both local script and roman
>> transliteration. I am getting lots of similar warnings while converting the
>> MARC records into Evergreen BRE JSON format. For example, I am getting 'no mapping
>> found for [0xCC] at position 3 in Bhārgav Bhūshaṇ Presa) g0=ASCII_DEFAULT
>> g1=EXTENDED_LATIN at /usr/share/perl5/MARC/Charset.pm line 210.'  while
>> executing 'perl marc2bre.pl --db_user evergreen --db_host localhost --db_pw
>> evergreen --db_name evergreen 8500.mrc > ~/8500.bre'. It does generate an
>> Open-ILS JSON ingest file after executing 'perl direct_ingest.pl ~/8000.bre >
>> ~/8000.ingest'. But it couldn't generate a SQL file after executing 'perl
>> pg_loader.pl -or bre -or mrd -or mfr -or mtfe -or mafe -or msfe -or mkfe -or
>> msefe -a mrd -a mfr -a mtfe -a mafe -a msfe -a mkfe -a msefe
>> --output=~/8500.sql < ~/8500.ingest'. I followed the same process for the
>> gutenberg records and they were successfully imported.
>> I've also attached the file containing some of our library records if
>> somebody is interested to import the records for testing. If it can be
>> imported by any other methods, please let me know. Any help will be highly
>> appreciated.
>> Thank you.
>> --
>> Dibyendra Hyoju
>> Madan Puraskar Pustakalaya
>> Lalitpur, Nepal
> --
> Dibyendra

Dan Scott
Laurentian University
