[OPEN-ILS-GENERAL] Import issues
Dibyendra Hyoju
dibyendra at gmail.com
Sat Aug 8 06:51:37 EDT 2009
Hello all,
I imported the sample records again on another machine running Evergreen
1.4. After importing the first file, I could not import records from any
other MARC file. I have attached the files '8500.mrc' and '8400.mrc'. I
first executed the SQL generated from '8500.mrc' successfully, but I could
not execute the SQL generated from '8400.mrc'. The error is the same as
before: 'ERROR: duplicate key violates unique constraint
"biblio_record_unique_tcn"'. This time I used only the option
"--encoding UTF8". I tried a few other files and got the same error. A few
of the tested records are attached in case somebody is willing to test
them. If anyone has faced this problem before and found a solution, please
help. Any help will be highly appreciated.
Thank you very much.
With kind regards,
Dibyendra
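
One way to narrow down a duplicate-key error like the one above is to check whether the MARC files share control numbers. The sketch below is only a diagnostic aid and is not part of the Evergreen import scripts. It assumes the files are standard ISO 2709 MARC, and it assumes (the thread does not confirm this) that the TCN is taken from the 001 field. The function names `control_numbers` and `duplicated_control_numbers` are made up for illustration.

```python
from collections import Counter

RECORD_TERMINATOR = b"\x1d"
FIELD_TERMINATOR = b"\x1e"


def control_numbers(raw):
    """Yield the 001 control number (or None) for each ISO 2709 record in raw bytes."""
    for record in raw.split(RECORD_TERMINATOR):
        record = record.lstrip(b"\r\n")
        if len(record) < 24:
            continue  # trailing bytes after the last record, not a record
        base = int(record[12:17])        # leader 12-16: base address of the data
        directory = record[24:base - 1]  # 12-byte entries: tag(3) length(4) start(5)
        found = None
        for i in range(0, len(directory) - 11, 12):
            entry = directory[i:i + 12]
            if entry[0:3] == b"001":
                length, start = int(entry[3:7]), int(entry[7:12])
                field = record[base + start:base + start + length]
                found = field.rstrip(FIELD_TERMINATOR).decode("utf-8", "replace").strip()
                break
        yield found


def duplicated_control_numbers(*blobs):
    """Return the sorted control numbers that occur more than once across all blobs."""
    counts = Counter()
    for raw in blobs:
        counts.update(n for n in control_numbers(raw) if n)
    return sorted(n for n, c in counts.items() if c > 1)


if __name__ == "__main__":
    import sys
    # Assumed usage: pass the MARC file names as command-line arguments.
    blobs = [open(path, "rb").read() for path in sys.argv[1:]]
    print(duplicated_control_numbers(*blobs))
```

Saved under a hypothetical name such as check_dupes.py, it could be run as `python check_dupes.py 8500.mrc 8400.mrc`. An empty list would suggest the collision comes from somewhere other than the 001 fields.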
On Fri, Aug 7, 2009 at 9:35 PM, Dibyendra Hyoju <dibyendra at gmail.com> wrote:
> Hi Dan,
>
> Many thanks for your kind response and for testing our library's sample
> records, which I greatly appreciate. I followed your instructions and
> applied the options '--encoding UTF8 --idfield 852 --idsubfield x' to a new
> set of records, and I was happy to see that all 97 records from the file
> were imported! After the successful import, I applied the same options to
> different sets of records. But after the first successful SQL execution,
> every other generated SQL file failed with an error like
> "psql:/root/8400.sql:92: ERROR: duplicate key violates unique constraint
> "biblio_record_unique_tcn"...". I thought the options '--idfield 852
> --idsubfield x' generated keys that caused the duplicate-key violation, so
> I also tried without those options, using only '--encoding UTF8', but that
> didn't work either.
>
> And yes, I also tested '8000.mrc' again, and only 40 records were parsed.
> Those records were not imported either, because of the same SQL error as
> above. I tried several times but failed to execute the SQL file. I will
> validate the MARC records and try importing again. I will let you know the
> result soon.
>
> I executed the following commands to import the bibliographic records in
> '8400.mrc', and repeated them for each other file, substituting its file
> name:
>
> #perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen \
>     --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x \
>     8400.mrc > ~/8400.bre
>
> OR
>
> #perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen \
>     --db_name evergreen --encoding UTF8 8400.mrc > ~/8400.bre
>
> #perl direct_ingest.pl ~/8400.bre > ~/8400.ingest
> #perl pg_loader.pl -or bre -or mrd -or mfr -or mtfe -or mafe -or msfe \
>     -or mkfe -or msefe -a mrd -a mfr -a mtfe -a mafe -a msfe -a mkfe \
>     -a msefe --output=8400.sql < ~/8400.ingest
> #cp 8400.sql ~
> #psql -U evergreen evergreen
> #\i ~/8400.sql
> #\i /home/opensrf/Evergreen-ILS-1.4.0.4/Open-ILS/src/extras/import/quick_metarecord_map.sql
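
Since the same three conversion steps are repeated for every file, they could be generated from the file's base name. The sketch below only builds and prints the command lines; it uses no flags beyond those quoted above. The names `import_steps` and `render` are illustrative, and all paths are assumed to be relative to the current directory rather than the home directory.

```python
import shlex

# Classes passed with -or and -a in the pg_loader.pl command quoted above.
OUTPUT_CLASSES = ["bre", "mrd", "mfr", "mtfe", "mafe", "msfe", "mkfe", "msefe"]
AUTO_CLASSES = OUTPUT_CLASSES[1:]  # every class except bre


def import_steps(stem, encoding="UTF8"):
    """Return (argv, stdin_file, stdout_file) tuples for one MARC file."""
    marc2bre = [
        "perl", "marc2bre.pl",
        "--db_user", "postgres", "--db_host", "localhost",
        "--db_pw", "evergreen", "--db_name", "evergreen",
        "--encoding", encoding, f"{stem}.mrc",
    ]
    loader = ["perl", "pg_loader.pl"]
    for cls in OUTPUT_CLASSES:
        loader += ["-or", cls]
    for cls in AUTO_CLASSES:
        loader += ["-a", cls]
    loader.append(f"--output={stem}.sql")
    return [
        (marc2bre, None, f"{stem}.bre"),
        (["perl", "direct_ingest.pl", f"{stem}.bre"], None, f"{stem}.ingest"),
        (loader, f"{stem}.ingest", None),
    ]


def render(step):
    """Format one step as a shell command line with its redirections."""
    argv, stdin_file, stdout_file = step
    line = shlex.join(argv)
    if stdin_file:
        line += " < " + shlex.quote(stdin_file)
    if stdout_file:
        line += " > " + shlex.quote(stdout_file)
    return line


if __name__ == "__main__":
    for step in import_steps("8400"):
        print(render(step))
```

Printing the commands first makes it easy to compare them with what was typed by hand before running anything against the database.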
>
> I have attached the error output, as it was quite long.
>
> I look forward to hearing from you.
>
> Thank you.
>
> --
> Dibyendra Hyoju
> Madan Puraskar Pustakalaya
> Lalitpur, Nepal
>
> On Fri, Aug 7, 2009 at 9:45 AM, Dan Scott <denials at gmail.com> wrote:
>
>> Hi Dibyendra:
>>
>> First, I think you'll want to tell marc2bre.pl that you're working
>> with records encoded in UTF8, or it will try to convert them from
>> MARC8 to UTF8 (and that will really mess up your records). Just add
>> the "--encoding UTF8" flag to your marc2bre.pl options to do that. You
>> may also want to add "--idfield 852 --idsubfield x" if you want to
>> force your records in Evergreen to have record IDs that match your
>> accession numbers in the old system - just a friendly suggestion.
>>
>> When I ran marc2bre.pl / direct_ingest.pl against your 8000.mrc file,
>> I only had 40 records out of the original 100 make it through the
>> ingest stage. It looks like you're running into a limitation of the
>> fingerprint algorithm in /openils/var/catalog/biblio_fingerprint.js
>> (which, as a gross simplification, largely tries to concatenate author
>> + title together to identify works) -- the fingerprint algorithm
>> doesn't understand anything except for ASCII; all non-ASCII characters
>> end up being thrown away for the purposes of the fingerprint. If the
>> first character of the author or title subfield that the algorithm
>> finds is not plain ASCII, it generates no fingerprint, and the
>> record can't be ingested.
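
The behaviour described above can be approximated in a few lines of Python, which may help pre-screen records before ingest. This is a rough illustration of the description only, not the actual logic of biblio_fingerprint.js. The name `sketch_fingerprint` and its normalization (lowercasing, keeping only letters and digits) are assumptions.

```python
def ascii_only(text):
    """Drop every non-ASCII character, as the fingerprint is described as doing."""
    return "".join(ch for ch in text if ord(ch) < 128)


def sketch_fingerprint(author, title):
    """Return a crude author+title key, or None when no fingerprint would result."""
    for value in (author, title):
        # A non-ASCII first character is described as yielding no fingerprint.
        if value and ord(value[0]) > 127:
            return None
    key = ascii_only(author + title).lower()
    key = "".join(ch for ch in key if ch.isalnum())
    return key or None


if __name__ == "__main__":
    # A romanized title survives (minus its diacritics); a title that starts
    # in a non-Latin script yields None.
    print(sketch_fingerprint("Sharma", "Bh\u0101rgav"))
    print(sketch_fingerprint("Sharma", "\u0928\u0947\u092a\u093e\u0932"))
```

Records for which such a check returns None would be the ones to look at first when fewer records come out of the ingest stage than went in.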
>>
>> I don't have a quick fix for you, unfortunately. It's a significant
>> problem.
>>
>> Dan
>>
>> 2009/8/6 Dibyendra Hyoju <dibyendra at gmail.com>:
>> > Hello everyone,
>> >
>> > Today, I again tried to import a new set of parallel records, which I
>> > thought would import without errors. But I repeatedly get messages like
>> > 'Use of uninitialized value $value in substitution (s///) at marc2bre.pl
>> > line 412.' while executing 'perl marc2bre.pl --db_user evergreen
>> > --db_host localhost --db_pw evergreen --db_name evergreen 5300.mrc >
>> > ~/5300.bre'.
>> >
>> > Ignoring the message, I went on to convert the bre file into an ingest
>> > file, but I got an error like 'Couldn't process record...'. I've attached
>> > the error message, as it was very long, along with a record I had tried.
>> > If anyone has faced similar problems and already found a solution, please
>> > suggest it. I would be very thankful.
>> >
>> > Any help will be appreciated.
>> >
>> > Thank you.
>> >
>> > With kind regards,
>> > Dibyendra
>> >
>> >
>> > On Wed, Aug 5, 2009 at 5:06 PM, Dibyendra Hyoju <dibyendra at gmail.com>
>> wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I have finished installing Evergreen 1.4 on Debian lenny and am trying
>> >> to import MARC records. I successfully imported the Gutenberg records as
>> >> shown in "Importing bibliographic records", but I couldn't do the same
>> >> for our library's records, which have both local script and roman
>> >> transliteration. I get lots of similar warnings while converting the
>> >> MARC records into Evergreen BRE JSON format, for example 'no mapping
>> >> found for [0xCC] at position 3 in Bhārgav Bhūshaṇ Presa)
>> >> g0=ASCII_DEFAULT g1=EXTENDED_LATIN at /usr/share/perl5/MARC/Charset.pm
>> >> line 210.' while executing 'perl marc2bre.pl --db_user evergreen
>> >> --db_host localhost --db_pw evergreen --db_name evergreen 8500.mrc >
>> >> ~/8500.bre'. It does generate an Open-ILS JSON ingest file after
>> >> executing 'perl direct_ingest.pl ~/8000.bre > ~/8000.ingest', but it
>> >> couldn't generate a SQL file after executing 'perl pg_loader.pl -or bre
>> >> -or mrd -or mfr -or mtfe -or mafe -or msfe -or mkfe -or msefe -a mrd -a
>> >> mfr -a mtfe -a mafe -a msfe -a mkfe -a msefe --output=~/8500.sql <
>> >> ~/8500.ingest'. I followed the same process for the Gutenberg records
>> >> and they were imported successfully.
>> >>
>> >> I've also attached a file containing some of our library's records, in
>> >> case somebody is interested in importing them for testing. If they can
>> >> be imported by any other method, please let me know. Any help will be
>> >> highly appreciated.
>> >>
>> >> Thank you.
>> >>
>> >> --
>> >> Dibyendra Hyoju
>> >> Madan Puraskar Pustakalaya
>> >> Lalitpur, Nepal
>> >>
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Dibyendra
>> >
>>
>>
>>
>> --
>> Dan Scott
>> Laurentian University
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8500.mrc
Type: application/octet-stream
Size: 135225 bytes
Desc: not available
Url : http://libmail.georgialibraries.org/pipermail/open-ils-general/attachments/20090808/f4eee76c/attachment-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8400.mrc
Type: application/octet-stream
Size: 118720 bytes
Desc: not available
Url : http://libmail.georgialibraries.org/pipermail/open-ils-general/attachments/20090808/f4eee76c/attachment-0003.obj