[OPEN-ILS-GENERAL] Import issues

Dan Scott denials at gmail.com
Thu Aug 13 23:19:44 EDT 2009


A quick peek suggests that the duplicate TCN values are "NEW" and "i",
after which they resort to just s + record ID (for example, "s8602").

This seems strange to me: it looks like the --tcnfield and
--tcnsubfield options are being completely ignored, as "NEW" is found
in the 001 field of each of your records, and "i" comes from the 020
field (which doesn't have an "a" subfield, so no number gets
assigned). This is the behaviour that results when a TCN value isn't
found; however, all of the records you previously sent most definitely
have a value in 852$x (and in fact the ID field is being correctly set
to that numeric value). When I ran the marc2bre/direct_ingest/pg_loader
process, I saw TCNs with values like "Accession number: 8600", as you
would expect from your records. Very strange; it's as if the BRE files
came from a previous run of marc2bre.pl that didn't have the
--tcnfield/--tcnsubfield options.

Ah well. One way around this would be to create a "used TCN file",
which is just a text file containing the following lines:

NEW
i

and then point marc2bre.pl at it using the --used_tcn_file option;
for example, "--used_tcn_file tcns.txt". This will force marc2bre.pl to
use the s + record ID approach to deriving a TCN.
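
Put together, a rough sketch of those two steps might look like the
following. I haven't run this against your data; it assumes you are in
the import directory, that 8400.mrc is the file being converted, and
that the database settings are the same ones you used earlier. The
"if" guard is only there so the snippet does nothing when those files
are missing.

```shell
# Write the two duplicated TCN values, one per line, to the used TCN file.
printf 'NEW\ni\n' > tcns.txt

# Re-run marc2bre.pl with the same options as before, plus --used_tcn_file.
# Assumption: marc2bre.pl and 8400.mrc are in the current directory.
if [ -f marc2bre.pl ] && [ -f 8400.mrc ]; then
    perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen \
        --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x \
        --tcnfield 852 --tcnsubfield x --used_tcn_file tcns.txt \
        8400.mrc > 8400.bre
fi
```

You would then run direct_ingest.pl and pg_loader.pl on the new BRE
file exactly as you did before.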

Dan

2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
> Hi Dan,
>
> Thank you for your prompt response.
>
> Please find the attachment herewith. It includes the outputs for
> three MARC files containing 250-300 bibliographic records.
>
> Thank you.
>
> With kind regards,
> Dibyendra
>
> On Thu, Aug 13, 2009 at 11:11 PM, Dan Scott <denials at gmail.com> wrote:
>>
>> Hi Dibyendra:
>>
>> Can you please attach some zipped output (BRE, ingest, and SQL) files
>> as well as the errors to help us work out where the duplication is
>> occurring?
>>
>> Dan
>>
>> 2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
>> > Hello Dan,
>> >
>> > Thanks for your prompt response. I tried those options with marc2bre.pl,
>> > and this time it produced no errors. However, the duplication problem
>> > still persists when executing the SQL. I rebuilt the database several
>> > times and ran marc2bre.pl with all of the options you suggested.
>> >
>> > I ran the following commands during the import process, and they
>> > produced SQL for each MARC file:
>> >
>> > #perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen \
>> >     --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x \
>> >     --tcnfield 852 --tcnsubfield x 8500.mrc > 8500.bre
>> > #perl direct_ingest.pl ~/8500.bre > ~/8500.ingest
>> > #perl pg_loader.pl -or bre -or mrd -or mfr -or mtfe -or mafe -or msfe \
>> >     -or mkfe -or msefe -a mrd -a mfr -a mtfe -a mafe -a msfe -a mkfe \
>> >     -a msefe --output=8500.sql < ~/8500.ingest
>> > #cp 8500.sql ~
>> > #psql -U evergreen evergreen
>> > evergreen=# \i ~/8500.sql
>> >
>> > After rebuilding the database, the first generated SQL file I run
>> > executes successfully. But after that first successful run, no other
>> > generated SQL file will execute. I've attached the SQL errors
>> > herewith.
>> >
>> > I look forward to hearing from you.
>> >
>> > Thank you.
>> >
>> > --
>> > Dibyendra Hyoju
>> > Madan Puraskar Pustakalaya
>> > Patan Dhoka, Lalitpur
>> > Nepal
>> >
>> > On Thu, Aug 13, 2009 at 12:14 PM, Dan Scott <denials at gmail.com> wrote:
>> >>
>> >> Sorry Dibyendra, those options should have been --tcnfield and
>> >> --tcnsubfield
>> >>
>> >> Try those, they should work in 1.4.
>> >>
>> >> Dan
>> >>
>> >> 2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
>> >> > Hello Dan,
>> >> >
>> >> > Thank you very much for your response.
>> >> >
>> >> > I used the options 'tcn_field' and 'tcn_subfield', but they are not
>> >> > recognized by marc2bre.pl. I executed marc2bre.pl in the following
>> >> > way, and the output is as follows:
>> >> >
>> >> >
>> >> >
>> >> > dibyendra-laptop:/home/opensrf/Evergreen-ILS-1.4.0.4/Open-ILS/src/extras/import#
>> >> > perl marc2bre.pl --db_user postgres --db_host localhost --db_pw
>> >> > evergreen --db_name evergreen --encoding UTF8 --idfield 852
>> >> > --idsubfield x --tcn_field 852 --tcn_subfield x 8500.mrc > 8500.bre
>> >> > Unknown option: tcn_field
>> >> > Unknown option: tcn_subfield
>> >> >
>> >> > I guess the options 'tcn_field' and 'tcn_subfield' are not
>> >> > implemented in EG 1.4 yet. Is that so? I couldn't find EG 1.6 on the
>> >> > download page. We're planning to migrate our 10,000 validated MARC
>> >> > records into Evergreen.
>> >> >
>> >> > Thank you.
>> >> >
>> >> > --
>> >> > Dibyendra Hyoju
>> >> > Madan Puraskar Pustakalaya
>> >> > Patan Dhoka, Lalitpur
>> >> > Nepal
>> >> >
>> >> > On Tue, Aug 11, 2009 at 10:15 AM, Dan Scott <denials at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> 2009/8/8 Dibyendra Hyoju <dibyendra at gmail.com>:
>> >> >> > Hello all,
>> >> >> > I imported the sample records again on another machine running
>> >> >> > Evergreen 1.4, and after importing them, I couldn't import any
>> >> >> > records from other MARC files. I have attached the files
>> >> >> > '8500.mrc' and '8400.mrc' herewith. I first executed the SQL
>> >> >> > generated from '8500.mrc' successfully. Then, I couldn't execute
>> >> >> > the SQL generated from '8400.mrc'. The error is the same as
>> >> >> > before: 'ERROR:  duplicate key violates unique constraint
>> >> >> > "biblio_record_unique_tcn"'. I only used the option "--encoding
>> >> >> > UTF8" this time. I tried a few other records, but I got the same
>> >> >> > error. A few of the tested records are attached herewith in case
>> >> >> > somebody wants to volunteer to test. If anyone has faced this
>> >> >> > problem before and has found the solution, please help. Any help
>> >> >> > will be highly appreciated.
>> >> >>
>> >> >> Sorry for the delayed reply, I'm on leave at the moment and not
>> >> >> connected very often.
>> >> >>
>> >> >> marc2bre.pl doesn't really deal with automatically generated TCNs
>> >> >> all that well, so you're best off explicitly identifying a source
>> >> >> for the TCN. To avoid getting duplicate TCN values and record ID
>> >> >> values if all of your records follow the same pattern with the
>> >> >> 852 $x field / subfield accession number identifier, you should
>> >> >> also use the --tcn_field / --tcn_subfield options for marc2bre.pl:
>> >> >>
>> >> >> perl marc2bre.pl --db_user postgres --db_host localhost --db_pw
>> >> >> evergreen --db_name evergreen --encoding UTF8 --idfield 852
>> >> >> --idsubfield x --tcn_field 852 --tcn_subfield x 8400.mrc > 8400.bre
>> >> >>
>> >> >> I tried importing both your 8000.mrc and 8400.mrc files with these
>> >> >> options, and they worked fine for me with no duplicate value
>> >> >> warnings. This is with Evergreen rel_1_6_0 on Ubuntu 8.04, but
>> >> >> marc2bre.pl (the most critical script for parsing out TCN and
>> >> >> record number) shows no significant differences between 1.4 and
>> >> >> rel_1_6_0.
>> >> >>
>> >> >> --
>> >> >> Dan Scott
>> >> >> Laurentian University

-- 
Dan Scott
Laurentian University

