[OPEN-ILS-GENERAL] Import issues

Dibyendra Hyoju dibyendra at gmail.com
Thu Aug 20 01:38:52 EDT 2009


Hi all,

I am trying to load library records that don't have ISBNs. Dan recommended
running 'marc2bre.pl' with the option "--used_tcn_file tcns.txt" to generate
TCNs for the MARC records. The file 'tcns.txt' contains just two lines: 'NEW'
and 'i'. However, when I execute the SQL output generated from the BRE and
ingest steps, I still get the error:

ERROR:  duplicate key violates unique constraint "biblio_record_unique_tcn"

If anyone has solved this issue before, please share your knowledge. Any help
will be appreciated.
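
A minimal sketch of creating and checking such a used-TCN file from the
shell follows. The file name 'tcns.txt' and its two lines come from this
thread; whether marc2bre.pl is sensitive to stray whitespace or carriage
returns in the file is an assumption, not something confirmed here.

```shell
# Write the used-TCN file: one TCN value per line, nothing else.
printf '%s\n' 'NEW' 'i' > tcns.txt

# Sanity checks: count the lines and count any carriage returns
# (assumption: CRLF line endings could keep the values from matching).
line_count=$(wc -l < tcns.txt | tr -d ' ')
cr_count=$(tr -cd '\r' < tcns.txt | wc -c | tr -d ' ')
echo "lines=$line_count carriage_returns=$cr_count"
```

The file can then be passed to marc2bre.pl with "--used_tcn_file tcns.txt"
as in the commands quoted below.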

I plan to import our 12,000 validated MARC records into Evergreen as
soon as possible.
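
Before loading a batch of this size, it may help to look for repeated TCN
values up front rather than waiting for the unique-constraint error from
PostgreSQL. This is only a sketch: it assumes the TCN values have already
been extracted into a plain-text file, one value per line. The file names
'tcn_values.txt' and 'tcn_duplicates.txt' and the sample contents are
hypothetical, and extracting the values from the BRE output is not shown.

```shell
# Hypothetical sample: TCN values, one per line, as they might look if
# the fallback values "NEW" and "i" were reused across records.
printf '%s\n' 'NEW' 'i' 's8602' 'NEW' 'i' 's8603' > tcn_values.txt

# Sort so identical values are adjacent, then print each value that
# occurs more than once (uniq -d prints one copy per duplicated value).
LC_ALL=C sort tcn_values.txt | uniq -d > tcn_duplicates.txt
cat tcn_duplicates.txt
```

An empty 'tcn_duplicates.txt' would mean no value is repeated within the
batch itself; it says nothing about values already in the database.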

Thank you.

With kind regards,
Dibyendra

On Tue, Aug 18, 2009 at 7:49 AM, Dibyendra Hyoju <dibyendra at gmail.com> wrote:

> Hello Dan,
> Thank you very much once again for your kind response and for the detailed
> explanation, which I really appreciate. I am sorry that I couldn't get
> back to this issue promptly; I was on leave for a few days due to some
> urgent tasks.
>
> After reading your email, I understood why our records were producing
> duplicate TCNs when converted into BRE format. More than 50% of our
> library records don't have ISBNs. Currently, there are around 23,000
> records, of which we have validated only 12,000 MARC records. I plan to
> migrate all the validated MARC records into Evergreen within this week if
> the import process goes without problems.
>
> Following the alternative approach you described, converting the MARC
> records into BRE with the "--used_tcn_file" option, I ran 'marc2bre.pl' in
> the following ways:
>
> #perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen
> --db_name evergreen --encoding UTF8 --used_tcn_file tcns.txt > 8500.bre
>
> and
>
> perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen
> --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x
> --used_tcn_file tcns.txt > 8500.bre
>
> and
>
> perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen
> --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x --tcnfield
> 852 --tcnsubfield x 8500.mrc --used_tcn_file tcns.txt > 8500.bre
>
>
>
> However, the SQL generated from the BRE and ingest files produced by all of
> the above commands still cannot be executed; it gives the same error as
> before. The file 'tcns.txt' contains just the two lines you specified.
>
> Looking forward to hearing from you.
>
> Thank you once again.
>
> With kind regards,
> Dibyendra
>
> On Fri, Aug 14, 2009 at 9:04 AM, Dan Scott <denials at gmail.com> wrote:
>
>> A quick peek suggests that the duplicate TCN values are "NEW" and "i",
>> after which they resort to just s + record ID (for example, "s8602").
>>
>> This seems strange to me; it looks like the --tcnfield and
>> --tcnsubfield options are being completely ignored, as "NEW" is found
>> in the 001 field of each of your records, and "i" comes from the 020
>> field (which doesn't have an "a" subfield, so no number gets
>> assigned). This is the behaviour that results when a TCN value isn't
>> found; however, all of the records you previously sent most definitely
>> have a value in 852$x (and in fact the ID field is being correctly set
>> to that numeric value). When I ran the
>> marc2bre/direct_ingest/pg_loader process, I saw TCNs with values like
>> "Accession number: 8600" as you would expect from your records. Very
>> strange, it's like the BRE files come from a previous run of
>> marc2bre.pl that didn't have the --tcnfield/--tcnsubfield options.
>>
>> Ah well. One way around this would be to create a "used TCN file";
>> just a text file containing the following lines:
>>
>> NEW
>> i
>>
>> and then point at it in marc2bre.pl using the --used_tcn_file option;
>> for example, "--used_tcn_file tcns.txt". This will force it to use the
>> s + record ID approach to deriving a TCN.
>>
>> Dan
>>
>> 2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
>> > Hi Dan,
>> >
>> > Thank you for your prompt response.
>> >
>> > Please find the attachment herewith. It includes the outputs for
>> > three MARC files containing 250-300 bibliographic records.
>> >
>> > Thank you.
>> >
>> > With kind regards,
>> > Dibyendra
>> >
>> > On Thu, Aug 13, 2009 at 11:11 PM, Dan Scott <denials at gmail.com> wrote:
>> >>
>> >> Hi Dibyendra:
>> >>
>> >> Can you please attach some zipped output (BRE, ingest, and SQL) files
>> >> as well as the errors to help us work out where the duplication is
>> >> occurring?
>> >>
>> >> Dan
>> >>
>> >> 2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
>> >> > Hello Dan,
>> >> >
>> >> > Thanks for your prompt response. I tried those options with
>> >> > marc2bre.pl, and now 'marc2bre.pl' produces no error. However, the
>> >> > duplication problem still persists while executing the SQL. I rebuilt
>> >> > the database several times and tried 'marc2bre.pl' with all the
>> >> > options you suggested.
>> >> >
>> >> > I ran the following commands during the import process, and they
>> >> > produced SQL for each MARC file:
>> >> >
>> >> > #perl marc2bre.pl --db_user postgres --db_host localhost --db_pw
>> >> > evergreen
>> >> > --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x
>> >> > --tcnfield
>> >> > 852 --tcnsubfield x 8500.mrc > 8500.bre
>> >> > #perl direct_ingest.pl ~/8500.bre > ~/8500.ingest
>> >> > #perl pg_loader.pl -or bre -or mrd -or mfr -or mtfe -or mafe -or msfe
>> >> > -or
>> >> > mkfe -or msefe -a mrd -a mfr -a mtfe -a mafe -a msfe -a mkfe -a msefe
>> >> > --output=8500.sql < ~/8500.ingest
>> >> > #cp 8500.sql ~
>> >> > #psql -U evergreen evergreen
>> >> > evergreen=# \i ~/8500.sql
>> >> >
>> >> > After rebuilding the database, whichever generated SQL file I run first
>> >> > executes successfully. But after that first successful execution, none
>> >> > of the other generated SQL files can be executed. I've attached the
>> >> > SQL errors herewith.
>> >> >
>> >> > I look forward to hearing from you.
>> >> >
>> >> > Thank you.
>> >> >
>> >> > --
>> >> > Dibyendra Hyoju
>> >> > Madan Puraskar Pustakalaya
>> >> > Patan Dhoka, Lalitpur
>> >> > Nepal
>> >> >
>> >> > On Thu, Aug 13, 2009 at 12:14 PM, Dan Scott <denials at gmail.com> wrote:
>> >> >>
>> >> >> Sorry Dibyendra, those options should have been --tcnfield and
>> >> >> --tcnsubfield
>> >> >>
>> >> >> Try those, they should work in 1.4.
>> >> >>
>> >> >> Dan
>> >> >>
>> >> >> 2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
>> >> >> > Hello Dan,
>> >> >> >
>> >> >> > Thank you very much for your response.
>> >> >> >
>> >> >> > I used the options 'tcn_field' and 'tcn_subfield', but they are not
>> >> >> > recognized by marc2bre.pl. I executed marc2bre.pl in the following
>> >> >> > way, and the output is as follows:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > dibyendra-laptop:/home/opensrf/Evergreen-ILS-1.4.0.4/Open-ILS/src/extras/import#
>> >> >> > perl marc2bre.pl --db_user postgres --db_host localhost --db_pw
>> >> >> > evergreen --db_name evergreen --encoding UTF8 --idfield 852
>> >> >> > --idsubfield x --tcn_field 852 --tcn_subfield x 8500.mrc > 8500.bre
>> >> >> > Unknown option: tcn_field
>> >> >> > Unknown option: tcn_subfield
>> >> >> >
>> >> >> > I guess the options 'tcn_field' and 'tcn_subfield' are not
>> >> >> > implemented in EG 1.4 yet. Is that so? I couldn't find EG 1.6 on the
>> >> >> > download page. We're planning to migrate our 10,000 validated MARC
>> >> >> > records into Evergreen.
>> >> >> >
>> >> >> > Thank you.
>> >> >> >
>> >> >> > --
>> >> >> > Dibyendra Hyoju
>> >> >> > Madan Puraskar Pustakalaya
>> >> >> > Patan Dhoka, Lalitpur
>> >> >> > Nepal
>> >> >> >
>> >> >> > On Tue, Aug 11, 2009 at 10:15 AM, Dan Scott <denials at gmail.com> wrote:
>> >> >> >>
>> >> >> >> 2009/8/8 Dibyendra Hyoju <dibyendra at gmail.com>:
>> >> >> >> > Hello all,
>> >> >> >> > I imported the sample records again on another machine running
>> >> >> >> > Evergreen 1.4, and after importing those records, I couldn't
>> >> >> >> > import records from any other MARC files. I have attached the
>> >> >> >> > files '8500.mrc' and '8400.mrc' herewith. I first executed the
>> >> >> >> > SQL generated from '8500.mrc' successfully. Then, I couldn't
>> >> >> >> > execute the SQL generated from '8400.mrc'. The error is the same
>> >> >> >> > as before: 'ERROR:  duplicate key violates unique constraint
>> >> >> >> > "biblio_record_unique_tcn"'. I only used the option "--encoding
>> >> >> >> > UTF8" this time. I tried a few other files, but I got the same
>> >> >> >> > error. A few of the tested files are attached herewith if
>> >> >> >> > somebody wants to volunteer to test. If anyone has faced this
>> >> >> >> > problem before and has found a solution, please help. Any help
>> >> >> >> > will be highly appreciated.
>> >> >> >>
>> >> >> >> Sorry for the delayed reply, I'm on leave at the moment and not
>> >> >> >> connected very often.
>> >> >> >>
>> >> >> >> marc2bre.pl doesn't really deal with automatically generated TCNs all
>> >> >> >> that well, so you're best off explicitly identifying a source for the
>> >> >> >> TCN. To avoid getting duplicate TCN values and record ID values if all
>> >> >> >> of your records follow the same pattern with the 852 $x field /
>> >> >> >> subfield accession number identifier, you should also use the
>> >> >> >> --tcn_field / --tcn_subfield options for marc2bre.pl:
>> >> >> >>
>> >> >> >> perl marc2bre.pl --db_user postgres --db_host localhost --db_pw
>> >> >> >> evergreen --db_name evergreen --encoding UTF8 --idfield 852
>> >> >> >> --idsubfield x --tcn_field 852 --tcn_subfield x 8400.mrc > 8400.bre
>> >> >> >>
>> >> >> >> I tried importing both your 8000.mrc and 8400.mrc files, and these
>> >> >> >> options worked fine for me with no duplicate value warnings.
>> >> >> >> This is with Evergreen rel_1_6_0 on Ubuntu 8.04, but marc2bre.pl (the
>> >> >> >> most critical script for parsing out TCN and record number) shows no
>> >> >> >> significant differences between 1.4 and rel_1_6_0.
>> >> >> >>
>> >> >> >> --
>> >> >> >> Dan Scott
>> >> >> >> Laurentian University
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Dan Scott
>> >> >> Laurentian University
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Dan Scott
>> >> Laurentian University
>> >
>>
>>
>>
>> --
>> Dan Scott
>> Laurentian University
>>
>


-- 
Dibyendra