[OPEN-ILS-GENERAL] Import issues

Wed Aug 26 01:13:28 EDT 2009

Hi Dan,
I still couldn't solve the duplication error 'ERROR:  duplicate key
violates unique constraint "biblio_record_unique_tcn"', even after using
the option "--used_tcn_file". The file passed to that option contains the
two lines you suggested. I searched for a couple of days for a solution to
this issue, but couldn't find anything relevant. So, would you please
send me the file required for the option "--used_tcn_file", or suggest an
alternative way to import our library records into Evergreen? We are
currently testing the features of Evergreen 1.4.
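
For reference, here is a minimal sketch of how such a file can be created and
inspected from the shell. It assumes the file must hold exactly the two bare
values, one per line, with no quotes, trailing spaces, or Windows line
endings; that requirement is an assumption and is not confirmed for
marc2bre.pl.

```shell
# Write the two values, one per line, without quotes or trailing spaces.
printf '%s\n' NEW i > tcns.txt

# Dump the file byte by byte so hidden characters (e.g. \r) become visible.
od -c tcns.txt

# Count the lines; this should report 2.
wc -l < tcns.txt
```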

I look forward to hearing from you.

Thank you.

With kind regards,
Dibyendra Hyoju
Madan Puraskar Pustakalaya

On Thu, Aug 20, 2009 at 11:23 AM, Dibyendra Hyoju <dibyendra at gmail.com> wrote:

> Hi all,
>
> I am trying to load library records that don't have ISBNs. Dan recommended
> using 'marc2bre.pl' with the option "--used_tcn_file tcns.txt" to generate
> TCNs for the MARC records. The file 'tcns.txt' contains just two lines:
> 'NEW' and 'i'. While executing the SQL output generated from the BRE and
> ingest steps, I still get the error 'ERROR:  duplicate key violates unique
> constraint "biblio_record_unique_tcn"'. If anyone has solved this issue
> before, please share your knowledge. Any help will be appreciated.
>
> I plan to import our 12000 validated MARC records into Evergreen as soon
> as possible.
>
> Thank you.
>
> With kind regards,
> Dibyendra
>
> On Tue, Aug 18, 2009 at 7:49 AM, Dibyendra Hyoju <dibyendra at gmail.com> wrote:
>
>> Hello Dan,
>> Thank you very much once again for your kind response and for the detailed
>> explanation, which I really appreciate. I am sorry that I couldn't get
>> back to this issue promptly; I was on leave for a few days due to some
>> urgent tasks.
>>
>> After reading your email, I understood why our records were creating
>> duplicate TCNs while being converted into BRE format. More than 50% of our
>> library records don't have ISBNs. Currently, there are around 23000
>> records, of which we have validated only 12000 MARC records. I plan to
>> migrate all the validated MARC records into Evergreen within this week if
>> the import process goes without problems.
>>
>> Following the alternative you described for converting the MARC records
>> into BRE with the "--used_tcn_file" option, I used 'marc2bre.pl' in the
>> following ways:
>>
>> perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen
>> --db_name evergreen --encoding UTF8 --used_tcn_file tcns.txt > 8500.bre
>>
>> and
>>
>> perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen
>> --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x
>> --used_tcn_file tcns.txt > 8500.bre
>>
>> and
>>
>> perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen
>> --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x --tcnfield
>> 852 --tcnsubfield x 8500.mrc --used_tcn_file tcns.txt > 8500.bre
>>
>>
>>
>> However, the SQL generated from the BRE and ingest files produced by each of
>> the above commands still cannot be executed; it gives the same error as
>> before. The file 'tcns.txt' contains just the two lines you specified.
>>
>> Looking forward to hearing from you.
>>
>> Thank you once again.
>>
>> With kind regards,
>> Dibyendra
>>
>> On Fri, Aug 14, 2009 at 9:04 AM, Dan Scott <denials at gmail.com> wrote:
>>
>>> A quick peek suggests that the duplicate TCN values are "NEW" and "i",
>>> after which they resort to just s + record ID (for example, "s8602").
>>>
>>> This seems strange to me, it looks like the --tcnfield and
>>> --tcnsubfield options are being completely ignored, as "NEW" is found
>>> in the 001 field of each of your records, and "i" comes from the 020
>>> field (which doesn't have an "a" subfield, so no number gets
>>> assigned). This is the behaviour that results when a TCN value isn't
>>> found; however, all of the records you previously sent most definitely
>>> have a value in 852$x (and in fact the ID field is being correctly set
>>> to that numeric value). When I ran the
>>> marc2bre/direct_ingest/pg_loader process, I saw TCNs with values like
>>> "Accession number: 8600" as you would expect from your records. Very
>>> strange, it's like the BRE files come from a previous run of
>>> marc2bre.pl that didn't have the --tcnfield/--tcnsubfield options.
>>>
>>> Ah well. One way around this would be to create a "used TCN file";
>>> just a text file containing the following lines:
>>>
>>> NEW
>>> i
>>>
>>> and then point at it in marc2bre.pl using the --used_tcn_file option;
>>> for example, "--used_tcn_file tcns.txt". This will force it to use the
>>> s + record ID approach to deriving a TCN.
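
The fallback described above can be illustrated with a small shell sketch.
This is not the code marc2bre.pl runs; it only mimics the described behaviour
(keep the candidate TCN unless it is empty or already used, otherwise use "s"
plus the record ID). The file names candidates.tsv and assigned.tsv, the
tab-separated layout, and the sample values are hypothetical.

```shell
# Hypothetical input: "record_id<TAB>candidate_tcn", one record per line.
printf '8600\tNEW\n8601\tNEW\n8602\ti\n8603\tocm12345\n' > candidates.tsv

# The used TCN file: one already-taken value per line.
printf '%s\n' NEW i > tcns.txt

# First file seeds the set of used values; second file gets a TCN per record.
awk -F '\t' '
    NR == FNR { used[$0] = 1; next }    # load values from tcns.txt
    {
        tcn = $2
        # Fall back to s + record ID when the candidate is empty or taken.
        if (tcn == "" || (tcn in used)) tcn = "s" $1
        used[tcn] = 1                   # remember it so later records differ
        print $1 "\t" tcn
    }
' tcns.txt candidates.tsv > assigned.tsv

cat assigned.tsv
```

With both candidate values listed in tcns.txt, records 8600 to 8602 receive
s8600, s8601, and s8602, while the unused candidate on record 8603 is kept.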
>>>
>>> Dan
>>>
>>> 2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
>>> > Hi Dan,
>>> >
>>> > Thank you for your prompt response.
>>> >
>>> > Please find the attachment herewith. The attachment includes the output
>>> > for three MARC files, each containing 250-300 bibliographic records.
>>> >
>>> > Thank you.
>>> >
>>> > With kind regards,
>>> > Dibyendra
>>> >
>>> > On Thu, Aug 13, 2009 at 11:11 PM, Dan Scott <denials at gmail.com> wrote:
>>> >>
>>> >> Hi Dibyendra:
>>> >>
>>> >> Can you please attach some zipped output (BRE, ingest, and SQL) files
>>> >> as well as the errors to help us work out where the duplication is
>>> >> occurring?
>>> >>
>>> >> Dan
>>> >>
>>> >> 2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
>>> >> > Hello Dan,
>>> >> >
>>> >> > Thanks for your prompt response. I tried those options with
>>> >> > marc2bre.pl, and it now produces no errors. But the duplication
>>> >> > problem still persists while executing the SQL. I rebuilt the
>>> >> > database several times and tried 'marc2bre.pl' with all the options
>>> >> > you suggested.
>>> >> >
>>> >> > I performed the following commands during the import process, and
>>> >> > they produced SQL for each MARC file:
>>> >> >
>>> >> > # perl marc2bre.pl --db_user postgres --db_host localhost --db_pw evergreen
>>> >> >   --db_name evergreen --encoding UTF8 --idfield 852 --idsubfield x
>>> >> >   --tcnfield 852 --tcnsubfield x 8500.mrc > 8500.bre
>>> >> > # perl direct_ingest.pl ~/8500.bre > ~/8500.ingest
>>> >> > # perl pg_loader.pl -or bre -or mrd -or mfr -or mtfe -or mafe -or msfe
>>> >> >   -or mkfe -or msefe -a mrd -a mfr -a mtfe -a mafe -a msfe -a mkfe
>>> >> >   -a msefe --output=8500.sql < ~/8500.ingest
>>> >> > # cp 8500.sql ~
>>> >> > # psql -U evergreen evergreen
>>> >> > evergreen=# \i ~/8500.sql
>>> >> >
>>> >> > After rebuilding the database, the first generated SQL file I run
>>> >> > executes successfully. But after that first successful execution, no
>>> >> > other generated SQL file can be executed. I've attached the SQL
>>> >> > errors herewith.
>>> >> >
>>> >> > I look forward to hearing from you.
>>> >> >
>>> >> > Thank you.
>>> >> >
>>> >> > --
>>> >> > Dibyendra Hyoju
>>> >> > Madan Puraskar Pustakalaya
>>> >> > Patan Dhoka, Lalitpur
>>> >> > Nepal
>>> >> >
>>> >> > On Thu, Aug 13, 2009 at 12:14 PM, Dan Scott <denials at gmail.com> wrote:
>>> >> >>
>>> >> >> Sorry Dibyendra, those options should have been --tcnfield and
>>> >> >> --tcnsubfield
>>> >> >>
>>> >> >> Try those, they should work in 1.4.
>>> >> >>
>>> >> >> Dan
>>> >> >>
>>> >> >> 2009/8/13 Dibyendra Hyoju <dibyendra at gmail.com>:
>>> >> >> > Hello Dan,
>>> >> >> >
>>> >> >> > Thank you very much for your response.
>>> >> >> >
>>> >> >> > I used the options 'tcn_field' and 'tcn_subfield', but they are
>>> >> >> > not recognized by marc2bre.pl. I executed marc2bre.pl in the
>>> >> >> > following way, and the output is as follows:
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > dibyendra-laptop:/home/opensrf/Evergreen-ILS-1.4.0.4/Open-ILS/src/extras/import#
>>> >> >> > perl marc2bre.pl --db_user postgres --db_host localhost --db_pw
>>> >> >> > evergreen --db_name evergreen --encoding UTF8 --idfield 852
>>> >> >> > --idsubfield x --tcn_field 852 --tcn_subfield x 8500.mrc > 8500.bre
>>> >> >> > Unknown option: tcn_field
>>> >> >> > Unknown option: tcn_subfield
>>> >> >> >
>>> >> >> > I guess the options 'tcn_field' and 'tcn_subfield' are not
>>> >> >> > implemented in EG 1.4 yet. Is that so? I couldn't find EG 1.6 on
>>> >> >> > the download page. We're planning to migrate our 10,000 validated
>>> >> >> > MARC records into Evergreen.
>>> >> >> >
>>> >> >> > Thank you.
>>> >> >> >
>>> >> >> > --
>>> >> >> > Dibyendra Hyoju
>>> >> >> > Madan Puraskar Pustakalaya
>>> >> >> > Patan Dhoka, Lalitpur
>>> >> >> > Nepal
>>> >> >> >
>>> >> >> > On Tue, Aug 11, 2009 at 10:15 AM, Dan Scott <denials at gmail.com>
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >> 2009/8/8 Dibyendra Hyoju <dibyendra at gmail.com>:
>>> >> >> >> > Hello all,
>>> >> >> >> > I imported the sample records again on another machine running
>>> >> >> >> > Evergreen 1.4, and after importing that file, I couldn't import
>>> >> >> >> > records from any other MARC file. I have attached the files
>>> >> >> >> > '8500.mrc' and '8400.mrc' herewith. I first executed the SQL
>>> >> >> >> > generated from '8500.mrc' successfully. Then, I couldn't
>>> >> >> >> > execute the SQL generated from '8400.mrc'. The error is the
>>> >> >> >> > same as before: 'ERROR:  duplicate key violates unique
>>> >> >> >> > constraint "biblio_record_unique_tcn"'. I only used the option
>>> >> >> >> > "--encoding UTF8" this time. I tried a few other files, but I
>>> >> >> >> > got the same error. A few of the tested files are attached
>>> >> >> >> > herewith if somebody wants to volunteer to test. If anyone has
>>> >> >> >> > faced this problem before and has found the solution, please
>>> >> >> >> > help. Any help will be highly appreciated.
>>> >> >> >>
>>> >> >> >> Sorry for the delayed reply, I'm on leave at the moment and not
>>> >> >> >> connected very often.
>>> >> >> >>
>>> >> >> >> marc2bre.pl doesn't really deal with automatically generated TCNs
>>> >> >> >> all that well, so you're best off explicitly identifying a source
>>> >> >> >> for the TCN. To avoid getting duplicate TCN values and record ID
>>> >> >> >> values if all of your records follow the same pattern with the
>>> >> >> >> 852 $x field / subfield accession number identifier, you should
>>> >> >> >> also use the --tcn_field / --tcn_subfield options for marc2bre.pl:
>>> >> >> >>
>>> >> >> >> perl marc2bre.pl --db_user postgres --db_host localhost --db_pw
>>> >> >> >> evergreen --db_name evergreen --encoding UTF8 --idfield 852
>>> >> >> >> --idsubfield x --tcn_field 852 --tcn_subfield x 8400.mrc > 8400.bre
>>> >> >> >>
>>> >> >> >> I tried importing both your 8000.mrc and 8400.mrc files, and
>>> >> >> >> these options worked fine for me with no duplicate value warnings.
>>> >> >> >> This is with Evergreen rel_1_6_0 on Ubuntu 8.04, but marc2bre.pl
>>> >> >> >> (the most critical script for parsing out TCN and record number)
>>> >> >> >> shows no significant differences between 1.4 and rel_1_6_0.
>>> >> >> >>
>>> >> >> >> --
>>> >> >> >> Dan Scott
>>> >> >> >> Laurentian University
>>> >> >> >
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Dan Scott
>>> >> >> Laurentian University
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Dan Scott
>>> >> Laurentian University
>>> >
>>>
>>>
>>>
>>> --
>>> Dan Scott
>>> Laurentian University
>>>
>>
>
>
> --
> Dibyendra