[OPEN-ILS-GENERAL] Import issues

Tue Sep 15 07:54:05 EDT 2009

Hi Dan,

Thanks a lot for your detailed explanation.

Last time, as far as I remember, I generated and executed the SQL output
immediately after processing each set of MARC records, and generated the
SQL for the next set only after executing the previous one.

Today, I cleaned out the database by rebuilding it according to your
instructions. I removed all the old files generated by marc2bre.pl,
direct_ingest.pl, and pg_loader.pl. I used the option '--startid 8401' and
imported 8400.mrc successfully into Evergreen 1.4.0.4. I then repeated the
process for 8500.mrc with '--startid 8501', but couldn't import the SQL
output.

The complete steps that I performed to import the two sets of MARC records
are as follows:

#cd /home/opensrf/Evergreen-ILS-1.4.0.4/Open-ILS/src/sql/Pg
#./build-db.sh localhost 5432 evergreen postgres evergreen 82
#cd /home/opensrf/Evergreen-ILS-1.4.0.4/Open-ILS/src/extras/import/
#perl marc2bre.pl --startid 8401 --encoding UTF8 --db_user postgres --db_host localhost --db_pw evergreen --db_name evergreen 8400.mrc > ~/8400.bre
#perl direct_ingest.pl ~/8400.bre > ~/8400.ingest
#perl pg_loader.pl -or bre -or mrd -or mfr -or mtfe -or mafe -or msfe -or mkfe -or msefe -a mrd -a mfr -a mtfe -a mafe -a msfe -a mkfe -a msefe --output=8400.sql < ~/8400.ingest
#cp 8400.sql ~
#psql -U evergreen evergreen
#\i ~/8400.sql
#\i /home/opensrf/Evergreen-ILS-1.4.0.4/Open-ILS/src/extras/import/quick_metarecord_map.sql
#perl marc2bre.pl --startid 8501 --encoding UTF8 --db_user postgres --db_host localhost --db_pw evergreen --db_name evergreen 8500.mrc > ~/8500.bre
#perl direct_ingest.pl ~/8500.bre > ~/8500.ingest
#perl pg_loader.pl -or bre -or mrd -or mfr -or mtfe -or mafe -or msfe -or mkfe -or msefe -a mrd -a mfr -a mtfe -a mafe -a msfe -a mkfe -a msefe --output=8500.sql < ~/8500.ingest
#cp 8500.sql ~
#psql -U evergreen evergreen
#\i ~/8500.sql
#\i /home/opensrf/Evergreen-ILS-1.4.0.4/Open-ILS/src/extras/import/quick_metarecord_map.sql

I also followed the above steps without the '--startid' option. I first
processed 8400.mrc and imported the records successfully. Afterwards, I
processed 8500.mrc, but again I couldn't import the SQL output.
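
For reference, the per-file sequence above can be written as a small shell
function. This is only a sketch: it prints the commands rather than running
them, the function name print_import_steps is made up for illustration, and
it assumes it is run from the Open-ILS/src/extras/import/ directory with the
same database settings as above. It uses psql's -f option in place of \i.

```shell
#!/bin/sh
# Sketch only: prints the import commands for ONE file so they can be
# reviewed (or piped to sh) before the next file is processed.
# print_import_steps is a hypothetical helper, not part of Evergreen.
# Usage: print_import_steps BASENAME [STARTID]

print_import_steps() {
    base=$1
    # Add --startid only when a second argument is given.
    startid_opt="${2:+--startid $2 }"
    db_opts="--db_user postgres --db_host localhost --db_pw evergreen --db_name evergreen"
    loader_opts="-or bre -or mrd -or mfr -or mtfe -or mafe -or msfe -or mkfe -or msefe -a mrd -a mfr -a mtfe -a mafe -a msfe -a mkfe -a msefe"

    echo "perl marc2bre.pl ${startid_opt}--encoding UTF8 $db_opts $base.mrc > ~/$base.bre"
    echo "perl direct_ingest.pl ~/$base.bre > ~/$base.ingest"
    echo "perl pg_loader.pl $loader_opts --output=$base.sql < ~/$base.ingest"
    # Load the generated SQL before any other file is run through marc2bre.pl.
    echo "psql -U evergreen evergreen -f $base.sql"
    echo "psql -U evergreen evergreen -f quick_metarecord_map.sql"
}
```

Calling print_import_steps 8400 8401, running the printed commands, and only
then calling print_import_steps 8500 8501 keeps the order used above: each
file is loaded into the database before the next one is processed.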

Could you please tell me the steps by which you imported the two sets of
MARC records? Were they the same as the steps above, or different? Please
help.

I look forward to hearing from you.

Thanks in advance.

With kind regards,
Dibyendra

On Fri, Sep 4, 2009 at 9:18 PM, Dan Scott <denials at gmail.com> wrote:

> 2009/8/26 Dibyendra Hyoju <dibyendra at gmail.com>:
> > Hi Dan,
> > I couldn't solve the problem with the duplication error 'ERROR:
> > duplicate key violates unique constraint "biblio_record_unique_tcn"',
> > even after using the option "--used_tcn_file". The file used by the
> > option contains two lines, as you had suggested. I searched for a
> > couple of days for a solution to this issue, but couldn't find anything
> > relevant. So, would you please send me the file required for the
> > option "--used_tcn_file", or suggest some alternative way to import
> > our library records into Evergreen? We are currently testing the
> > features of Evergreen 1.4.
> > I look forward to hearing from you.
>
> Hi Dibyendra:
>
> I just imported both 8400.mrc and 8500.mrc successfully into an
> Evergreen 1.4.0.4 system.
>
> I suspect the problem that you're running into at the moment is that
> you're processing 8400.mrc and 8500.mrc first, then loading their SQL
> files into the database.
>
> This approach will result in duplicate TCN values and failure because
> marc2bre.pl is generating TCN values based on the record ID, which is
> just a numeric sequence in the database. When you run marc2bre.pl, it
> grabs the current maximum record ID from the database and uses that as
> the starting point for the record IDs & TCN values that it generates.
> So if you process both files first, then load them into the database,
> they will both have been based on the same starting point for record
> ID.
>
> On the other hand, if you process 8400.mrc first and load 8400.sql
> into the database, then when you process 8500.mrc, marc2bre.pl will
> start generating record IDs that are beyond the range that was
> generated for 8400.mrc. So, your safest approach would be to process
> each file and load it into the database before processing the next.
>
> Alternately, if you want to process a bunch of files in batch, and you
> know you have 100 records per file, you could use the "--startid"
> parameter to marc2bre.pl to force it to start at a new, unused range
> of record IDs for each batch. For example:
>
> perl marc2bre.pl --startid 8401 ... 8400.mrc > 8400.bre
> ...
> perl marc2bre.pl --startid 8501 ... 8500.mrc > 8500.bre
> ...
>
> Sorry for the delays in responding, I've been on leave and not
> spending much time online.
>
> (As an aside, marc2bre.pl could probably stand to be taught to be
> smarter about parsing & generating TCN values, but this should get you
> by for now).
>
> --
> Dan Scott
> Laurentian University
>
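
The batch approach described above (a fresh, unused --startid for each file)
depends on knowing how many records each file holds. A minimal sketch of one
way to work that out, assuming the files are binary MARC (ISO 2709), where
every record ends with the record terminator byte 0x1D. The function names
count_marc_records and print_startids are made up for illustration, and the
first start ID must still be chosen by hand to be above the highest record
ID already in the database.

```shell
#!/bin/sh
# Sketch only: derive a non-overlapping --startid for each MARC file.
# Both helpers are hypothetical and are not part of Evergreen.

# count_marc_records FILE
# Counts records by counting ISO 2709 record terminators (octal 035 = 0x1D).
count_marc_records() {
    tr -cd '\035' < "$1" | wc -c | tr -d ' '
}

# print_startids FIRST_ID FILE...
# Prints one "STARTID FILE" line per file; each start ID is the previous
# start ID plus the number of records in the previous file.
print_startids() {
    next=$1
    shift
    for f in "$@"; do
        echo "$next $f"
        n=$(count_marc_records "$f")
        next=$((next + n))
    done
}
```

For example, print_startids 8401 8400.mrc 8500.mrc prints 8401 for 8400.mrc,
and 8401 plus the record count of 8400.mrc for 8500.mrc. Each printed value
can then be passed to marc2bre.pl as --startid for the matching file.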