[Evergreen-dev] marc_export - apache crashes
Jason Stephenson
jason at sigio.com
Mon Oct 9 09:45:00 EDT 2023
Brian,
If marc_export causes a crash, then your database server is most likely
underpowered. It is also possible that something is buggy or
misconfigured somewhere.
marc_export does not use Apache or the Evergreen back end in any serious
way. It looks up some settings via OpenSRF::Utils::SettingsClient and
that's it. The rest of the time it runs select statements in the database.
If you want to batch export records, short of writing your own tool,
marc_export is it. I also doubt you would gain much performance from a
custom tool because the database is the main bottleneck. It's possible
that some of the queries used by marc_export could be improved,
particularly for more recent PostgreSQL versions.
I implement many custom exports, and they are typically wrappers around
marc_export. I will implement queries to find the set of records that I
want and then pipe the record IDs into marc_export, or the script might
determine what options to use when running marc_export based on a
configuration file.
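A minimal sketch of such a wrapper, assuming the record IDs have already
been written to a file, one per line (for example, by a query). The
function name export_in_batches, the EXPORT_CMD override, and the batch
size are hypothetical. The marc_export flags are copied from the commands
quoted later in this thread, and the sketch assumes marc_export reads
record IDs from standard input as it does in those commands:

```shell
#!/bin/bash
# Hypothetical wrapper: run one marc_export process per batch of record IDs.
export_in_batches() {
    local id_file=$1 out_dir=$2 batch_size=${3:-1000}
    # EXPORT_CMD is an illustrative override so the loop can be tried with a
    # stand-in command; the default flags mirror the commands in this thread.
    local cmd="${EXPORT_CMD:-marc_export -c /openils/conf/opensrf_core.xml -f XML}"
    local tmp_dir batch out

    mkdir -p "$out_dir" || return 1
    tmp_dir=$(mktemp -d) || return 1

    # Split the ID list into files of at most batch_size lines each.
    split -l "$batch_size" -a 4 "$id_file" "$tmp_dir/batch_" || return 1

    for batch in "$tmp_dir"/batch_*; do
        [ -e "$batch" ] || continue   # empty ID file: nothing to export
        out="$out_dir/$(basename "$batch").xml"
        # $cmd is intentionally unquoted so it splits into command and arguments.
        if ! $cmd < "$batch" > "$out"; then
            echo "export failed for $batch" >&2
            rm -rf "$tmp_dir"
            return 1
        fi
    done
    rm -rf "$tmp_dir"
}
```

It could be called as
"export_in_batches /home/opensrf/marc-test.txt /tmp/marc-output 1000",
which writes one XML file per batch. Running the batches one after
another keeps each marc_export process short-lived, but it does not
reduce the total load on the database.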
If you have any specific ideas to improve marc_export or the export
process from the staff client, feel free to file bugs on Launchpad:
https://bugs.launchpad.net/evergreen.
Sorry that I can't be more helpful at this time.
Jason Stephenson
On 10/9/23 08:56, Brian Holda via Evergreen-dev wrote:
> Or maybe a better way to ask this. Have people found a good way to
> export a large number of marc records within Evergreen? We found the
> staff client way to do it. And it processes files of 5-10,000 records at
> a time. But if we want to do 1 million records, let's say, it's a bit
> tedious. So then I found the marc_export script
> <https://docs.evergreen-ils.org/3.2/_marc_export_exporting_bibliographic_records_into_marc_files.html>.
> But that crashed our server doing it with 3,000 records at a time. We
> have ideas on how to modify the process, and it's not terrible using the
> staff client way, but I figure this must be a somewhat common task that
> others have good solutions for? Anyone willing to share 🙂?
>
> Thanks,
> Brian
>
> Brian Holda
> Library Technology Manager
> Hekman Library
> Calvin University
> (616) 526-8673
>
> <https://library.calvin.edu/>
>
> ------------------------------------------------------------------------
> *From:* Evergreen-dev <evergreen-dev-bounces at list.evergreen-ils.org> on
> behalf of Brian Holda via Evergreen-dev
> <evergreen-dev at list.evergreen-ils.org>
> *Sent:* Thursday, October 5, 2023 4:25 PM
> *To:* Evergreen Development Discussion List
> <evergreen-dev at list.evergreen-ils.org>
> *Subject:* [Evergreen-dev] marc_export - apache crashes
> Hi all,
>
> Not sure if it's user error or something else going on, so wanted to see
> if any of you all have experience using marc_export script
> <https://docs.evergreen-ils.org/3.2/_marc_export_exporting_bibliographic_records_into_marc_files.html>
> and had similar problems.
>
> In brief:
>
> * Tue, 5pm - I ran the following test (this is for a file of 3,100
>   records). This took about 30 sec. and successfully created the
>   export file without any noticeable effects on our apache2 server:
>
>     cat /home/opensrf/marc-test.txt | marc_export --reporter -i \
>       -c /openils/conf/opensrf_core.xml -x /openils/conf/fm_IDL.xml \
>       -f XML --timeout 5 > exported_files.xml
>
> * Wed, 11:40am - I ran what I thought was essentially the same test
>   (for the same file of 3,100 records). This also took about 30 sec.
>   and successfully created the export file. However, 8 min. later
>   apache crashed and had to be restarted. In the error log, it said
>   "couldn't grab the accept mutex" immediately before crashing. Here's
>   the code I ran:
>
>     cat /tmp/marc-output/marc1.txt | marc_export --reporter -i \
>       -c /openils/conf/opensrf_core.xml -x /openils/conf/fm_IDL.xml \
>       -f XML --timeout 5 > /tmp/marc-output/exported-marc1.xml
>
> * Wed, 4pm - I ran essentially the same command (for the same file of
>   3,100 records), but without using the tmp folder. This time it
>   stalled and after waiting a few minutes we pressed Ctrl+C,
>   which I assumed stopped everything cleanly, as it returned me to the
>   command prompt. However, at 4:50pm apache quit again, with the same
>   "couldn't grab the accept mutex" messages beforehand. Here's the
>   code I ran this time:
>
>     cat /home/opensrf/marc2.txt | marc_export --reporter -i \
>       -c /openils/conf/opensrf_core.xml -x /openils/conf/fm_IDL.xml \
>       -f XML --timeout 5 > /home/opensrf/exported-marc2.xml
>
> Anyone know what might be happening here?
>
> Brian Holda
> Library Technology Manager
> Hekman Library
> Calvin University
> (616) 526-8673
>
> <https://library.calvin.edu/>