[OPEN-ILS-DEV] request help optimizing very slow select query for ebooks

Blake Henderson blake at mobiusconsortium.org
Wed Jul 10 15:37:52 EDT 2019


Melissa,

I've found this magic sauce to do the trick and QUICK

select date_part('month',bre.create_date),count(*)
from
biblio.record_entry bre
where
not bre.deleted and
create_date between '2019-01-01' and '2019-07-01' and
lower(marc) ~ $$<datafield tag="856" ind1="4" ind2="0">$$ AND

(
         marc ~ $$tag="008">.......................[oqs]$$
         or
         marc ~ $$tag="006">......[oqs]$$
)
and
(
         marc ~ $$<leader>......[at]$$
)
and
(
         marc ~ $$<leader>.......[acdm]$$
)

group by 1

Tweak as needed

-Blake-
Conducting Magic
MOBIUS

On 7/10/2019 1:32 PM, Rogan Hamby wrote:
> Hi Melissa,
>
> Quick question, are you looking for all ebooks or a certain subset 
> indicate by the 'full text via%' matching?  It's not clear to me if 
> that's how you're identifying they are ebooks at all versus a subset 
> of them.  If you're looking for all of them you can use the fact that 
> the fixed field data will map to entries in the 
> config.coded_values_map table to identify them.  So, for example if 
> the ebook search format is id 614 in that table then vlist entries 
> in metabib.record_attr_vector_list with a 614 as one of the values 
> will be an ebook.  That will certainly search faster than multiple 
> joins against metabib.real_full_rec.
>
>
>
>
>
> On Wed, Jul 10, 2019 at 2:23 PM Melissa Belvadi <mbelvadi at upei.ca 
> <mailto:mbelvadi at upei.ca>> wrote:
>
>     We're on 3.0.3, with under 2 million total biblio recs, under 1
>     million of those undeleted.
>     The following query is taking over 10 hours to run at the command
>     line or through pgadmin or dbeaver. For a system our size, that
>     seems far too slow.
>     We've checked other aspects of our server, e.g. other processes
>     running on it and index integrity, and can't account for the long
>     execution time.
>
>     I'd appreciate any advice on how to either optimize this search or
>     restructure it as we'll need to run this regularly.  As you can
>     guess, there's another query that comes after this that uses the
>     temp table, but that responds really fast once the temp table is
>     finally made.
>     The point of this query is to find all of the ebooks added to the
>     system over a given time period, and pull together a bunch of data
>     about those books.
>     If there's some entirely different way to go about this, eg using
>     SRU or some other tool, I'm totally open to that too.
>
>     create temp table temp_newebooks (create_date, call_number,
>     author, publisher, pubdate, elocation, title, id, isbn,
>     contentnote, crecord) as
>
>     WITH myconstants as (select to_date('20190101','YYYYMMDD') as
>     startdate, to_date('20190701','YYYYMMDD') as enddate,
>     cast('20160101' as varchar) as pubstart)
>
>     select distinct bre.create_date, mfr.value, r.author, r.publisher,
>     r.pubdate, mfr2.value, tfe.value, r.id <http://r.id>, r.isbn,
>     mfr3.value, mfr3.record
>
>     from biblio.record_entry bre
>
>     join metabib.real_full_rec mfr on (mfr.record=bre.id <http://bre.id>)
>
>     left join metabib.real_full_rec mfr3 on (mfr3.record=bre.id
>     <http://bre.id>) and mfr3.tag = '520' and mfr3.subfield = 'a'
>
>     join metabib.real_full_rec mfr2 on (mfr2.record=bre.id
>     <http://bre.id>)
>
>     join reporter.materialized_simple_record r on (r.id <http://r.id>
>     = bre.id <http://bre.id>)
>
>     join metabib.title_field_entry tfe on (tfe.source=bre.id
>     <http://bre.id>)
>
>     where
>
>     not bre.deleted
>
>     and substring(r.pubdate from 1 for 4) >= (select pubstart from
>     myconstants)
>
>     and tfe.field = 5
>
>     and (mfr.tag like '050' or mfr.tag like '090') and mfr.subfield
>     like 'a'
>
>     and mfr2.tag like '856' and mfr2.subfield like 'y'
>
>     and mfr2.value  ilike 'full text via%'
>
>     and bre.create_date > (select startdate from myconstants) and
>     bre.create_date < (select enddate from myconstants);
>
>
>
>     Thank you!
>
>     Melissa Belvadi
>     Collections Librarian
>     University of Prince Edward Island
>     mbelvadi at upei.ca <mailto:mbelvadi at upei.ca> 902-566-0581
>     my public calendar
>     <http://www.google.com/calendar/embed?src=mbelvadi%40upei.ca&ctz=America/Halifax&mode=week>
>     Make an appointment <https://mbelvadi.youcanbook.me/> via YouCanBookMe
>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://libmail.georgialibraries.org/pipermail/open-ils-dev/attachments/20190710/f38b0c60/attachment.html>


More information about the Open-ils-dev mailing list