[OPEN-ILS-DEV] request help optimizing very slow select query for ebooks

Melissa Belvadi mbelvadi at upei.ca
Thu Jul 11 08:13:42 EDT 2019


Thank you very much Blake and Rogan!  Two excellent tips for me to explore
to improve how I find ebooks!

Melissa Belvadi
Collections Librarian
University of Prince Edward Island
mbelvadi at upei.ca  902-566-0581
my public calendar
<http://www.google.com/calendar/embed?src=mbelvadi%40upei.ca&ctz=America/Halifax&mode=week>
Make an appointment <https://mbelvadi.youcanbook.me/> via YouCanBookMe





On Wed, Jul 10, 2019 at 4:40 PM Blake Henderson <blake at mobiusconsortium.org>
wrote:

> Melissa,
>
> I've found this magic sauce to do the trick and QUICK
>
> select date_part('month',bre.create_date),count(*)
> from
> biblio.record_entry bre
> where
> not bre.deleted and
> create_date between '2019-01-01' and '2019-07-01' and
> lower(marc) ~ $$<datafield tag="856" ind1="4" ind2="0">$$ AND
>
> (
>         marc ~ $$tag="008">.......................[oqs]$$
>         or
>         marc ~ $$tag="006">......[oqs]$$
> )
> and
> (
>         marc ~ $$<leader>......[at]$$
> )
> and
> (
>         marc ~ $$<leader>.......[acdm]$$
> )
>
> group by 1
>
> Tweak as needed
>
> -Blake-
> Conducting Magic
> MOBIUS
>
>
> On 7/10/2019 1:32 PM, Rogan Hamby wrote:
>
> Hi Melissa,
>
> Quick question, are you looking for all ebooks or a certain subset
> indicate by the 'full text via%' matching?  It's not clear to me if that's
> how you're identifying they are ebooks at all versus a subset of them.  If
> you're looking for all of them you can use the fact that the fixed field
> data will map to entries in the config.coded_values_map table to identify
> them.  So, for example if the ebook search format is id 614 in that table
> then vlist entries in metabib.record_attr_vector_list with a 614 as one of
> the values will be an ebook.  That will certainly search faster than
> multiple joins against metabib.real_full_rec.
>
>
>
>
>
> On Wed, Jul 10, 2019 at 2:23 PM Melissa Belvadi <mbelvadi at upei.ca> wrote:
>
>> We're on 3.0.3, with under 2 million total biblio recs, under 1 million
>> of those undeleted.
>> The following query is taking over 10 hours to run at the command line or
>> through pgadmin or dbeaver. For a system our size, that seems far too slow.
>> We've checked other aspects of our server, e.g. other processes running
>> on it and index integrity, and can't account for the long execution time.
>>
>> I'd appreciate any advice on how to either optimize this search or
>> restructure it as we'll need to run this regularly.  As you can guess,
>> there's another query that comes after this that uses the temp table, but
>> that responds really fast once the temp table is finally made.
>> The point of this query is to find all of the ebooks added to the system
>> over a given time period, and pull together a bunch of data about those
>> books.
>> If there's some entirely different way to go about this, eg using SRU or
>> some other tool, I'm totally open to that too.
>>
>> create temp table temp_newebooks (create_date, call_number, author,
>> publisher, pubdate, elocation, title, id, isbn, contentnote, crecord) as
>>
>> WITH myconstants as (select to_date('20190101','YYYYMMDD') as startdate,
>> to_date('20190701','YYYYMMDD') as enddate, cast('20160101' as varchar) as
>> pubstart)
>>
>> select distinct bre.create_date, mfr.value, r.author, r.publisher,
>> r.pubdate, mfr2.value, tfe.value, r.id, r.isbn, mfr3.value, mfr3.record
>>
>> from biblio.record_entry bre
>>
>> join metabib.real_full_rec mfr on (mfr.record=bre.id)
>>
>> left join metabib.real_full_rec mfr3 on (mfr3.record=bre.id) and
>> mfr3.tag = '520' and mfr3.subfield = 'a'
>>
>> join metabib.real_full_rec mfr2 on (mfr2.record=bre.id)
>>
>> join reporter.materialized_simple_record r on (r.id = bre.id)
>>
>> join metabib.title_field_entry tfe on (tfe.source=bre.id)
>>
>> where
>>
>> not bre.deleted
>>
>> and substring(r.pubdate from 1 for 4) >= (select pubstart from
>> myconstants)
>>
>> and tfe.field = 5
>>
>> and (mfr.tag like '050' or mfr.tag like '090') and mfr.subfield like 'a'
>>
>> and mfr2.tag like '856' and mfr2.subfield like 'y'
>>
>> and mfr2.value  ilike 'full text via%'
>>
>> and bre.create_date > (select startdate from myconstants) and
>> bre.create_date < (select enddate from myconstants);
>>
>>
>> Thank you!
>>
>> Melissa Belvadi
>> Collections Librarian
>> University of Prince Edward Island
>> mbelvadi at upei.ca  902-566-0581
>> my public calendar
>> <http://www.google.com/calendar/embed?src=mbelvadi%40upei.ca&ctz=America/Halifax&mode=week>
>> Make an appointment <https://mbelvadi.youcanbook.me/> via YouCanBookMe
>>
>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://libmail.georgialibraries.org/pipermail/open-ils-dev/attachments/20190711/7d52ad3b/attachment.html>


More information about the Open-ils-dev mailing list