[OPEN-ILS-DEV] request help optimizing very slow select query for ebooks
Blake Henderson
blake at mobiusconsortium.org
Wed Jul 10 15:37:52 EDT 2019
Melissa,
I've found this magic sauce to do the trick and QUICK
select date_part('month',bre.create_date),count(*)
from
biblio.record_entry bre
where
not bre.deleted and
create_date between '2019-01-01' and '2019-07-01' and
lower(marc) ~ $$<datafield tag="856" ind1="4" ind2="0">$$ AND
(
marc ~ $$tag="008">.......................[oqs]$$
or
marc ~ $$tag="006">......[oqs]$$
)
and
(
marc ~ $$<leader>......[at]$$
)
and
(
marc ~ $$<leader>.......[acdm]$$
)
group by 1
Tweak as needed
-Blake-
Conducting Magic
MOBIUS
On 7/10/2019 1:32 PM, Rogan Hamby wrote:
> Hi Melissa,
>
> Quick question, are you looking for all ebooks or a certain subset
> indicate by the 'full text via%' matching? It's not clear to me if
> that's how you're identifying they are ebooks at all versus a subset
> of them. If you're looking for all of them you can use the fact that
> the fixed field data will map to entries in the
> config.coded_values_map table to identify them. So, for example if
> the ebook search format is id 614 in that table then vlist entries
> in metabib.record_attr_vector_list with a 614 as one of the values
> will be an ebook. That will certainly search faster than multiple
> joins against metabib.real_full_rec.
>
>
>
>
>
> On Wed, Jul 10, 2019 at 2:23 PM Melissa Belvadi <mbelvadi at upei.ca
> <mailto:mbelvadi at upei.ca>> wrote:
>
> We're on 3.0.3, with under 2 million total biblio recs, under 1
> million of those undeleted.
> The following query is taking over 10 hours to run at the command
> line or through pgadmin or dbeaver. For a system our size, that
> seems far too slow.
> We've checked other aspects of our server, e.g. other processes
> running on it and index integrity, and can't account for the long
> execution time.
>
> I'd appreciate any advice on how to either optimize this search or
> restructure it as we'll need to run this regularly. As you can
> guess, there's another query that comes after this that uses the
> temp table, but that responds really fast once the temp table is
> finally made.
> The point of this query is to find all of the ebooks added to the
> system over a given time period, and pull together a bunch of data
> about those books.
> If there's some entirely different way to go about this, eg using
> SRU or some other tool, I'm totally open to that too.
>
> create temp table temp_newebooks (create_date, call_number,
> author, publisher, pubdate, elocation, title, id, isbn,
> contentnote, crecord) as
>
> WITH myconstants as (select to_date('20190101','YYYYMMDD') as
> startdate, to_date('20190701','YYYYMMDD') as enddate,
> cast('20160101' as varchar) as pubstart)
>
> select distinct bre.create_date, mfr.value, r.author, r.publisher,
> r.pubdate, mfr2.value, tfe.value, r.id <http://r.id>, r.isbn,
> mfr3.value, mfr3.record
>
> from biblio.record_entry bre
>
> join metabib.real_full_rec mfr on (mfr.record=bre.id <http://bre.id>)
>
> left join metabib.real_full_rec mfr3 on (mfr3.record=bre.id
> <http://bre.id>) and mfr3.tag = '520' and mfr3.subfield = 'a'
>
> join metabib.real_full_rec mfr2 on (mfr2.record=bre.id
> <http://bre.id>)
>
> join reporter.materialized_simple_record r on (r.id <http://r.id>
> = bre.id <http://bre.id>)
>
> join metabib.title_field_entry tfe on (tfe.source=bre.id
> <http://bre.id>)
>
> where
>
> not bre.deleted
>
> and substring(r.pubdate from 1 for 4) >= (select pubstart from
> myconstants)
>
> and tfe.field = 5
>
> and (mfr.tag like '050' or mfr.tag like '090') and mfr.subfield
> like 'a'
>
> and mfr2.tag like '856' and mfr2.subfield like 'y'
>
> and mfr2.value ilike 'full text via%'
>
> and bre.create_date > (select startdate from myconstants) and
> bre.create_date < (select enddate from myconstants);
>
>
>
> Thank you!
>
> Melissa Belvadi
> Collections Librarian
> University of Prince Edward Island
> mbelvadi at upei.ca <mailto:mbelvadi at upei.ca> 902-566-0581
> my public calendar
> <http://www.google.com/calendar/embed?src=mbelvadi%40upei.ca&ctz=America/Halifax&mode=week>
> Make an appointment <https://mbelvadi.youcanbook.me/> via YouCanBookMe
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://libmail.georgialibraries.org/pipermail/open-ils-dev/attachments/20190710/f38b0c60/attachment.html>
More information about the Open-ils-dev
mailing list