[OPEN-ILS-GENERAL] SQL Query Performance

Galen Charlton gmc at esilibrary.com
Fri Apr 26 16:21:11 EDT 2013


Hi,

On Fri, Apr 26, 2013 at 12:36 PM, Joshua D. Drake <jd at commandprompt.com> wrote:
> I think this is a false test. You don't have an index on
> asset.copy(barcode). See:

Yes, I know.  The databases I ran the explain on were using stock(ish)
Evergreen schemas, where the only index on asset.copy (barcode) was
the partial one.

In particular, my test were inspired by Dan's response to this from
one of your earlier emails:

> "copy_barcode_key" UNIQUE, btree (barcode) WHERE deleted = false OR deleted IS FALSE
>
> Although a valid index, the two conditions within the index are duplicative. It would be better to drop that index and just use:

The "duplicative" conditions are indeed a pain.  I would be happier if
I could just say:

CREATE UNIQUE INDEX copy_barcode_key ON asset.copy(barcode) WHERE NOT deleted;

and trust that the query parser would pick up on all of the
(reasonable) ways of asserting a falsity condition on a boolean column
and treat them the same for query planning purposes.

It occurred to me that (of course) it's been a while since 8.4 was
released or in active use by most Evergreen sites, so I decided to do
aforementioned tests to see if the query parser had gotten smarter. So
far, the answer is no -- and that was the point of my 12:14 PDT email.
 But the fine details of partial index clause involving Boolean
columns is almost a digression from the original thread.

Sure, adding a plain index on asset.copy(barcode) works around this
(probably minor, all things considered) limitation of the query
planner.

And if an Evergreen user wants to create such an index because they
need it for the particular ad hoc queries they expect to run, there's
little reason not to.  Of course, each additional index comes with a
cost in terms of disk space and row insertion and update time (and I
say this simply to address a potential question from the peanut
gallery about whether to index every last column in the Evergreen
schema ;-) ), but if one needs to regularly run queries on deleted
items by barcode, it's unlikely that the small tradeoff wouldn't be
worth it.

However, if one's item reporting deals with non-deleted items, as is
by far the most common case, one can also simply tack on a 'deleted is
false' clause to queries on asset.copy.

In fact, regardless of the particular indexes one does or does not
have, liberal use of 'deleted is false' conditions for queries on
asset.copy, asset.call_number, and biblio.record_entry (and a few
other tables) is desirable for many ad hoc reports so that the report
doesn't inadvertently include data on discarded items.  Unless, of
course, you're doing (say) a year-end report on circulation where you
do care about items that circulated that year but later got weeded.

Regardless, Evergreen users should not drop the partial index, as the
application does need to unique-if-not-deleted constraint on item
barcodes.

Regards,

Galen
--
Galen Charlton
Manager of Implementation
Equinox Software, Inc. / The Open Source Experts
email:  gmc at esilibrary.com
direct: +1 770-709-5581
cell:   +1 404-984-4366
skype:  gmcharlt
web:    http://www.esilibrary.com/
Supporting Koha and Evergreen: http://koha-community.org &
http://evergreen-ils.org


More information about the Open-ils-general mailing list