[OPEN-ILS-DEV] Questions About Query Parser

Mike Rylander mrylander at gmail.com
Thu Nov 14 13:52:24 EST 2013


UPDATE

Dan,

I've built a new branch, and a bug pointing at it, available here:
https://bugs.launchpad.net/evergreen/+bug/1251353

The branch fixes other things as well, so I gave it a separate bug at:
https://bugs.launchpad.net/evergreen/+bug/1251359

If you have a chance to test, that would be grand.

--miker


On Wed, Nov 13, 2013 at 8:50 PM, Mike Rylander <mrylander at gmail.com> wrote:
> Dan,
>
> I've found and fixed your Case 2 problem.  It all came down to
> needless (and harmful) localization of a variable.  IIRC, it was long
> like that, but seems to have become bad after my bool-op pushdown
> change back at the beginning of this hear.  I'm still lacking tuits
> ATM, but I'll bug this tomorrow.  The branch containing (most of) the
> fix is here*, however, it needs just a smidge more work to reset that
> variable at the right time to be safe outside a test environment.
> FWIW, I can't break it now.
>
> Case 1 may need to be a syntax documentation and tpac fixup thing.  If
> you wrap all top-level-only stuff (most filters, modifiers, etc) in {{
> }} then all's well.  We can teach the tpac to do that easily for query
> components that come from from UI elements other than a text input ...
> I think I've suggested this before on a bug or two, and certainly in
> person to some folks.  Anyway, tuits just ran out.  More tomorrow!
>
>  * http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/miker/qp-curlys
>
> --miker
>
> On Wed, Nov 13, 2013 at 3:06 PM, Mike Rylander <mrylander at gmail.com> wrote:
>> I only have a moment, but I can comment on both cases, at least a little.
>>
>>  * Case 1 is very likely a variant of the problem fixed
>> (superficially) in ab37336715cb7d84ed0c30b1cd2e9e6b85933774 and is not
>> fixed in QP-proper, yet.  The current workaround is "put filters and
>> modifiers at the end".
>>  * Case 2 is something new to me.  Could you pull the full query
>> generated by the working and non-working versions out of your logs,
>> per chance? If you're logging long-running queries longer than a
>> second (not a bad idea, IMO), they're likely in there.
>>
>> --miker
>>
>> On Wed, Nov 13, 2013 at 2:24 PM, Dan Wells <dbw2 at calvin.edu> wrote:
>>> Hello all,
>>>
>>> Having made the leap from 2.3 to 2.5, we have run into a few issues with the
>>> updated query parser, and rather than dig into something I know from the
>>> surface to be very complex, I am hoping someone might shed some light on
>>> what is going on.  The issues I am finding *seem* like bugs, but maybe I am
>>> missing something fundamental in my queries.
>>>
>>> You see, we have an external system (SFX) which does automated queries of
>>> the catalog to find book content.  These queries are largely based on ISBN,
>>> and when we upgraded to 2.5, they started to fail in strange ways.  After
>>> much poking and prodding, I have boiled things down to a couple simple cases
>>> with surprising results.  In both cases, the order of operands changes the
>>> result set despite using what I think is a commutative operator (||).  You
>>> can test these queries using our current catalog, http://ulysses.calvin.edu/
>>> , but I can also take some time to find similar issues using Concerto
>>> records if it comes to that.
>>>
>>> Case 1a:
>>> item_form(s) && identifier|isbn:0830837035 || identifier|isbn:1844743829 (no
>>> results)
>>> vs.
>>> identifier|isbn:1844743829 || identifier|isbn:0830837035 && item_form(s) (1
>>> result)
>>>
>>> In this case, I would have expected both to return 1 result.  I also see the
>>> same behavior even if the given ISBN is identical (a contrived example):
>>>
>>> Case 1b:
>>> item_form(s) && identifier|isbn:0830837035 || identifier|isbn:0830837035 (no
>>> results)
>>> vs.
>>> identifier|isbn:0830837035 || identifier|isbn:0830837035 && item_form(s) (1
>>> result)
>>>
>>>
>>>
>>> The next case is similar, but with slightly more nuance.  I have two of the
>>> same title, one print, one electronic.  If I OR the ISBNs together, it
>>> works:
>>>
>>> Case 2a:
>>> identifier|isbn:074944990X || identifier|isbn:0749452897 -- WORKS
>>>
>>> identifier|isbn:0749452897 || identifier|isbn:074944990X – WORKS
>>>
>>> However, if I add a third ISBN to the mix, I now get different results
>>> depending on the order of operands:
>>>
>>> Case2b:
>>> identifier|isbn:074944990X || identifier|isbn:0749452897 ||
>>> identifier|isbn:7313054289 -- DOESN'T WORK (print result)
>>>
>>> identifier|isbn:074944990X || identifier|isbn:7313054289 ||
>>> identifier|isbn:0749452897 -- DOESN'T WORK (print result)
>>>
>>> identifier|isbn:7313054289 || identifier|isbn:074944990X ||
>>> identifier|isbn:0749452897 -- DOESN'T WORK (no results)
>>>
>>> identifier|isbn:7313054289 || identifier|isbn:0749452897 ||
>>> identifier|isbn:074944990X -- DOESN'T WORK (no results)
>>>
>>> identifier|isbn:0749452897 || identifier|isbn:074944990X ||
>>> identifier|isbn:7313054289 -- DOESN'T WORK (e result)
>>>
>>> identifier|isbn:0749452897 || identifier|isbn:7313054289 ||
>>> identifier|isbn:074944990X -- DOESN'T WORK (e result)
>>>
>>> I do seem to get the same behavior when using a more compact query notation
>>> (which I believe should be identical in effect):
>>>
>>> Case 2c:
>>> identifier|isbn:(074944990X || 0749452897) -- WORKS
>>>
>>> identifier|isbn:(074944990X || 0749452897 || 7313054289) -- DOESN'T WORK
>>> (print result)
>>>
>>> identifier|isbn:(0749452897 || 074944990X || 7313054289) -- DOESN'T WORK (e
>>> result)
>>>
>>>
>>>
>>> Based on when development in query parsing was most active, I imagine this
>>> behavior has existed since 2.4.  Can anyone verify that?  Also, is there an
>>> explanation for this behavior which I may be missing?  If not, can anyone
>>> more familiar with this code at least narrow down what is causing these
>>> issues?  I’m willing to dive in if necessary, but given the complexity of
>>> this code, I may not soon have enough free time to effectively troubleshoot
>>> this.
>>>
>>> Finally, I am happy to move the conversation over to LP if that is a better
>>> venue, but I was struggling with pinpointing exactly what this bug affects
>>> (and therefore how to file it properly), so I thought I would first seek
>>> input from the list.
>>>
>>> Thanks,
>>> Dan
>>>
>>>
>>> Daniel Wells
>>> Library Programmer/Analyst
>>> Hekman Library, Calvin College
>>> 616.526.7133
>>>
>>>
>>>
>>
>>
>>
>> --
>> Mike Rylander
>>  | Director of Research and Development
>>  | Equinox Software, Inc. / Your Library's Guide to Open Source
>>  | phone:  1-877-OPEN-ILS (673-6457)
>>  | email:  miker at esilibrary.com
>>  | web:  http://www.esilibrary.com
>
>
>
> --
> Mike Rylander
>  | Director of Research and Development
>  | Equinox Software, Inc. / Your Library's Guide to Open Source
>  | phone:  1-877-OPEN-ILS (673-6457)
>  | email:  miker at esilibrary.com
>  | web:  http://www.esilibrary.com



-- 
Mike Rylander
 | Director of Research and Development
 | Equinox Software, Inc. / Your Library's Guide to Open Source
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker at esilibrary.com
 | web:  http://www.esilibrary.com


More information about the Open-ils-dev mailing list