[OPEN-ILS-DEV] Questions About Query Parser
Dan Wells
dbw2 at calvin.edu
Wed Nov 13 14:24:18 EST 2013
Hello all,
Having made the leap from 2.3 to 2.5, we have run into a few issues with the updated query parser, and rather than dig into something I know from the surface to be very complex, I am hoping someone might shed some light on what is going on. The issues I am finding *seem* like bugs, but maybe I am missing something fundamental in my queries.
You see, we have an external system (SFX) which does automated queries of the catalog to find book content. These queries are largely based on ISBN, and when we upgraded to 2.5, they started to fail in strange ways. After much poking and prodding, I have boiled things down to a couple of simple cases with surprising results. In both cases, the order of operands changes the result set despite using what I think is a commutative operator (||). You can test these queries using our current catalog, http://ulysses.calvin.edu/ , but I can also take some time to find similar issues using Concerto records if it comes to that.
Case 1a:
item_form(s) && identifier|isbn:0830837035 || identifier|isbn:1844743829 (no results)
vs.
identifier|isbn:1844743829 || identifier|isbn:0830837035 && item_form(s) (1 result)
In this case, I would have expected both to return 1 result. I also see the same behavior even if the given ISBN is identical (a contrived example):
Case 1b:
item_form(s) && identifier|isbn:0830837035 || identifier|isbn:0830837035 (no results)
vs.
identifier|isbn:0830837035 || identifier|isbn:0830837035 && item_form(s) (1 result)
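To make the expectation in Case 1 concrete, here is a minimal sketch that models each clause as a set of matching record IDs and compares the two operand orders. It assumes conventional precedence (&& binding tighter than ||), which may not be how the query parser actually groups these terms, and it treats item_form(s) as an ordinary operand rather than a filter. The record IDs and function names are made up for illustration.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of the universe as a set."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

def form_first(form, a, b):
    # item_form(s) && A || B, read as (form AND A) OR B (assumed precedence)
    return (form & a) | b

def form_last(form, a, b):
    # B || A && item_form(s), read as B OR (A AND form) (assumed precedence)
    return b | (a & form)

universe = {1, 2, 3}  # hypothetical record IDs, not real catalog data

# Collect every combination of match sets where the two orders disagree.
mismatches = [
    (form, a, b)
    for form in subsets(universe)
    for a in subsets(universe)
    for b in subsets(universe)
    if form_first(form, a, b) != form_last(form, a, b)
]
print(len(mismatches))  # prints 0
```

Under those assumptions the two orders never disagree, so a difference in results points at how the query is grouped or how the filter is applied, not at the underlying boolean logic.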
The next case is similar, but with slightly more nuance. I have two of the same title, one print, one electronic. If I OR the ISBNs together, it works:
Case 2a:
identifier|isbn:074944990X || identifier|isbn:0749452897 -- WORKS
identifier|isbn:0749452897 || identifier|isbn:074944990X -- WORKS
However, if I add a third ISBN to the mix, I now get different results depending on the order of operands:
Case 2b:
identifier|isbn:074944990X || identifier|isbn:0749452897 || identifier|isbn:7313054289 -- DOESN'T WORK (print result)
identifier|isbn:074944990X || identifier|isbn:7313054289 || identifier|isbn:0749452897 -- DOESN'T WORK (print result)
identifier|isbn:7313054289 || identifier|isbn:074944990X || identifier|isbn:0749452897 -- DOESN'T WORK (no results)
identifier|isbn:7313054289 || identifier|isbn:0749452897 || identifier|isbn:074944990X -- DOESN'T WORK (no results)
identifier|isbn:0749452897 || identifier|isbn:074944990X || identifier|isbn:7313054289 -- DOESN'T WORK (e result)
identifier|isbn:0749452897 || identifier|isbn:7313054289 || identifier|isbn:074944990X -- DOESN'T WORK (e result)
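For anyone who wants to repeat this test against another catalog or against Concerto records, the six orderings can be generated mechanically. The following is only a sketch: the helper name or_queries is made up, and it merely builds the query strings in the syntax shown above without submitting them anywhere.

```python
from itertools import permutations

def or_queries(isbns, index="identifier|isbn"):
    """Return one OR-joined query string per ordering of the given ISBNs."""
    return [
        " || ".join(f"{index}:{isbn}" for isbn in ordering)
        for ordering in permutations(isbns)
    ]

# The three ISBNs from Case 2b; three operands give six orderings.
queries = or_queries(["074944990X", "0749452897", "7313054289"])
for query in queries:
    print(query)
```

Each printed line can be pasted into the catalog search box, and the result counts compared by hand.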
I do seem to get the same behavior when using a more compact query notation (which I believe should be identical in effect):
Case 2c:
identifier|isbn:(074944990X || 0749452897) -- WORKS
identifier|isbn:(074944990X || 0749452897 || 7313054289) -- DOESN'T WORK (print result)
identifier|isbn:(0749452897 || 074944990X || 7313054289) -- DOESN'T WORK (e result)
Based on when development in query parsing was most active, I imagine this behavior has existed since 2.4. Can anyone verify that? Also, is there an explanation for this behavior which I may be missing? If not, can anyone more familiar with this code at least narrow down what is causing these issues? I'm willing to dive in if necessary, but given the complexity of this code, I may not soon have enough free time to effectively troubleshoot this.
Finally, I am happy to move the conversation over to LP if that is a better venue, but I was struggling with pinpointing exactly what this bug affects (and therefore how to file it properly), so I thought I would first seek input from the list.
Thanks,
Dan
Daniel Wells
Library Programmer/Analyst
Hekman Library, Calvin College
616.526.7133