[OPEN-ILS-DEV] Introduction and Question

Hardy, Elaine ehardy at georgialibraries.org
Mon Sep 17 15:43:34 EDT 2007


Patrick,

Finally had a chance to look at the NP-completeness keyword search.
There are several reasons why you got the result set you did. As Jason
said, "np" (for "no paging", I guess) is in the 300 of both the
Toulouse-Lautrec and the Jazz book, and the word "complete" is in the
245 (title) of both. Neither of these is a good MARC record, and I
will merge or overlay them tomorrow or the next day to get better ones.
As Jason says, it is not really a valid record -- for example, the
correct form for the 300 is 1 v. (unpaged). But n.p. is currently a valid
abbreviation in the 260 field when there is no publisher on the piece. I
don't know if the 260 is in the keyword index. I don't really think the
300 field should be.
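
To illustrate the point about the 300 field, here is a minimal sketch of
leaving a field out of the keyword text. It is not how Evergreen builds
its index; the record layout, the field values, and the names `record`
and `keyword_text` are all assumptions made up for the example.

```python
# Hypothetical record: MARC tag -> field text. The values are invented
# for illustration and are not copied from the actual catalog records.
record = {
    "245": "The complete jazz book",
    "260": "n.p.",
    "300": "np",
}


def keyword_text(record, excluded_tags=frozenset({"300"})):
    """Join the text of every field whose tag is not excluded."""
    return " ".join(
        text for tag, text in sorted(record.items()) if tag not in excluded_tags
    )


# With the 300 excluded, the stray "np" never reaches the keyword text.
indexed_words = keyword_text(record).split()
```

Whether the 260 belongs in the list of excluded tags is the same open
question as above; the sketch only shows that the choice is a per-field
one.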

The remaining two in the result set have NP in the 050 (Library of
Congress Classification number) and "complete" in the 520 (summary, etc.
of the work). I may need to merge these 2 records. I'll do a little more
research on it tomorrow.

Elaine
 
PS Having the cataloger sanity check would be a good thing. But we would
need to do it on a lot of catalogers, me included, first. 
 
J. Elaine Hardy
Library Services Manager - Collections & Reference
Georgia Public Library Service,
A Unit of the University System of Georgia
1800 Century Place, Suite 150
Atlanta, Ga. 30345-4304
404.235-7128
404.235-7201, fax
 
ehardy at georgialibraries.org
www.georgialibraries.org

-----Original Message-----
From: open-ils-dev-bounces at list.georgialibraries.org
[mailto:open-ils-dev-bounces at list.georgialibraries.org] On Behalf Of
Etheridge, Jason - Gmail
Sent: Friday, September 14, 2007 5:34 PM
To: open-ils-dev at list.georgialibraries.org
Subject: Re: [OPEN-ILS-DEV] Introduction and Question

On 9/14/07, Patrick Durusau <patrick at durusau.net> wrote:
> BTW, I am still curious about the "relevance" algorithm that returned
> jazz music for the search term (without quotes) np-completeness. Or
> does the system not react well to hyphens in names unless surrounded by
> quotes? Not real sure why it would parse a hyphen but I have seen
> odder things. (Noting that when I surrounded it with quotes
> "np-completeness" I got zero hits, not jazz.)

Hi Patrick,

I believe when you quote a search term, it searches for that "exact"
string, with no stemming or other interpretation.  Without the quotes,
I believe EG will strip out punctuation, so you'd basically be doing a
search for np and completeness, or some stemmed variants.  So your
first hit there found a "np" in a 300 field, and "complete" in the
245.  Hrmm, is that a valid record?  For the cases where we do
encounter messed up records, I imagine we could codify some cataloger
sanity checking and not index certain things that look like garbage,
but I don't think it'll ever be perfect.
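
A rough sketch of the behavior described above, to show why the two
forms of the query diverge. This is not Evergreen's code: the
suffix-stripping in `crude_stem` is a toy stand-in for a real stemmer,
and the sample record text is invented.

```python
import re


def crude_stem(word):
    """Toy suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ness", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, split on anything that is not a letter or digit, stem."""
    return [crude_stem(t) for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def matches(query, record_text):
    """Quoted query: exact substring. Unquoted: every stemmed term present."""
    query = query.strip()
    if len(query) >= 2 and query[0] == '"' and query[-1] == '"':
        return query[1:-1].lower() in record_text.lower()
    record_terms = set(tokenize(record_text))
    return all(term in record_terms for term in tokenize(query))


# Invented record text: a title containing "complete" plus a stray "np".
sample = "The complete Toulouse-Lautrec np"
unquoted_hit = matches("np-completeness", sample)    # terms: np, complete
quoted_hit = matches('"np-completeness"', sample)    # exact string only
```

Under these assumptions the unquoted query matches the sample and the
quoted one does not, which is the pattern Patrick saw.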

Here's a wiki document explaining some relevance ranking stuff, though
I don't know if it's still accurate:
http://open-ils.org/dokuwiki/doku.php?id=scratchpad:opac_demo

The "metarecords" it talks about are the FRBR-like groupings you can
get if you choose Group Formats and Editions in the Advanced
Search.

> PS: One more question: Are there plans to add synonym support to
> further confuse users with search results? ;-) I would think it would
> be an advanced search option.

I know they're planning multiple thesaurus support, but I think that
might manifest in the "Did you mean/Are you looking for/spellcheck"
feature (another kettle of fish that needs work), and/or in the
authority-based sidebars.

I can't imagine "loosening" search results just to inflate the number
of hits.  I'd rather get zero hits and then a lot of suggestions.

-- Jason
http://esilibrary.com/

