[OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen

Hardy, Elaine ehardy at georgialibraries.org
Wed Mar 7 08:35:53 EST 2012


Kathy,

While the relevance display is much improved in 2.x, it would be good to
have greater relevance given, in a keyword search, to title (specifically
the 245) and then subject fields. I also see where having a popularity
ranking might be beneficial.

I just had to explain to a board member of one of our libraries why his
search for John Sandford turned up children's titles first. So having MARC
100 fields ranked higher than 700 fields in author searches would be
beneficial as well.

I can't comment on any of the coding possibilities other than to say that
whichever way doesn't negatively impact search return time is preferable.

Elaine
 

J. Elaine Hardy
PINES Bibliographic Projects and Metadata Manager
Georgia Public Library Service,
A Unit of the University System of Georgia
1800 Century Place, Suite 150
Atlanta, Ga. 30345-4304
404.235-7128
404.235-7201, fax

ehardy at georgialibraries.org
www.georgialibraries.org
http://www.georgialibraries.org/pines/


-----Original Message-----
From: open-ils-general-bounces at list.georgialibraries.org
[mailto:open-ils-general-bounces at list.georgialibraries.org] On Behalf Of
Kathy Lussier
Sent: Tuesday, March 06, 2012 4:43 PM
To: 'Evergreen Discussion Group'
Subject: [OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen

Hi all,

I mentioned this during an e-mail discussion on the list last month, but I
just wanted to hear from others in the Evergreen community about whether
there is a desire to improve the relevance ranking for search results in
Evergreen. Currently, we can tweak relevancy in the opensrf.xml, and it
can look at things like the document length, word proximity, and unique
word count. We've found that we had to remove the modifiers for document
length and unique word count to prevent a problem where brief bib records
were ranked way too high in our search results.

In our local discussions, we've thought the following enhancements could
improve the ranking of search results:

* Giving greater weight to a record if the search terms appear in the
title or subject (ideally, we would like these fields to be configurable).
This is something that is tweakable in search.relevance_ranking, but my
understanding is that the use of these tweaks results in a major reduction
in search performance.

* Using some type of popularity metric to boost relevancy for popular
titles. I'm not sure what this metric should be (number of copies attached
to record? Total circs in last x months? Total current circs?), but we
believe some type of popularity measure would be particularly helpful in a
public library where searches will often be for titles that are popular.
For example, a search for "twilight" will most likely be for the Stephenie
Meyer novel and not this
http://books.google.com/books/about/Twilight.html?id=zEhkpXCyGzIC. Mike
Rylander had indicated in a previous e-mail
(http://markmail.org/message/h6u5r3sy4nr36wsl) that we might be able to
handle this through an overnight cron job without a negative impact on
search speeds.
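
To make the two ideas above concrete, here is a rough sketch of how field
weighting and a precomputed popularity boost could combine into one score.
This is not Evergreen code: the field names, the weights, the "boost" key,
and the log-dampened circulation count are all assumptions made for
illustration. The point is only that the boost can be computed offline
(for instance by a nightly job) and stored, so that query-time ranking
costs one extra multiplication per record.

```python
import math

# Hypothetical weights: a term match in the title counts more than in the subject.
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "keyword": 1.0}


def popularity_boost(recent_circs):
    """Dampened boost, meant to be computed offline and stored with the record.

    The log keeps a heavily circulated title from swamping everything else.
    """
    return 1.0 + math.log1p(recent_circs)


def score(record, terms):
    """Add the field weight for each term found in a field, then apply the stored boost."""
    base = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        text = record.get(field, "").lower()
        base += weight * sum(1 for term in terms if term.lower() in text)
    return base * record.get("boost", 1.0)


def rank(records, terms):
    """Return records ordered from most to least relevant."""
    return sorted(records, key=lambda r: score(r, terms), reverse=True)


# Two records that match "twilight" equally well on text alone.
popular = {"id": 1, "title": "Twilight", "keyword": "twilight",
           "boost": popularity_boost(500)}
obscure = {"id": 2, "title": "Twilight", "keyword": "twilight",
           "boost": popularity_boost(0)}
results = rank([obscure, popular], ["twilight"])
```

With identical text matches, the record with more recent circulations
sorts first. Which metric actually feeds the boost (copies, recent circs,
current circs) is left open, as it is in the list above.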

Do others think these two enhancements would improve the search results in
Evergreen? Do you think there are other things we could do to improve
relevancy? My main concern would be that any changes might slow down
search speeds, and I would want to make sure that we could do something to
retrieve better search results without a slowdown.

Also, I was wondering if this type of project might be a good candidate
for a Google Summer of Code project.

I look forward to hearing your feedback!

Kathy

-------------------------------------------------------------
Kathy Lussier
Project Coordinator
Massachusetts Library Network Cooperative
(508) 756-0172
(508) 755-3721 (fax)
klussier at masslnc.org
IM: kmlussier (AOL & Yahoo)
Twitter: http://www.twitter.com/kmlussier





