[OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen

Kathy Lussier klussier at masslnc.org
Tue Mar 6 16:42:55 EST 2012


Hi all,

I mentioned this during an e-mail discussion on the list last month, but I
just wanted to hear from others in the Evergreen community about whether
there is a desire to improve the relevance ranking for search results in
Evergreen. Currently, we can tweak relevancy in the opensrf.xml, and it can
look at things like the document length, word proximity, and unique word
count. We've found that we had to remove the modifiers for document length
and unique word count to prevent a problem where brief bib records were
ranked way too high in our search results.
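
To illustrate the brief-record problem, here is a minimal sketch. It is not
Evergreen's actual ranking code; the function names, scores, and record
lengths are all made up for illustration. It only assumes that a
document-length modifier divides a raw match score by the record's length,
so a short record with a single match can outrank a fuller record with
several matches.

```python
# Hypothetical illustration only: not Evergreen's ranking implementation.

def normalized_score(raw_score, doc_length, divide_by_length=True):
    """Return the raw score, optionally divided by document length in words."""
    if divide_by_length:
        return raw_score / doc_length
    return raw_score


def rank(records, divide_by_length):
    """Return record labels ordered from highest to lowest score."""
    ordered = sorted(
        records,
        key=lambda r: normalized_score(
            r["raw_score"], r["doc_length"], divide_by_length
        ),
        reverse=True,
    )
    return [r["label"] for r in ordered]


# Made-up records: a brief bib record with one match, a full one with three.
records = [
    {"label": "brief record", "raw_score": 1.0, "doc_length": 5},
    {"label": "full record", "raw_score": 3.0, "doc_length": 60},
]

with_length_modifier = rank(records, divide_by_length=True)
without_length_modifier = rank(records, divide_by_length=False)
```

With the length modifier the brief record sorts first (1.0 / 5 is larger than
3.0 / 60); without it the full record sorts first.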

In our local discussions, we've thought the following enhancements could
improve the ranking of search results:

* Giving greater weight to a record if the search terms appear in the title
or subject (ideally, we would like these fields to be configurable). This is
something that is tweakable in search.relevance_ranking, but my
understanding is that the use of these tweaks results in a major reduction
in search performance.

* Using some type of popularity metric to boost relevancy for popular
titles. I'm not sure what this metric should be (number of copies attached
to record? Total circs in last x months? Total current circs?), but we
believe some type of popularity measure would be particularly helpful in a
public library where searches will often be for titles that are popular. For
example, a search for "twilight" will most likely be for the Stephenie
Meyer novel and not this
http://books.google.com/books/about/Twilight.html?id=zEhkpXCyGzIC. Mike
Rylander had indicated in a previous e-mail
(http://markmail.org/message/h6u5r3sy4nr36wsl) that we might be able to
handle this through an overnight cron job without a negative impact on
search speeds.
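
The two ideas above could be combined roughly as in the sketch below. This is
only a sketch under stated assumptions, not a proposal for how Evergreen
should implement it: the field weights, the use of circulation counts as the
popularity metric, the log dampening, and every function name are
hypothetical. The point is that the popularity table is built ahead of time
(for example by an overnight job), so search time only pays for a lookup and
a multiplication.

```python
import math

# Hypothetical per-field weights; in practice these would be configurable.
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "keyword": 1.0}


def build_popularity_table(circ_counts):
    """Precompute a boost multiplier (>= 1.0) per record id.

    Intended to run offline, e.g. nightly. log1p dampens the effect so a
    very popular title does not swamp the text match entirely.
    """
    return {rid: 1.0 + math.log1p(count) for rid, count in circ_counts.items()}


def score(record_id, field_matches, popularity):
    """Combine weighted field matches with the precomputed popularity boost.

    field_matches maps a field name to the number of matching search terms.
    Records missing from the popularity table get a neutral boost of 1.0.
    """
    base = sum(
        FIELD_WEIGHTS.get(field, 1.0) * hits
        for field, hits in field_matches.items()
    )
    return base * popularity.get(record_id, 1.0)


# Made-up data: record 1 circulated 500 times recently, record 2 never.
popularity = build_popularity_table({1: 500, 2: 0})

popular_title = score(1, {"title": 1}, popularity)
obscure_title = score(2, {"title": 1}, popularity)
keyword_only = score(3, {"keyword": 1}, popularity)
```

Here both records match the search term once in the title, but the record
with circulations gets the higher score, and a title match outranks a
keyword-only match on an unboosted record.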

Do others think these two enhancements would improve the search results in
Evergreen? Do you think there are other things we could do to improve
relevancy? My main concern would be that any changes might slow down search
speeds, and I would want to make sure that we could do something to retrieve
better search results without a slowdown.

Also, I was wondering if this type of project might be a good candidate for
a Google Summer of Code project.

I look forward to hearing your feedback!

Kathy

-------------------------------------------------------------
Kathy Lussier
Project Coordinator
Massachusetts Library Network Cooperative
(508) 756-0172
(508) 755-3721 (fax)
klussier at masslnc.org
IM: kmlussier (AOL & Yahoo)
Twitter: http://www.twitter.com/kmlussier
 
 



