[OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen

Mike Rylander mrylander at gmail.com
Wed Mar 7 10:11:23 EST 2012


On Wed, Mar 7, 2012 at 8:35 AM, Hardy, Elaine
<ehardy at georgialibraries.org> wrote:
> Kathy,
>
> While the relevance display is much improved in 2.x, it would be good to
> have greater relevance given, in a keyword search, to title (specifically
> the 245) and then subject fields. I also see where having a popularity
> ranking might be beneficial.
>
> I just had to explain to a board member of one of our libraries why his
> search for John Sandford turned up children's titles first. So having MARC
> field 100s ranked higher than 700 in author searches would be beneficial
> as well.
>

To be clear, weighting hits that come from different index definitions
has always been possible.  2.2 will have a staff client interface to
make it easier, but the capability has been there all along.
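
As a rough illustration of the idea (a sketch only, not Evergreen's actual
code or configuration), weighting hits by the index definition they came
from amounts to multiplying each hit's raw rank by a per-definition weight
before summing. The definition names, weights, and rank values below are
all made up for the example:

```python
# Hypothetical per-index-definition weights; names and values are
# illustrative assumptions, not Evergreen defaults.
FIELD_WEIGHTS = {"title_proper": 3.0, "subject": 2.0, "keyword": 1.0}


def weighted_score(hits):
    """Combine raw ranks for one record.

    `hits` maps an index-definition name to the raw rank of the hit
    found there. Definitions without a configured weight count as 1.0.
    """
    return sum(FIELD_WEIGHTS.get(name, 1.0) * rank for name, rank in hits.items())


# Two made-up records with the same total raw rank (0.5), but rec_b has
# part of its match in a more heavily weighted definition.
records = {
    "rec_a": {"keyword": 0.5},
    "rec_b": {"title_proper": 0.25, "keyword": 0.25},
}

# Sort record ids from highest to lowest weighted score.
ranked = sorted(records, key=lambda r: weighted_score(records[r]), reverse=True)
print(ranked)
```

With these numbers, rec_b sorts ahead of rec_a because a quarter of its raw
rank is multiplied by the title weight.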

Weighting different parts of one indexed term -- say, weighting the
title embedded in the keyword blob higher than the subjects embedded
in the same blob -- would require the above-mentioned "make use of
tsearch class weighting".  But one can approximate that today by
duplicating the index definitions from, say, title, author and subject
classes within the keyword class.
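
To make the approximation concrete, here is a small sketch under stated
assumptions: the keyword class holds the usual blob plus duplicated title,
author and subject definitions, each with its own weight, and a record's
keyword score is the sum of the weights of the definitions that match. The
definition names, weights, matching rule and sample records are
hypothetical and far simpler than real full-text search:

```python
# Hypothetical keyword-class layout: (definition name, record parts it
# indexes, weight). The blob covers everything; the others are the
# duplicated title/author/subject definitions.
KEYWORD_CLASS = [
    ("keyword_blob", ("title", "author", "subject", "notes"), 1.0),
    ("keyword_title", ("title",), 4.0),
    ("keyword_author", ("author",), 2.0),
    ("keyword_subject", ("subject",), 2.0),
]


def keyword_score(record, term):
    """Sum the weights of every keyword-class definition containing `term`.

    Matching is a naive case-insensitive whole-word test, standing in for
    a real full-text match.
    """
    term = term.lower()
    score = 0.0
    for _name, parts, weight in KEYWORD_CLASS:
        words = " ".join(record.get(p, "") for p in parts).lower().split()
        if term in words:
            score += weight
    return score


# Made-up records: one matches in the title, the other only in a subject.
title_hit = {"title": "Twilight", "author": "Meyer", "subject": "Vampires"}
subject_hit = {"title": "Evening walks", "subject": "Twilight photography"}

print(keyword_score(title_hit, "twilight"))    # blob + title definitions
print(keyword_score(subject_hit, "twilight"))  # blob + subject definitions
```

Both records match the keyword blob, but the one whose match also lands in
the duplicated title definition scores higher, which is the effect the
duplication is meant to approximate.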

-- 
Mike Rylander
 | Director of Research and Development
 | Equinox Software, Inc. / Your Library's Guide to Open Source
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker at esilibrary.com
 | web:  http://www.esilibrary.com


> I can't comment on any of the coding possibilities other than to say that
> whichever way doesn't negatively impact search return time is preferable.
>
> Elaine
>
>
> J. Elaine Hardy
> PINES Bibliographic Projects and Metadata Manager
> Georgia Public Library Service,
> A Unit of the University System of Georgia
> 1800 Century Place, Suite 150
> Atlanta, Ga. 30345-4304
> 404.235-7128
> 404.235-7201, fax
>
> ehardy at georgialibraries.org
> www.georgialibraries.org
> http://www.georgialibraries.org/pines/
>
>
> -----Original Message-----
> From: open-ils-general-bounces at list.georgialibraries.org
> [mailto:open-ils-general-bounces at list.georgialibraries.org] On Behalf Of
> Kathy Lussier
> Sent: Tuesday, March 06, 2012 4:43 PM
> To: 'Evergreen Discussion Group'
> Subject: [OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen
>
> Hi all,
>
> I mentioned this during an e-mail discussion on the list last month, but I
> just wanted to hear from others in the Evergreen community about whether
> there is a desire to improve the relevance ranking for search results in
> Evergreen. Currently, we can tweak relevancy in the opensrf.xml, and it
> can look at things like the document length, word proximity, and unique
> word count. We've found that we had to remove the modifiers for document
> length and unique word count to prevent a problem where brief bib records
> were ranked way too high in our search results.
>
> In our local discussions, we've thought the following enhancements could
> improve the ranking of search results:
>
> * Giving greater weight to a record if the search terms appear in the
> title or subject (ideally, we would like these fields to be configurable.)
> This is something that is tweakable in search.relevance_ranking, but my
> understanding is that the use of these tweaks results in a major reduction
> in search performance.
>
> * Using some type of popularity metric to boost relevancy for popular
> titles. I'm not sure what this metric should be (number of copies attached
> to record? Total circs in last x months? Total current circs?), but we
> believe some type of popularity measure would be particularly helpful in a
> public library where searches will often be for titles that are popular.
> For example, a search for "twilight" will most likely be for the Stephenie
> Meyer novel and not this
> http://books.google.com/books/about/Twilight.html?id=zEhkpXCyGzIC. Mike
> Rylander had indicated in a previous e-mail
> (http://markmail.org/message/h6u5r3sy4nr36wsl) that we might be able to
> handle this through an overnight cron job without a negative impact on
> search speeds.
>
> Do others think these two enhancements would improve the search results in
> Evergreen? Do you think there are other things we could do to improve
> relevancy? My main concern would be that any changes might slow down
> search speeds, and I would want to make sure that we could do something to
> retrieve better search results without a slowdown.
>
> Also, I was wondering if this type of project might be a good candidate
> for a Google Summer of Code project.
>
> I look forward to hearing your feedback!
>
> Kathy
>
> -------------------------------------------------------------
> Kathy Lussier
> Project Coordinator
> Massachusetts Library Network Cooperative
> (508) 756-0172
> (508) 755-3721 (fax)
> klussier at masslnc.org
> IM: kmlussier (AOL & Yahoo)
> Twitter: http://www.twitter.com/kmlussier
>
>
>
>

