[OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen

Mike Rylander mrylander at gmail.com
Wed Mar 7 20:29:19 EST 2012


On Wed, Mar 7, 2012 at 2:57 PM, Kathy Lussier <klussier at masslnc.org> wrote:
> Hi Mike,
>
>>>To be clear, weighting hits that come from different index definitions
>>>has always been possible.  2.2 will have a staff client interface to
>>>make it easier, but the capability has been there all along.
>
> Is this staff client interface already available in master? If so, can you
> give me a little more information on how this is done?

It is.  Go to Admin -> Server Administration -> MARC Search/Facet
Fields and see the Weight field.  The higher the number, the more
"important" the field.
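
As a rough illustration of the idea only (this is not Evergreen's actual
ranking code), here is a minimal Python sketch of how per-field weights
can change result order.  The field names, weights, scores and the
rank() helper are all hypothetical, made up for the example; the weights
simply stand in for the values you would enter in the Weight field.

```python
from collections import defaultdict

# Hypothetical weights, standing in for the Weight values set per field
# in the admin interface. They are not Evergreen defaults.
FIELD_WEIGHTS = {"title": 10, "subject": 5, "keyword": 1}


def rank(hits, weights=FIELD_WEIGHTS):
    """Order record ids by the weighted sum of their per-field hit scores.

    hits is an iterable of (record_id, field_name, base_score) tuples.
    Fields with no configured weight fall back to a weight of 1.
    """
    totals = defaultdict(float)
    for record_id, field, base_score in hits:
        totals[record_id] += base_score * weights.get(field, 1)
    # Highest total first.
    return sorted(totals, key=lambda record_id: totals[record_id], reverse=True)


# Made-up hits: rec_a matches strongly in keyword only, rec_b matches
# weakly in both title and keyword.
hits = [
    ("rec_a", "keyword", 0.9),
    ("rec_b", "title", 0.2),
    ("rec_b", "keyword", 0.1),
]

ranked = rank(hits)
```

With the weights above, the weak title hit lifts rec_b over rec_a; with
every weight set to 1, rec_a comes first.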

-- 
Mike Rylander
 | Director of Research and Development
 | Equinox Software, Inc. / Your Library's Guide to Open Source
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker at esilibrary.com
 | web:  http://www.esilibrary.com


>
> Thanks!
> Kathy
>
>
>
>>>-----Original Message-----
>>>From: open-ils-general-bounces at list.georgialibraries.org [mailto:open-
>>>ils-general-bounces at list.georgialibraries.org] On Behalf Of Mike
>>>Rylander
>>>Sent: Wednesday, March 07, 2012 10:11 AM
>>>To: Evergreen Discussion Group
>>>Subject: Re: [OPEN-ILS-GENERAL] Improving relevance ranking in
>>>Evergreen
>>>
>>>On Wed, Mar 7, 2012 at 8:35 AM, Hardy, Elaine
>>><ehardy at georgialibraries.org> wrote:
>>>> Kathy,
>>>>
>>>> While the relevance display is much improved in 2.x, it would be good
>>>to
>>>> have greater relevance given, in a keyword search, to title
>>>(specifically
>>>> the 245) and then subject fields. I also see where having a popularity
>>>> ranking might be beneficial.
>>>>
>>>> I just had to explain to a board member of one of our libraries why
>>>his
>>>> search for John Sandford turned up children's titles first. So having
>>>MARC
>>>> field 100s ranked higher than 700 in author searches would be
>>>beneficial
>>>> as well.
>>>>
>>>
>>>To be clear, weighting hits that come from different index definitions
>>>has always been possible.  2.2 will have a staff client interface to
>>>make it easier, but the capability has been there all along.
>>>
>>>Weighting different parts of one indexed term -- say, weighting the
>>>title embedded in the keyword blob higher than the subjects embedded
>>>in the same blob -- would require the above-mentioned "make use of
>>>tsearch class weighting".  But one can approximate that today by
>>>duplicating the index definitions from, say, title, author and subject
>>>classes within the keyword class.
>>>
>>>--
>>>Mike Rylander
>>> | Director of Research and Development
>>> | Equinox Software, Inc. / Your Library's Guide to Open Source
>>> | phone:  1-877-OPEN-ILS (673-6457)
>>> | email:  miker at esilibrary.com
>>> | web:  http://www.esilibrary.com
>>>
>>>
>>>> I can't comment on any of the coding possibilities other than to say that
>>>> whichever way doesn't negatively impact search return time is preferable.
>>>>
>>>> Elaine
>>>>
>>>>
>>>> J. Elaine Hardy
>>>> PINES Bibliographic Projects and Metadata Manager
>>>> Georgia Public Library Service,
>>>> A Unit of the University System of Georgia
>>>> 1800 Century Place, Suite 150
>>>> Atlanta, Ga. 30345-4304
>>>> 404.235-7128
>>>> 404.235-7201, fax
>>>>
>>>> ehardy at georgialibraries.org
>>>> www.georgialibraries.org
>>>> http://www.georgialibraries.org/pines/
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: open-ils-general-bounces at list.georgialibraries.org
>>>> [mailto:open-ils-general-bounces at list.georgialibraries.org] On Behalf
>>>Of
>>>> Kathy Lussier
>>>> Sent: Tuesday, March 06, 2012 4:43 PM
>>>> To: 'Evergreen Discussion Group'
>>>> Subject: [OPEN-ILS-GENERAL] Improving relevance ranking in Evergreen
>>>>
>>>> Hi all,
>>>>
>>>> I mentioned this during an e-mail discussion on the list last month,
>>>but I
>>>> just wanted to hear from others in the Evergreen community about
>>>whether
>>>> there is a desire to improve the relevance ranking for search results
>>>in
>>>> Evergreen. Currently, we can tweak relevancy in the opensrf.xml, and
>>>it
>>>> can look at things like the document length, word proximity, and
>>>unique
>>>> word count. We've found that we had to remove the modifiers for
>>>document
>>>> length and unique word count to prevent a problem where brief bib
>>>records
>>>> were ranked way too high in our search results.
>>>>
>>>> In our local discussions, we've thought the following enhancements
>>>could
>>>> improve the ranking of search results:
>>>>
>>>> * Giving greater weight to a record if the search terms appear in the
>>>> title or subject (ideally, we would like these fields to be
>>>configurable.)
>>>> This is something that is tweakable in search.relevance_ranking, but
>>>my
>>>> understanding is that the use of these tweaks results in a major
>>>reduction
>>>> in search performance.
>>>>
>>>> * Using some type of popularity metric to boost relevancy for popular
>>>> titles. I'm not sure what this metric should be (number of copies
>>>attached
>>>> to record? Total circs in last x months? Total current circs?), but
>>>we
>>>> believe some type of popularity measure would be particularly helpful
>>>in a
>>>> public library where searches will often be for titles that are
>>>popular.
>>>> For example, a search for "twilight" will most likely be for the
>>>Stephenie
>>>> Meyer novel and not this
>>>> http://books.google.com/books/about/Twilight.html?id=zEhkpXCyGzIC.
>>>Mike
>>>> Rylander had indicated in a previous e-mail
>>>> (http://markmail.org/message/h6u5r3sy4nr36wsl) that we might be able
>>>to
>>>> handle this through an overnight cron job without a negative impact
>>>on
>>>> search speeds.
>>>>
>>>> Do others think these two enhancements would improve the search
>>>results in
>>>> Evergreen? Do you think there are other things we could do to improve
>>>> relevancy? My main concern would be that any changes might slow down
>>>> search speeds, and I would want to make sure that we could do
>>>something to
>>>> retrieve better search results without a slowdown.
>>>>
>>>> Also, I was wondering if this type of project might be a good
>>>candidate
>>>> for a Google Summer of Code project.
>>>>
>>>> I look forward to hearing your feedback!
>>>>
>>>> Kathy
>>>>
>>>> -------------------------------------------------------------
>>>> Kathy Lussier
>>>> Project Coordinator
>>>> Massachusetts Library Network Cooperative
>>>> (508) 756-0172
>>>> (508) 755-3721 (fax)
>>>> klussier at masslnc.org
>>>> IM: kmlussier (AOL & Yahoo)
>>>> Twitter: http://www.twitter.com/kmlussier
>>>>
>>>>
>>>>
>>>>
>

