[OPEN-ILS-GENERAL] question about relevance - mult eds of same book not in most recent order

Joe Atzberger atz at esilibrary.com
Fri Sep 4 15:18:31 EDT 2009


Karen Schneider wrote:
>
> Indexed-field weighting, which controls relevance ranking in 
> Evergreen, is configured in the database (no UI available yet) on the 
> table called config.metabib_field, using the ‘weight’ column.
>
> (The other four columns are field_class, name, xpath, and format; the 
> table is too wide to display in this email, but here is one line: 
> author    | conference  | 
> //mods32:mods/mods32:name[@type='conference']/mods32:namePart[../mods32:role/mods32:roleTerm[text()='creator']] 
> | mods32 |      1 )
>
> The default value for index-field weights is “1.” Adjust the weighting 
> of indexed fields to give those fields a boost in searching. The 
> larger the value for ‘weight’, the higher the relevance score for 
> matches on that indexed field.
>
> For example, by increasing the weight of the title-proper field, a 
> search for *jaguar* would give higher relevance to the book titled 
> /Aimee and Jaguar/ than to a record with the term *jaguar* in another 
> indexed field.
>
> You can also add generic matchpoint bonuses for the following types:
>
> *first_word*: boosts relevance if the query is one term long and 
> matches the first term in the indexed field (search for *twain*, get a 
> bonus for *twain, mark* but not *mark twain*)
>
> *word_order*: increases relevance when the matched words appear in the 
> same order as the search terms, so that the search *legend suicide* 
> would rank the book *Legend of a Suicide* higher than the book 
> *Suicide Legend*
>
> *full_match*: boosts relevance when the full query exactly matches 
> the entire indexed field (after space, case and diacritic 
> normalization on both). So for a title search for *The Future of 
> Ice*, the title *The Future of Ice* would get a relevance boost above 
> *Ice Ages of the Future*.
>
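
As a rough illustration of the three matchpoint types quoted above, here is a small Python sketch. It is not Evergreen's code: the function names, the tokenizer, and the punctuation stripping are all assumptions made for the example, and the real checks happen inside the database.

```python
import re
import unicodedata


def tokens(text):
    """Split text into lowercase words with diacritics and punctuation removed.

    Assumption: this only approximates the space, case and diacritic
    normalization mentioned in the quoted message.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = re.sub(r"[^\w\s]", " ", no_marks)
    return no_punct.lower().split()


def is_first_word(query, field):
    """True if the query is a single term equal to the field's first term."""
    q, f = tokens(query), tokens(field)
    return len(q) == 1 and bool(f) and f[0] == q[0]


def is_word_order(query, field):
    """True if every query term occurs in the field, in the same order."""
    q = tokens(query)
    remaining = iter(tokens(field))
    # "term in remaining" consumes the iterator, so later terms can only
    # match later positions in the field.
    return bool(q) and all(term in remaining for term in q)


def is_full_match(query, field):
    """True if the normalized query equals the entire normalized field."""
    return tokens(query) == tokens(field)
```

With these definitions, `is_first_word("twain", "Twain, Mark")` holds while `is_first_word("twain", "Mark Twain")` does not, which mirrors the examples in the quoted text.
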
> The matchpoint bonuses are configured on a table called 
> search.relevance_adjustment, using the ‘multiplier’ column.  This is a 
> floating-point value by which the relevance score is multiplied at 
> the end.  So, if the first-word bonus is 1.2, then the 
> relevance score gets a 20% bonus (x * 1.2).
>
> The search.relevance_adjustment weighting can be adjusted for each field.
>
> The search.relevance_adjustment table has three other columns: 
> field_class, name, and bump_type. Here are several lines from the 
> search.relevance_adjustment table:
>
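> field_class | name        | bump_type  | multiplier
> ------------+-------------+------------+-----------
> title       | translated  | word_order |         10
> title       | uniform     | first_word |        1.5
> title       | uniform     | full_match |         20
> title       | uniform     | word_order |         10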
>
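
To make the multiplier arithmetic concrete, a toy Python sketch follows. The four multipliers are the rows quoted above; the function name `bumped_score`, the base score used in the usage note, and the idea that several bonuses on one field compound by multiplication are assumptions for illustration, not a description of Evergreen's internals.

```python
# Rows copied from the quoted search.relevance_adjustment sample:
# (field_class, name, bump_type) -> multiplier
ADJUSTMENTS = {
    ("title", "translated", "word_order"): 10.0,
    ("title", "uniform", "first_word"): 1.5,
    ("title", "uniform", "full_match"): 20.0,
    ("title", "uniform", "word_order"): 10.0,
}


def bumped_score(base_score, field_class, name, matched_bumps):
    """Multiply base_score by the multiplier of every bump type that matched.

    Assumptions: bonuses compound by multiplication, and a bump type with
    no configured row leaves the score unchanged (multiplier 1.0).
    """
    score = base_score
    for bump_type in matched_bumps:
        score *= ADJUSTMENTS.get((field_class, name, bump_type), 1.0)
    return score
```

For instance, a hypothetical base score of 10.0 on title|uniform with only the first_word bonus matching comes out as 10.0 * 1.5 = 15.0.
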
> Does that help? If so, I'll put this on the DocBook docket.
>
> Big ol' thanks to Mike Rylander for helping me with this answer!
>
> -- 
> | Karen G. Schneider
> | Community Librarian
> | Equinox Software Inc. "The Evergreen Experts"
> | Toll-free: 1.877.Open.ILS (1.877.673.6457) x712
> | kgs at esilibrary.com
> | Web: http://www.esilibrary.com
>
So this doesn't directly address the core complaint, which is that the 
different editions of the *same* title do not show up in a predictable 
order.  Is the suggestion that there may be a weighting that would 
impose a sensible order (i.e., newest first)?  Keep in mind that the 
user will not have searched for "5th ed." or "2009", so whatever xpath 
targets that info would need to *influence* but *not match*.  For 
that reason, it seems to me that matching rules will not accommodate the 
intended behavior. 
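
To show what I mean by *influence* but *not match*, here is a toy Python sketch. Nothing in it is Evergreen code: the `Hit` record, its fields, and the sample editions are hypothetical. The publication year never takes part in matching; it only breaks ties between hits that scored the same relevance.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    pub_year: int      # hypothetical field; not something the user searched on
    relevance: float   # score produced by the matching rules


def order_hits(hits):
    """Sort by relevance first, then newest edition first among ties."""
    return sorted(hits, key=lambda h: (-h.relevance, -h.pub_year))


# Made-up editions of one title that all match the query equally well.
editions = [
    Hit("Example Handbook", 2001, 4.0),
    Hit("Example Handbook", 2009, 4.0),
    Hit("Example Handbook", 2005, 4.0),
]
newest_first = [h.pub_year for h in order_hits(editions)]
```

The weights and multipliers discussed above only change `relevance`, so on their own they cannot produce this ordering for records that match identically.
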

Bring on the FRBR.

--joe atzberger

