[OPEN-ILS-GENERAL] question about relevance - mult eds of same book not in most recent order
Joe Atzberger
atz at esilibrary.com
Fri Sep 4 15:18:31 EDT 2009
Karen Schneider wrote:
>
> Indexed-field weighting, which controls relevance ranking in
> Evergreen, is configured in the database (no UI available yet) on the
> table called config.metabib_field, using the ‘weight’ column.
>
> (The other four columns are field_class, name, xpath, and format; the
> table is too wide to display in this email, but here is one line:
> author | conference |
> //mods32:mods/mods32:name[@type='conference']/mods32:namePart[../mods32:role/mods32:roleTerm[text()='creator']]
> | mods32 | 1 )
>
> The default value for index-field weights is “1.” Adjust the weighting
> of indexed fields to give those fields a boost in searching. The
> larger the value for ‘weight’, the higher the relevance score for
> matches on that indexed field.
>
> For example, by increasing the weight of the title-proper field, a
> search for *jaguar* would give higher relevance to the book titled
> /Aimee and Jaguar/ than to a record with the term *jaguar* in another
> indexed field.
>
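A toy illustration of the weight idea, in Python. This is not Evergreen's actual scoring, which happens inside the database. It assumes, purely for illustration, that a record's score is the sum of the weights of the indexed fields containing the search term. The field labels, records, and the boost value of 5 are all made up.

```python
# Illustrative only: assumes score = sum of weights of the matching fields.

def score(record, term, weights):
    """Sum the weight of every indexed field whose text contains `term`."""
    term = term.lower()
    total = 0.0
    for field, text in record.items():
        if term in text.lower().split():
            total += weights.get(field, 1.0)  # default weight is 1
    return total

# Hypothetical records keyed by a made-up "class|name" field label.
titled = {"title|proper": "Aimee and Jaguar", "subject|topic": "World War"}
other = {"title|proper": "Big Cats", "subject|topic": "Jaguar"}

default_weights = {}                     # every field weighs 1
boosted_weights = {"title|proper": 5.0}  # hypothetical boost for title-proper

default_scores = (score(titled, "jaguar", default_weights),
                  score(other, "jaguar", default_weights))
boosted_scores = (score(titled, "jaguar", boosted_weights),
                  score(other, "jaguar", boosted_weights))
print(default_scores)  # (1.0, 1.0)
print(boosted_scores)  # (5.0, 1.0)
```

With default weights the two records tie. Once the title weight is raised, the record with the term in its title pulls ahead.
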
> You can also add generic matchpoint bonuses for the following types:
>
> *first_word*: boosts relevance if the query is one term long and
> matches the first term in the indexed field (search for *twain*, get a
> bonus for *twain, mark* but not *mark twain*)
>
> *word_order*: increases relevance when the matched words appear in the
> same order as the search terms, so that the search *legend suicide*
> would rank the book *Legend of a Suicide* higher than the book
> *Suicide Legend*
>
> *full_match*: boosts relevance when the full query
> exactly matches the entire indexed field (after space, case and
> diacritic normalization on both). So a title search for *The Future of
> Ice* would boost the title *The Future of Ice* above *Ice Ages of the
> Future*.
>
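The three bump types described above can be read as simple tests on a query and an indexed field. Here is a rough Python sketch of those tests as I understand them from the descriptions. It is not Evergreen's implementation. The function names just mirror the bump types, and stripping punctuation during normalization is my own assumption, added so that *twain, mark* starts with the term *twain*.

```python
import unicodedata

def normalize(text):
    """Fold case, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Assumption: punctuation is treated as whitespace.
    cleaned = "".join(c if c.isalnum() or c.isspace() else " "
                      for c in no_marks.casefold())
    return " ".join(cleaned.split())

def first_word(query, field):
    """True if the query is a single term equal to the field's first term."""
    q, f = normalize(query).split(), normalize(field).split()
    return len(q) == 1 and bool(f) and f[0] == q[0]

def word_order(query, field):
    """True if the query terms occur in the field in the same order."""
    remaining = iter(normalize(field).split())
    # `term in remaining` consumes the iterator, so each term must be
    # found after the previous one.
    return all(term in remaining for term in normalize(query).split())

def full_match(query, field):
    """True if the normalized query equals the whole normalized field."""
    return normalize(query) == normalize(field)
```

For example, `first_word("twain", "Twain, Mark")` is true while `first_word("twain", "Mark Twain")` is false, and `word_order("legend suicide", "Suicide Legend")` is false.
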
> The matchpoint bonuses are configured on a table called
> search.relevance_adjustment, using the ‘multiplier’ column. That is a
> floating-point value by which the relevance score is multiplied at
> the end. So, if the first-word bonus is 1.2, then the
> relevance score gets a 20% bonus (x * 1.2).
>
> The search.relevance_adjustment multipliers can be set separately for
> each indexed field.
>
> The search.relevance_adjustment table has three other columns:
> field_class, name, and bump_type. Here are several lines from the
> search.relevance_adjustment table:
>
> field_class | name       | bump_type  | multiplier
> title       | translated | word_order | 10
> title       | uniform    | first_word | 1.5
> title       | uniform    | full_match | 20
> title       | uniform    | word_order | 10
>
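To make the multiplier arithmetic concrete, here is a small Python sketch that looks up the sample rows above and applies them to a base score. The base score of 2.0 and the helper name `adjusted_score` are hypothetical. Compounding the multipliers when several bumps apply is my assumption, since the description only says the score is multiplied at the end.

```python
# Sample rows from the table: (field_class, name, bump_type) -> multiplier
ADJUSTMENTS = {
    ("title", "translated", "word_order"): 10.0,
    ("title", "uniform", "first_word"): 1.5,
    ("title", "uniform", "full_match"): 20.0,
    ("title", "uniform", "word_order"): 10.0,
}

def adjusted_score(base, field_class, name, bumps):
    """Multiply `base` by the multiplier of every bump type that applies.

    Assumption: when several bumps apply, their multipliers compound.
    A field with no configured row is left unchanged (multiplier 1).
    """
    score = base
    for bump in bumps:
        score *= ADJUSTMENTS.get((field_class, name, bump), 1.0)
    return score

print(adjusted_score(2.0, "title", "uniform", ["first_word"]))                # 3.0
print(adjusted_score(2.0, "title", "uniform", ["first_word", "full_match"]))  # 60.0
print(adjusted_score(2.0, "author", "personal", ["first_word"]))              # 2.0
```
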
> Does that help? If so, I'll put this on the DocBook docket.
>
> Big ol' thanks to Mike Rylander for helping me with this answer!
>
> --
> | Karen G. Schneider
> | Community Librarian
> | Equinox Software Inc. "The Evergreen Experts"
> | Toll-free: 1.877.Open.ILS (1.877.673.6457) x712
> | kgs at esilibrary.com
> | Web: http://www.esilibrary.com
>
This doesn't directly address the core complaint, which is that
different editions of the *same* title do not show up in a predictable
order. Is the suggestion that some weighting would impose a sensible
order (i.e., newest first)? Keep in mind that the user will not have
searched for "5th ed." or "2009", so whatever xpath targets that info
would need to *influence* the ranking but *not match* the query. For
that reason, it seems to me that matching rules cannot produce the
intended behavior.
Bring on the FRBR.
--joe atzberger