[open-ils-commits] r670 - in evergreen-ils.org/docs: . 1.6/book1/sysadmin (kgs)
svn at svn.open-ils.org
svn at svn.open-ils.org
Wed Sep 9 13:21:48 EDT 2009
Author: kgs
Date: 2009-09-09 13:21:44 -0400 (Wed, 09 Sep 2009)
New Revision: 670
Added:
evergreen-ils.org/docs/1.6/book1/sysadmin/indexedfieldweighting.xml
Removed:
evergreen-ils.org/docs/indexedfieldweighting.xml
Log:
Moving to sysadmin folder.
Copied: evergreen-ils.org/docs/1.6/book1/sysadmin/indexedfieldweighting.xml (from rev 669, evergreen-ils.org/docs/indexedfieldweighting.xml)
===================================================================
--- evergreen-ils.org/docs/1.6/book1/sysadmin/indexedfieldweighting.xml (rev 0)
+++ evergreen-ils.org/docs/1.6/book1/sysadmin/indexedfieldweighting.xml 2009-09-09 17:21:44 UTC (rev 670)
@@ -0,0 +1,233 @@
+<?xml version='1.0' encoding='UTF-8'?>
+<section xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude"
+ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:id="indexedfieldweighting">
+ <title>Indexed-Field and Matchpoint Weighting</title>
+ <info>
+ <abstract>
+ <para>This chapter describes indexed field weighting and matchpoint weighting, which
+ control relevance ranking in Evergreen catalog search results.</para>
+ <para>
+ <tip>
+ <para>In tuning search relevance, it is good practice to make incremental
+ adjustments, capture search logs, and assess results before making further
+ adjustments. </para>
+ </tip>
+ </para>
+ </abstract>
+ </info>
+ <section>
+ <title>Indexed-field Weighting</title>
+ <para>Indexed-field weighting is configured in the Evergreen database in the weight column
+ of the config.metabib_field table, which follows the other four columns in this table:
+ field_class, name, xpath, and format. </para>
+ <para>The following is one representative line from the config.metabib_field table:</para>
+ <para> author | conference |
+ //mods32:mods/mods32:name[@type='conference']/mods32:namePart[../mods32:role/mods32:roleTerm[text()='creator']]
+ | mods32 | 1 ) </para>
+ <para>The default value for index-field weights in config.metabib_field is 1. Adjust the
+ weighting of indexed fields to boost or lower the relevance score for matches on that
+ indexed field. The weight value may be increased or decreased by whole integers. </para>
+ <para>For example, by increasing the weight of the title-proper field from 1 to 2, a search
+ for <emphasis role="bold">jaguar</emphasis> would double the relevance for the book
+ titled <emphasis role="italic">Aimee and Jaguar</emphasis> than for a record with the
+ term <emphasis role="bold">jaguar</emphasis> in another indexed field. </para>
+ </section>
+ <section>
+ <title>Matchpoint Weighting</title>
+ <para> Matchpoint weighting provides another way to fine-tune Evergreen relevance ranking,
+ and is configured through floating-point multipliers in the multiplier column of the
+ search.relevance_adjustment table.</para>
+ <para> Weighting can be adjusted for one, more, or all multiplier fields in
+ search.relevance_adjustment. </para>
+ <para>You can adjust the following three matchpoints:</para>
+ <itemizedlist>
+ <listitem>
+ <para><indexterm>
+ <primary>first_word</primary>
+ </indexterm> boosts relevance if the query is one term long and matches the
+ first term in the indexed field (search for <emphasis role="bold"
+ >twain</emphasis>, get a bonus for <emphasis role="bold">twain,
+ mark</emphasis> but not <emphasis role="bold">mark twain</emphasis>)</para>
+ </listitem>
+ <listitem>
+ <para><indexterm>
+ <primary>word_order</primary>
+ </indexterm> increases relevance for words matching the order of search terms,
+ so that the results for the search <emphasis role="bold">legend
+ suicide</emphasis> would match higher for the book <emphasis role="italic"
+ >Legend of a Suicide</emphasis> than for the book, <emphasis role="italic"
+ >Suicide Legend</emphasis></para>
+ </listitem>
+ <listitem>
+ <para><indexterm>
+ <primary>full_match</primary>
+ </indexterm> boosts relevance when the full query exactly matches the entire
+ indexed field (after space, case, and diacritics are normalized). So a title
+ search for <emphasis role="italic">The Future of Ice</emphasis> would get a
+ relevance boost above <emphasis role="italic">Ice Ages of the
+ Future</emphasis>.</para>
+ </listitem>
+ </itemizedlist>
+ <para> Here are the default settings of the search.relevance_adjustment table: </para>
+ <table xml:id="search.relevance">
+ <title>search.relevance_adjustment table</title>
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>field_class</entry>
+ <entry>name</entry>
+ <entry>bump_type</entry>
+ <entry>multiplier</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>author</entry>
+ <entry>conference</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>author</entry>
+ <entry>corporate</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>author </entry>
+ <entry>other </entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>author</entry>
+ <entry>personal</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>keyword</entry>
+ <entry>keyword</entry>
+ <entry>word_order</entry>
+ <entry>10</entry>
+ </row>
+ <row>
+ <entry>series</entry>
+ <entry>seriestitle</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>series</entry>
+ <entry>seriestitle</entry>
+ <entry>full_match</entry>
+ <entry>20</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>abbreviated</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>abbreviated</entry>
+ <entry>full_match</entry>
+ <entry>20</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>abbreviated</entry>
+ <entry>word_order</entry>
+ <entry>10</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>alternative</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>alternative</entry>
+ <entry>full_match</entry>
+ <entry>20</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>alternative</entry>
+ <entry>word_order</entry>
+ <entry>10</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>proper</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>proper</entry>
+ <entry>full_match</entry>
+ <entry>20</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>proper</entry>
+ <entry>word_order</entry>
+ <entry>10</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>translated</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>translated</entry>
+ <entry>full_match</entry>
+ <entry>20</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>translated</entry>
+ <entry>word_order</entry>
+ <entry>10</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>uniform</entry>
+ <entry>first_word</entry>
+ <entry>1.5</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>uniform</entry>
+ <entry>full_match</entry>
+ <entry>20</entry>
+ </row>
+ <row>
+ <entry>title</entry>
+ <entry>uniform</entry>
+ <entry>word_order</entry>
+ <entry>10</entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </section>
+ <section>
+ <title>Combining Index Weighting and Matchpoint Weighting</title>
+ <para>Index weighting and matchpoint weighting may be combined. The relevance boost of the
+ combined weighting is equal to the product of the two multiplied values. </para>
+ <para>If the relevance setting in the config.metabib_field were increased to 2, and the
+ multiplier set to 1.2 in the search.relevance_adjustment table, the resulting matchpoint
+ increase would be 240%. </para>
+ <note>
+ <para>In practice, these weights are applied serially -- first the index weight, then
+ all the matchpoint weights that apply -- because they are evaluated at different
+ stages of the search process.</para>
+ </note>
+ </section>
+</section>
Deleted: evergreen-ils.org/docs/indexedfieldweighting.xml
===================================================================
--- evergreen-ils.org/docs/indexedfieldweighting.xml 2009-09-09 17:18:38 UTC (rev 669)
+++ evergreen-ils.org/docs/indexedfieldweighting.xml 2009-09-09 17:21:44 UTC (rev 670)
@@ -1,233 +0,0 @@
-<?xml version='1.0' encoding='UTF-8'?>
-<section xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude"
- xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:id="indexedfieldweighting">
- <title>Indexed-Field and Matchpoint Weighting</title>
- <info>
- <abstract>
- <para>This chapter describes indexed field weighting and matchpoint weighting, which
- control relevance ranking in Evergreen catalog search results.</para>
- <para>
- <tip>
- <para>In tuning search relevance, it is good practice to make incremental
- adjustments, capture search logs, and assess results before making further
- adjustments. </para>
- </tip>
- </para>
- </abstract>
- </info>
- <section>
- <title>Indexed-field Weighting</title>
- <para>Indexed-field weighting is configured in the Evergreen database in the weight column
- of the config.metabib_field table, which follows the other four columns in this table:
- field_class, name, xpath, and format. </para>
- <para>The following is one representative line from the config.metabib_field table:</para>
- <para> author | conference |
- //mods32:mods/mods32:name[@type='conference']/mods32:namePart[../mods32:role/mods32:roleTerm[text()='creator']]
- | mods32 | 1 ) </para>
- <para>The default value for index-field weights in config.metabib_field is 1. Adjust the
- weighting of indexed fields to boost or lower the relevance score for matches on that
- indexed field. The weight value may be increased or decreased by whole integers. </para>
- <para>For example, by increasing the weight of the title-proper field from 1 to 2, a search
- for <emphasis role="bold">jaguar</emphasis> would double the relevance for the book
- titled <emphasis role="italic">Aimee and Jaguar</emphasis> than for a record with the
- term <emphasis role="bold">jaguar</emphasis> in another indexed field. </para>
- </section>
- <section>
- <title>Matchpoint Weighting</title>
- <para> Matchpoint weighting provides another way to fine-tune Evergreen relevance ranking,
- and is configured through floating-point multipliers in the multiplier column of the
- search.relevance_adjustment table.</para>
- <para> Weighting can be adjusted for one, more, or all multiplier fields in
- search.relevance_adjustment. </para>
- <para>You can adjust the following three matchpoints:</para>
- <itemizedlist>
- <listitem>
- <para><indexterm>
- <primary>first_word</primary>
- </indexterm> boosts relevance if the query is one term long and matches the
- first term in the indexed field (search for <emphasis role="bold"
- >twain</emphasis>, get a bonus for <emphasis role="bold">twain,
- mark</emphasis> but not <emphasis role="bold">mark twain</emphasis>)</para>
- </listitem>
- <listitem>
- <para><indexterm>
- <primary>word_order</primary>
- </indexterm> increases relevance for words matching the order of search terms,
- so that the results for the search <emphasis role="bold">legend
- suicide</emphasis> would match higher for the book <emphasis role="italic"
- >Legend of a Suicide</emphasis> than for the book, <emphasis role="italic"
- >Suicide Legend</emphasis></para>
- </listitem>
- <listitem>
- <para><indexterm>
- <primary>full_match</primary>
- </indexterm> boosts relevance when the full query exactly matches the entire
- indexed field (after space, case, and diacritics are normalized). So a title
- search for <emphasis role="italic">The Future of Ice</emphasis> would get a
- relevance boost above <emphasis role="italic">Ice Ages of the
- Future</emphasis>.</para>
- </listitem>
- </itemizedlist>
- <para> Here are the default settings of the search.relevance_adjustment table: </para>
- <table xml:id="search.relevance">
- <title>search.relevance_adjustment table</title>
- <tgroup cols="4">
- <thead>
- <row>
- <entry>field_class</entry>
- <entry>name</entry>
- <entry>bump_type</entry>
- <entry>multiplier</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>author</entry>
- <entry>conference</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>author</entry>
- <entry>corporate</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>author </entry>
- <entry>other </entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>author</entry>
- <entry>personal</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>keyword</entry>
- <entry>keyword</entry>
- <entry>word_order</entry>
- <entry>10</entry>
- </row>
- <row>
- <entry>series</entry>
- <entry>seriestitle</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>series</entry>
- <entry>seriestitle</entry>
- <entry>full_match</entry>
- <entry>20</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>abbreviated</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>abbreviated</entry>
- <entry>full_match</entry>
- <entry>20</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>abbreviated</entry>
- <entry>word_order</entry>
- <entry>10</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>alternative</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>alternative</entry>
- <entry>full_match</entry>
- <entry>20</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>alternative</entry>
- <entry>word_order</entry>
- <entry>10</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>proper</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>proper</entry>
- <entry>full_match</entry>
- <entry>20</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>proper</entry>
- <entry>word_order</entry>
- <entry>10</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>translated</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>translated</entry>
- <entry>full_match</entry>
- <entry>20</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>translated</entry>
- <entry>word_order</entry>
- <entry>10</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>uniform</entry>
- <entry>first_word</entry>
- <entry>1.5</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>uniform</entry>
- <entry>full_match</entry>
- <entry>20</entry>
- </row>
- <row>
- <entry>title</entry>
- <entry>uniform</entry>
- <entry>word_order</entry>
- <entry>10</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </section>
- <section>
- <title>Combining Index Weighting and Matchpoint Weighting</title>
- <para>Index weighting and matchpoint weighting may be combined. The relevance boost of the
- combined weighting is equal to the product of the two multiplied values. </para>
- <para>If the relevance setting in the config.metabib_field were increased to 2, and the
- multiplier set to 1.2 in the search.relevance_adjustment table, the resulting matchpoint
- increase would be 240%. </para>
- <note>
- <para>In practice, these weights are applied serially -- first the index weight, then
- all the matchpoint weights that apply -- because they are evaluated at different
- stages of the search process.</para>
- </note>
- </section>
-</section>
More information about the open-ils-commits
mailing list