[OPEN-ILS-DEV] Re: 1.6 RC1 (and 1.4 and probably before) subject searching

Dan Scott dan at coffeecode.net
Mon Oct 12 10:58:44 EDT 2009


2009/9/17 Dan Scott <dan at coffeecode.net>:
> The subject search in bibtemplate, interestingly enough, searches
> against the keywords index rather than the subject index.
>
> This might just be a fortuitous mistake, but I suspect it is because
> the subject index doesn't currently support searches of compound
> subjects composed of topic + temporal + geographic headings.
>
> The subject search:
> http://laurentian.concat.ca/opac/en-CA/skin/lul/xml/rresult.xml?rt=subject&tp=subject&t=Great%20Britain%20History%20Wars%20of%20the%20Roses%2C%201455-1485.%20&l=105&d=1&f=&av=
> results in 0 hits.
>
> The keyword search:
> http://laurentian.concat.ca/opac/en-CA/skin/lul/xml/rresult.xml?rt=keyword&tp=keyword&t=Great%20Britain%20History%20Wars%20of%20the%20Roses%2C%201455-1485.%20&l=105&d=1&f=&av=
> results in 5 hits. Some of these hits are because the desired keywords
> fall outside of the subject fields, but if you take a record (say,
> ISBN 041596864X) and remove all of the pertinent keywords from
> non-subject fields, reindex the record and resubmit the keyword
> search, you'll see that the keyword search using the compound subject
> terms does retrieve the desired record. Or just create a minimal
> record with the following field: "651 0. ‡aGreat Britain ‡xHistory
> ‡yWars of the Roses, 1455-1485."
>
> If you peek at the metabib.subject_field_entry table, you'll see that
> the subject terms are all there, but broken up across entries.
> Presumably, this is why the compound search doesn't find a hit against
> them.
>
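
To make the presumed failure mode concrete, here is a toy model in Python. It assumes (as guessed above, not confirmed against the search code) that a record only matches when a single index entry contains every search term. The tokenize() and matches() helpers and the sample entries are purely illustrative and are not Evergreen code.

```python
import re

def tokenize(text):
    """Lowercase and split on non-alphanumerics (a crude stand-in for
    real full-text normalization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(entries, query):
    """Assumption: a record matches only if ONE entry holds ALL query terms."""
    terms = tokenize(query)
    return any(terms <= tokenize(entry) for entry in entries)

# Hypothetical index entries for one record, split by subject component
split_entries = ["Great Britain", "History", "Wars of the Roses, 1455-1485."]

# The same entries plus one extra "flat" entry holding all subject text
flat_entries = split_entries + [" ".join(split_entries)]

query = "Great Britain History Wars of the Roses, 1455-1485."

split_hit = matches(split_entries, query)  # no single entry has every term
flat_hit = matches(flat_entries, query)    # the flat entry has every term
print(split_hit, flat_hit)
```

Under that assumption the split entries miss and the flat entry hits, which is consistent with the 0 hits versus 5 hits reported above.
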
> One way that I documented back in February
> (http://coffeecode.net/archives/183-Evergreen-Exposed-introduction-to-Evergreen-development-OLA-2009.html)
> of resolving this problem is to add an additional entry to the subject
> search index:
>
> INSERT INTO config.metabib_field
>     (field_class, name, xpath, weight, format, search_field)
> VALUES
>     ('subject', 'flat', '//mods32:mods/mods32:subject//text()', 1, 'mods32', 't');
>
> Once that has been added, you can reindex all of your records and
> compound subject searches will work as expected.
>
> However, I would suggest that it would be better to have this as part
> of the default search indexes in 1.6 - and that the bibtemplate
> subject search should then be adjusted to search the subject index
> accordingly, to avoid false positives that occur when using the
> keyword index.
>
> From a migration perspective, one could conceivably avoid the cost of
> reindexing all of the records by iterating over all of the unique
> values of source in metabib.subject_field_entry (msfe) and populating
> an additional row containing the union of the subject terms for each
> source record. Once that's complete, insert the new
> config.metabib_field entry and the system would be golden.
>
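
A rough sketch of that migration idea, using Python's sqlite3 module as a stand-in so it can be run anywhere. The table here is a simplified, hypothetical copy of msfe (the real table lives in PostgreSQL and has more to it, including the index vector), FLAT_FIELD is a made-up field id, and group_concat() is SQLite's aggregate; a PostgreSQL version would need a different aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical stand-in for metabib.subject_field_entry
    CREATE TABLE subject_field_entry (
        id     INTEGER PRIMARY KEY,
        source INTEGER NOT NULL,   -- bib record id
        field  INTEGER NOT NULL,   -- id of the index definition
        value  TEXT    NOT NULL
    );
    INSERT INTO subject_field_entry (source, field, value) VALUES
        (1, 10, 'Great Britain'),
        (1, 11, 'History'),
        (1, 12, 'Wars of the Roses, 1455-1485.'),
        (2, 10, 'Canada');
""")

FLAT_FIELD = 99  # hypothetical id for the new 'flat' subject index

# One extra row per source record, holding all of its subject terms.
# The order of terms inside group_concat() is not guaranteed.
conn.execute("""
    INSERT INTO subject_field_entry (source, field, value)
    SELECT source, ?, group_concat(value, ' ')
      FROM subject_field_entry
     GROUP BY source
""", (FLAT_FIELD,))

flat_rows = conn.execute(
    "SELECT source, value FROM subject_field_entry "
    "WHERE field = ? ORDER BY source", (FLAT_FIELD,)
).fetchall()
for source, value in flat_rows:
    print(source, value)
```

This only covers the "union of terms per source" step; in a real database the new rows would still need whatever the normal indexing path does to make them searchable.
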
> If there's a better way of solving this problem, that would be great,
> as adding the //mods32:mods/mods32:subject//text() xpath means that
> we're effectively indexing all subject terms twice if we keep the
> existing temporal / topic / geographic subject indexes as well.  Maybe
> instead of adding the xpath, we could populate a view or use rules to
> maintain an additional row based on a union of all subject terms for a
> given record? Just brainstorming in a slightly sick fashion. But in
> the short term, that slight additional bloat seems like a small price
> to pay for making subject searches work the way that our users appear
> to expect them to work.
>

For what it's worth, I did commit this change to trunk in changeset
14250, along with the required upgrade SQL.

However, I'd like to know if others feel this is important enough to
be backported to 1.6. To me, subject searching feels broken out of the
box without this patch.

