[OPEN-ILS-GENERAL] Z39.50 client query encoding issues

Linda Jansova skolkova at chello.cz
Wed Aug 19 03:50:59 EDT 2015


Hi all,

Jabok Library currently uses Evergreen 2.8.2. We have successfully 
changed the charsets for both <client> and <yazgfs> (in the configuration 
files mentioned at
http://docs.evergreen-ils.org/2.1/html/Z3950serversupport.html) to utf-8, 
so Z39.50 clients now receive data (records) with the correct 
diacritics.

However, one related problem persists: Z39.50 queries only 
work when no diacritics are used. E.g., search results are returned when 
we submit the query "matousek" (an author's surname), but no results are 
reported when the correct form "matoušek" is used.

We have tried the following but to no avail:

1) adding the element client_query_charset to gfs (according to 
http://www.indexdata.com/yaz/doc/server.vhosts.html), but it turned out 
to be an unknown element;

2) deleting the second occurrence of encoding="utf-8" from 
/xsl/MARC21slim2SRWDC.xsl and restarting the open-ils.supercat service, 
hoping this would help in the same way as the equivalent change to the 
MODS stylesheets, which resolved our Zotero encoding problems 
(see https://bugs.launchpad.net/evergreen/+bug/1442276).

We have also tried further query testing in yaz-client. In this case, 
some interesting things happened:

When yaz-client was used for a generic query "find matoušek" (i.e., with 
diacritics), the answer was 34 hits:

Z> find matoušek
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 34, setno 1
records returned: 0
Elapsed: 0.681894

However, when searching specifically for an author (with diacritics 
again), the answer was zero hits:

Z> find @attr 1=1003 @attr 2=3 "matoušek"
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 0, setno 12
records returned: 0
Elapsed: 0.117265

When diacritics were omitted, we got 34 hits again:

Z> find @attr 1=1003 @attr 2=3 "matousek"
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 34, setno 13
records returned: 0
Elapsed: 0.637897

Our Z39.50 server runs at mojzis.jabok.cuni.cz (port 9999, database 
Jabok) and it now uses the utf-8 encoding.

When we tried Laurentian (laurentian.concat.ca, port 210, database 
OSUL), we searched for a person in Tellico using the words "francais" 
and "français". For "francais" we got results, but for "français" no 
results were found. So it is probably not just our case...

Do you have any ideas what we could do to make the queries with 
diacritics work correctly?

Thank you in advance for any hints!

Linda


