[OPEN-ILS-GENERAL] Z39.50 client query encoding issues

Hardy, Elaine ehardy at georgialibraries.org
Wed Aug 19 08:56:24 EDT 2015


I was able to retrieve 3539 hits in a search of OCLC for author matoušek and 
for author matousek through our Z39.50 gateway. I am afraid I can't help 
beyond confirming that it does work in our Z39.50 instance with OCLC.

We have had occasional problems with some diacritics and with some language 
scripts. It is a minor issue for us, however, and I have been able to use 
Vandelay to bring in the individual record that didn't retrieve via the 
Z39.50 connection. I believe it was a record with a parallel title in 
Turkish. Occasionally, an OCLC record will have a non-UTF-8 character, which 
will also block retrieval, but that is a simple matter of correcting the 
record in OCLC.
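
For what it is worth, a stray non-UTF-8 byte can be located before correcting 
a record. The following is only a minimal Python sketch, assuming the raw 
record is available as bytes; find_invalid_utf8 is a hypothetical helper name 
and is not part of Evergreen, Vandelay, or OCLC tooling.

```python
def find_invalid_utf8(data: bytes) -> list[int]:
    """Return the byte offsets at which UTF-8 decoding of data fails."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything from the current position onward.
            data[pos:].decode("utf-8")
            break  # The remainder is valid UTF-8.
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so add pos back.
            offsets.append(pos + err.start)
            # Skip the offending byte and keep scanning.
            pos += err.start + 1
    return offsets


if __name__ == "__main__":
    # A Latin-2 encoded "š" is a single byte that is not valid UTF-8.
    print(find_invalid_utf8("matoušek".encode("iso-8859-2")))
```

An empty list means the whole record decodes as UTF-8.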


Elaine


J. Elaine Hardy
PINES & Collaborative Projects Manager
Georgia Public Library Service
1800 Century Place, Ste 150
Atlanta, Ga. 30345-4304


404.235.7128
404.235.7201, fax
ehardy at georgialibraries.org
www.georgialibraries.org
www.georgialibraries.org/pines

-----Original Message-----
From: Open-ils-general 
[mailto:open-ils-general-bounces at list.georgialibraries.org] On Behalf Of 
Linda Jansova
Sent: Wednesday, August 19, 2015 3:51 AM
To: Evergreen Discussion Group
Subject: [OPEN-ILS-GENERAL] Z39.50 client query encoding issues

Hi all,

Jabok Library currently uses Evergreen 2.8.2 and we have successfully 
changed charsets both for <client> and <yazgfs> (in the configuration files 
mentioned at
http://docs.evergreen-ils.org/2.1/html/Z3950serversupport.html) to utf-8 and 
so now Z39.50 clients can receive data (records) with the correct 
diacritics.

However, one related problem still persists: the Z39.50 queries only work 
when no diacritics are used. E.g., search results are returned when we submit 
the query "matousek" (author's surname), but no results are reported when the 
correct version "matoušek" is used.
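
To illustrate why the charset of the query matters, here is a small Python 
sketch that prints the bytes a client might send for the same term under 
different encodings. It is not a diagnosis of our server; byte_forms is a 
hypothetical helper, and the choice of ISO-8859-2 and of the NFC/NFD forms is 
only an assumption about what a client could plausibly send.

```python
import unicodedata


def byte_forms(term: str) -> dict[str, str]:
    """Return hex dumps of a search term under several candidate encodings."""
    nfc = unicodedata.normalize("NFC", term)  # precomposed: one code point for "š"
    nfd = unicodedata.normalize("NFD", term)  # decomposed: "s" + combining caron
    forms = {
        "utf8_nfc": nfc.encode("utf-8"),
        "utf8_nfd": nfd.encode("utf-8"),
        "latin2": nfc.encode("iso-8859-2"),
    }
    return {name: data.hex(" ") for name, data in forms.items()}


if __name__ == "__main__":
    for name, dump in byte_forms("matoušek").items():
        print(f"{name:10} {dump}")
```

The three byte sequences differ only at the "š", so a server that expects one 
form and receives another would see a different term, while the plain ASCII 
"matousek" is identical in all three.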

We have tried the following but to no avail:

1) add the element client_query_charset to gfs (according to
http://www.indexdata.com/yaz/doc/server.vhosts.html), but it was an unknown 
element;

2) delete the second mention of encoding="utf-8" from 
/xsl/MARC21slim2SRWDC.xsl and restart the open-ils.supercat service, hoping 
that this would have the same effect as when the MODS stylesheets were 
treated in the same way to resolve our Zotero encoding problems (see 
https://bugs.launchpad.net/evergreen/+bug/1442276).

We have also tried further query testing in yaz-client. In this case, some 
interesting things happened:

When yaz-client was used for a generic query "find matoušek" (i.e., with 
diacritics), the answer was 34 hits:

Z> find matoušek
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 34, setno 1
records returned: 0
Elapsed: 0.681894

However, when searching specifically for author (with diacritics again), the 
answer was zero hits:

Z> find @attr 1=1003 @attr 2=3 "matoušek"
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 0, setno 12
records returned: 0
Elapsed: 0.117265

When diacritics were omitted, we got 34 hits again:

Z> find @attr 1=1003 @attr 2=3 "matousek"
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 34, setno 13
records returned: 0
Elapsed: 0.637897
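
Since the unaccented form does return hits, one possible stopgap on the 
client side would be to strip diacritics before sending the query. This is 
only a sketch of a workaround and not a fix for the server; strip_diacritics 
is a hypothetical helper, written here in Python.

```python
import unicodedata


def strip_diacritics(term: str) -> str:
    """Remove combining marks so that e.g. "matoušek" becomes "matousek"."""
    # NFD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", term)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    term = strip_diacritics("matoušek")
    # Build the same author query that was typed into yaz-client above.
    print(f'find @attr 1=1003 @attr 2=3 "{term}"')
```

Letters that have no decomposition (for example "ł" or "ø") pass through 
unchanged, so this would not cover every language.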

Our Z39.50 server runs at mojzis.jabok.cuni.cz (port 9999, database
Jabok) and it now uses the utf-8 encoding.

When we tried Laurentian (laurentian.concat.ca, port 210, database OSUL), 
we searched for a person in Tellico using the words "francais" and 
"français". For "francais" we got results, but for "français" no results 
were found. So it is probably not just our case...

Do you have any ideas what we could do to make the queries with diacritics 
work correctly?

Thank you in advance for any hints!

Linda


