[OPEN-ILS-DEV] PATCH & RFC: Providing i18n support in OpenILS database schema

Dan Scott denials at gmail.com
Mon Jun 4 14:38:16 EDT 2007


Hello:

This is just a small patch demonstrating the direction I'm thinking
the OpenILS database schema needs to take to provide I18N support for
various tables that currently contain hard-coded English text. I would
like to get some feedback on this direction before completing the
patch, as it will end up being a rather largish set of changes.

Some anticipated questions (with answers):
Q: Why do we need to change the database schema? Can't sites that need
to support a different language simply change the English text in
OpenILS/src/sql/Pg/*.sql to whatever language they need?

A: That approach works fine for a site that only needs to support one
language - but for bilingual (or multilingual) sites, where you have
to support users who prefer different languages, that approach won't
fly. For example, our university is a bilingual French & English
university - so we need to be able to store both the French and
English versions of a given piece of text in the database.

Q: What format should the locale names use?

A: Quick answer: ll-LL (two-char lowercase language code, hyphen,
two-char uppercase region code)

There are a number of possible formats: en_US, en-us, en_US.UTF-8,
etc. However, given that the Open-ILS database is created using the
UTF-8 encoding, I think we can assume that any language will use the
UTF-8 encoding and therefore avoid having to include the encoding as
part of the locale name. Second, the ISO639-1 standard for
two-character language codes and ISO3166-1 standard for two-character
region codes are pretty well established for use in defining browser
locales and in, er, Java (see the java.util.Locale class for some
details).

A free table listing ISO 639-1 and 639-2 codes is here:
http://www.loc.gov/standards/iso639-2/php/code_list.php

The ISO 3166 region codes are listed here:
http://www.iso.ch/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html

Given that the canonical format of locales for Web applications
follows the 'll-LL' format
(http://www.w3.org/TR/i18n-html-tech-lang/), and given that much of
Evergreen is exposed through Web services or technologies, I suggest
that the w3c guidelines for locale names be adopted for use in the
database schema and have therefore specified 'en-US' as the locale for
the default strings in the current patch.

Q: Okay, so you're replacing the current table with a set of tables
and a view to maintain  compatibility with the current OpenSRF /
Open-ILS services until they can be taught to be locale-aware. What's
the plan to make these locale-aware?

A: <hand-waving about teaching the settings server to attach client
locale to requests with a default that can hopefully be set
system-wide>

Q: Okay, so you've converted one table over. Big deal. How many more
tables are there to convert?

A: Not many, actually - but there are a few that have more significant
content. At this point there might be some breakage to the existing
config CGIs (I imagine that they would be trying to insert into the
view rather than the new underlying tables) but I wouldn't expect the
restt of the existing services to break _if_ the tables are replaced
with views that default to the en-US locale.

I just want to get general agreement that this patch is headed in the
right direction before heading much further with the other tables. And
yeah, I'm aware that Mike is on vacation now so I couldn't possibly
have worse timing :)

-- 
Dan Scott
Laurentian University
-------------- next part --------------
Index: 002.schema.config.sql
===================================================================
RCS file: /cvs/ILS/Open-ILS/src/sql/Pg/002.schema.config.sql,v
retrieving revision 1.23
diff -c -r1.23 002.schema.config.sql
*** 002.schema.config.sql	25 Apr 2007 14:07:09 -0000	1.23
--- 002.schema.config.sql	4 Jun 2007 14:25:58 -0000
***************
*** 64,74 ****
  INSERT INTO config.bib_source (quality, source) VALUES (10, 'System Local');
  INSERT INTO config.bib_source (quality, source, transcendant) VALUES (1, 'Project Gutenberg', TRUE);
  
! CREATE TABLE config.standing (
! 	id		SERIAL	PRIMARY KEY,
! 	value		TEXT	NOT NULL UNIQUE
  );
! COMMENT ON TABLE config.standing IS $$
  /*
   * Copyright (C) 2005  Georgia Public Library Service 
   * Mike Rylander <mrylander at gmail.com>
--- 64,88 ----
  INSERT INTO config.bib_source (quality, source) VALUES (10, 'System Local');
  INSERT INTO config.bib_source (quality, source, transcendant) VALUES (1, 'Project Gutenberg', TRUE);
  
! CREATE TABLE config.standing_ids (
! 	id		SERIAL	PRIMARY KEY
! );
! 
! CREATE TABLE config.standing_text (
! 	id		INTEGER NOT NULL REFERENCES config.standing_ids(id),
! 	value		TEXT	NOT NULL,
! 	locale		TEXT	NOT NULL DEFAULT 'en-US'
  );
! 
! ALTER TABLE config.standing_text ADD PRIMARY KEY (id, value, locale);
! 
! CREATE VIEW config.standing AS
! 	SELECT id, value
! 	FROM config.standing_text
! 	WHERE locale = 'en-US'
! ;
! 
! COMMENT ON TABLE config.standing_text IS $$
  /*
   * Copyright (C) 2005  Georgia Public Library Service 
   * Mike Rylander <mrylander at gmail.com>
***************
*** 77,83 ****
   *
   * This table contains the values that can be applied to a patron
   * by a staff member.  These values should not be changed, other
!  * that for translation, as the ID column is currently a "magic
   * number" in the source. :(
   *
   * ****
--- 91,97 ----
   *
   * This table contains the values that can be applied to a patron
   * by a staff member.  These values should not be changed, other
!  * than for translation, as the ID column is currently a "magic
   * number" in the source. :(
   *
   * ****
***************
*** 94,101 ****
   */
  $$;
  
! INSERT INTO config.standing (value) VALUES ('Good');
! INSERT INTO config.standing (value) VALUES ('Barred');
  
  
  CREATE TABLE config.xml_transform (
--- 108,117 ----
   */
  $$;
  
! INSERT INTO config.standing_ids (id) VALUES (DEFAULT);
! INSERT INTO config.standing_text (id, value) VALUES (currval('config.standing_ids_id_seq'), 'Good');
! INSERT INTO config.standing_ids (id) VALUES (DEFAULT);
! INSERT INTO config.standing_text (id, value) VALUES (currval('config.standing_ids_id_seq'), 'BARRED');
  
  
  CREATE TABLE config.xml_transform (


More information about the Open-ils-dev mailing list