[OPEN-ILS-DEV] Nature of the compositional pattern in biblio.record_entry.marc

Galen Charlton gmc at esilibrary.com
Wed Dec 22 10:18:20 EST 2010


Hi,

On Dec 21, 2010, at 3:05 PM, John Craig wrote:
> Just trying to verify if the data in biblio.record_entry.marc can be assumed to be in a particular canonical compositional form. What I'm seeing appears to be canonical-composed form. Is this uniform or just happenstance?

Are you referring to the Unicode normalization form?  Records loaded with marc2bre.pl or Vandelay are typically converted to NFC (see the entityize method in AppUtils.pm).  However, nothing prevents somebody from manually inserting NFD strings via SQL, so the form isn't guaranteed.  If you need a particular normalization form when processing bib records, it's best to convert to it explicitly.  The Perl module Unicode::Normalize or the Java Normalizer class (java.text.Normalizer) can do this for you.
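
For what it's worth, here is a minimal Java sketch of the explicit conversion.  It assumes you've already pulled the MARCXML out of biblio.record_entry.marc into a String; the class and method names (NormalizationSketch, toNfc) are made up for illustration and aren't part of Evergreen.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration only; not part of Evergreen.
final class NormalizationSketch {

    // Returns the input in NFC. Strings that are already in NFC are
    // returned unchanged, which skips the conversion for the common case.
    static String toNfc(String marcXml) {
        if (Normalizer.isNormalized(marcXml, Normalizer.Form.NFC)) {
            return marcXml;
        }
        return Normalizer.normalize(marcXml, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT, i.e. the decomposed
        // spelling that an NFD string inserted via SQL would contain.
        String decomposed = "Cafe\u0301";
        String composed = toNfc(decomposed);

        // The composed form uses the single code point U+00E9.
        System.out.println(decomposed.length() + " -> " + composed.length()); // prints "5 -> 4"
    }
}
```

Swapping Normalizer.Form.NFC for Normalizer.Form.NFD would give the decomposed form instead, if that's what your processing expects.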

Regards,

Galen
--
Galen Charlton
VP, Data Services
Equinox Software, Inc. / Your Library's Guide to Open Source
email:  gmc at esilibrary.com
direct: +1 352-215-7548
skype:  gmcharlt
web:    http://www.esilibrary.com/


