[OPEN-ILS-GENERAL] Re: Arabic in Evergreen

Dan Scott dan at coffeecode.net
Thu Mar 3 09:47:10 EST 2011


On Thu, Mar 3, 2011 at 7:02 AM, Christoph Schilling <ch.schilling at gmx.de> wrote:
>
> Hi Dan,
> I am very glad that there seems to be a way out. I also saw on the roadmap for the 2.0 release that Arabic is on the list of staff client language sets. Great! Maybe I should move to the new release soon.

A quick warning on this: the staff client and OPAC are currently not
set up to support right-to-left in the interface. The record display
should be okay-ish, but we'll need someone with RTL experience to help
us add that support to the staff client and OPAC interfaces.

> 1) Some sample Arabic MARC records (ideally, licensed so that we can
> add them to our test datasets)
>
>
> I was working on a MARC record from the LC. I am just trying to figure out how Arabic MARC records work, since there does not seem to be a well-functioning consensus when I compare how Western libraries do it with how libraries in the Arab world do it.
> I hope to add some more records in the coming days. How do I export them from Evergreen to send them to you? (Sorry for the simple question.)

There are a few ways, depending on your Evergreen version. In the
staff client, the Cataloguing -> MARC batch import/export menu item
might be the easiest. From the command line, the
Open-ILS/src/extras/marc_export tool is probably your best bet.

> 2) A screenshot of how the text in the records should appear (perhaps
> copy the text into a text editor or whatever application displays it
> properly).
>
> That is how it should look:
>
> 100 1 #‡6 880-01 ‡a Tūnjī, Muḥammad.
>
> 880 1 #‡6 100-01/(3/r&#x200f; ‡a&#x200f;تونجي, محمد
>
> 245 1 3 #‡6 880-02 ‡a al-Muʻarrab wa-al-dakhīl fī al-lughah al-ʻArabīyah wa-ādābihā / ‡c ta&#x02bc;līf Muḥammad al-Tūnjī.
>
> 880 1 2 #‡6 245-02/(3/r/&#x200f; ‡a &#x200f;المعرب والدخيل في اللغة العربية وآدابها /&#x200f; ‡c &#x200f;تأليف محمد التونجي.

I grabbed the same record directly from the Library of Congress to
ensure that we were working with the raw source, not suffering from an
import problem in Evergreen, and... oh dear. It looks like that
catalog record is... bad. U+200F is the Unicode RIGHT-TO-LEFT MARK,
and I'm surprised to see its XML character reference (&#x200f;) mixed
directly into the MARC fields as literal text. I wouldn't be surprised
if that was the desperate choice that a committee made somewhere in the
Western world to try to support RTL text in a MARC-8 encoded record,
but if that's the case, it's not good. We'll have to do some more
research to see what the standard actually says.
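
In the meantime, here is a rough Python sketch of how one might check
field text for the literal &#x200f; strings seen in the 880 fields
quoted above, and swap them for the real U+200F character. The function
names are made up for illustration and are not part of Evergreen or any
MARC library. It also assumes the field text is already a decoded
Unicode string, and that replacing the entities is even the right fix,
which depends on what the standard turns out to say.

```python
import re

# Literal numeric character reference for U+200F (RIGHT-TO-LEFT MARK),
# matched case-insensitively so "&#x200F;" is caught as well.
RLM_ENTITY = re.compile(r"&#x200f;", re.IGNORECASE)
RLM = "\u200f"


def count_literal_rlm_entities(field_text):
    """Hypothetical helper: count literal '&#x200f;' strings in a field."""
    return len(RLM_ENTITY.findall(field_text))


def replace_literal_rlm_entities(field_text):
    """Hypothetical helper: replace each literal '&#x200f;' with U+200F."""
    return RLM_ENTITY.sub(RLM, field_text)


# Sample text modelled on the 880 field quoted in this thread.
sample = "‡6 100-01/(3/r&#x200f; ‡a&#x200f;تونجي, محمد"

print(count_literal_rlm_entities(sample))
print(repr(replace_literal_rlm_entities(sample)))
```

Running the count over a batch of exported records would at least tell
us how widespread the problem is before we decide what to do about it.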

