[OPEN-ILS-DEV] Re: Problem with utf8 and MARC Edit
Alan Rykhus
alan.rykhus at mnsu.edu
Wed Jun 9 12:16:51 EDT 2010
Hello Dan,
We are running Evergreen 1.6.0.4 on Ubuntu Hardy. I installed version
2.23-1 of the libencode-perl package and restarted everything on our
test server. Preliminary tests seem to show that this has fixed the
problem.
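
For anyone who wants to check their own test server, here is a minimal sketch of the guard proposed in the quoted message below. The helper name decode_if_needed and the sample string are hypothetical, and this is not the code in Application::AppUtils. It only reports the installed Encode version and shows how the utf8 flag differs between raw octets and a decoded string. Note Dan's caution further down about relying on is_utf8().

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 is_utf8);

# Hypothetical helper: decode only when the utf8 flag is off.
sub decode_if_needed {
    my ($string) = @_;
    return is_utf8($string) ? $string : decode_utf8($string);
}

# Sample data (an assumption): "cafe" with e-acute, as UTF-8 octets.
my $bytes = "caf\xc3\xa9";          # 5 octets, utf8 flag off
my $chars = decode_utf8($bytes);    # 4 characters, utf8 flag on

print "Encode version: $Encode::VERSION\n";
printf "bytes: flag=%d length=%d\n", (is_utf8($bytes) ? 1 : 0), length $bytes;
printf "chars: flag=%d length=%d\n", (is_utf8($chars) ? 1 : 0), length $chars;

# The guard leaves an already-decoded string alone and decodes raw octets.
my $from_chars = decode_if_needed($chars);
my $from_bytes = decode_if_needed($bytes);
print "guard gives the same result either way\n" if $from_chars eq $from_bytes;
```

The guard sidesteps calling decode_utf8() on a string that already has the flag on, so the result should not depend on how the installed Encode.pm treats that case.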
al
On Wed, 2010-06-09 at 11:50 -0400, Dan Scott wrote:
> On Wed, 2010-06-09 at 08:57 -0500, Alan Rykhus wrote:
> > Hello,
> >
> > We're having a problem in MARC Edit: when we try to save a record,
> > we get the following:
> >
> >
> > Network or server failure. Please check your Internet connection to
> > balsam.mnpals.net and choose Retry Network. If you need to enter
> > Offline Mode, choose Ignore Errors in this and subsequent dialogs. If
> > you believe this error is due to a bug in Evergreen and not network
> > problems, please contact your help desk or friendly Evergreen
> > administrators, and give them this information:
> > method=open-ils.cat.biblio.record.xml.update
> > params=["3cb05162d451a7aa5640490ffde742ca",99254,"<record
> > xsi:schemaLocation=\"http://www.loc.gov/MARC21/slim
> > http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd\" xmlns:xsi=
> > \"http://www.w3.org/2001/XMLSchema-instance\" xmlns=
> > \"http://www.loc.gov/MARC21/slim\">\n <leader
> > .
> > .
> > .
> > <subfield code=\"c\">99254</subfield>\n </datafield>\n</record>"]
> > THROWN:
> > {"payload":[],"debug":"osrfMethodException : *** Call to
> > [open-ils.cat.biblio.record.
> >
> > I've traced the problem down to the function 'sub entityize' in
> > Application::AppUtils.
> >
> > In this function there is a call to:
> >
> > $string = decode_utf8($string);
> >
> > The problem seems to be that the record (string) is already in utf8. If
> > you check the string with:
> >
> > is_utf8($string)
> >
> > a true response will be returned. Should this call to decode_utf8() be
> > wrapped? For example:
> >
> > if (! is_utf8($string)) {
> >     $string = decode_utf8($string);
> > }
> >
> > It seems that the purpose of decode_utf8 is to put the string into
> > Perl's internal utf8 format and to turn the utf8 flag on. If the flag
> > is already on, as determined by the is_utf8 call, it does not make
> > sense to decode_utf8 a string that is already utf8.
> >
> > In addition, according to the Perl documentation:
> >
> > is_utf8(STRING [, CHECK])
> >
> > [INTERNAL] Tests whether the UTF8 flag is turned on in the STRING. If
> > CHECK is true, also checks the data in STRING for being well-formed
> > UTF-8. Returns true if successful, false otherwise.
> >
> >
> > So calling is_utf8 with a true CHECK argument would also make sure we
> > have a well-formed string when the utf8 flag is indeed on.
> >
> > Gosh, I hope this makes sense (because it fixes the problem we're seeing)
> > -- al
> >
> >
>
> Hi Alan:
>
> You forgot to mention which version of Evergreen you are running and
> what Linux distribution you're running on.
>
> Also, on decode_utf8() vs. is_utf8(), Perl best practices
> (http://juerd.nl/site.plp/perluniadvice) suggest that you stay the hell
> away from is_utf8(). decode_utf8() is supposed to detect if the incoming
> string is already UTF8, and if it is, pass it back untouched.
>
> If you have the buggy version of Encode.pm, as Dan Wells pointed out,
> decode_utf8() is probably giving the string a bad touch and causing your
> problems.
>
> Dan
>
--
Alan Rykhus
PALS, A Program of the Minnesota State Colleges and Universities
(507)389-1975
alan.rykhus at mnsu.edu
"It's hard to lead a cavalry charge if you think you look funny on a
horse" ~ Adlai Stevenson