[OPEN-ILS-DEV] Re: Problem with utf8 and MARC Edit

Alan Rykhus alan.rykhus at mnsu.edu
Wed Jun 9 12:16:51 EDT 2010


Hello Dan,

We are running Evergreen 1.6.0.4 on Ubuntu Hardy. I installed version
2.23-1 of the libencode-perl package and restarted everything on our
test server. Preliminary tests seem to show that this has fixed the
problem.
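
For anyone who wants to check which Encode their own Perl is loading, here
is a minimal sketch. The helper name encode_at_least is made up for this
example, and the 2.23 threshold is only an assumption based on the package
version that worked for us; I have not established which versions are
actually affected.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true if the loaded Encode is at least $min.
# The 2.23 threshold used below is an assumption taken from the package
# version that worked on our test server, not a documented cut-off.
sub encode_at_least {
    my ($min) = @_;
    return $Encode::VERSION >= $min;
}

print "Encode version: $Encode::VERSION\n";
print encode_at_least(2.23)
    ? "At or above 2.23\n"
    : "Older than 2.23; upgrading may be worth testing\n";
```

Run it with the same perl binary that the Evergreen services use, since a
system can have more than one copy of Encode installed.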

al



On Wed, 2010-06-09 at 11:50 -0400, Dan Scott wrote:
> On Wed, 2010-06-09 at 08:57 -0500, Alan Rykhus wrote:
> > Hello,
> > 
> > We're having a problem in MARC Edit where when we try to save a record
> > we get the following:
> > 
> > 
> > Network or server failure.  Please check your Internet connection to
> > balsam.mnpals.net and choose Retry Network.  If you need to enter
> > Offline Mode, choose Ignore Errors in this and subsequent dialogs.  If
> > you believe this error is due to a bug in Evergreen and not network
> > problems, please contact your help desk or friendly Evergreen
> > administrators, and give them this information:
> > method=open-ils.cat.biblio.record.xml.update
> > params=["3cb05162d451a7aa5640490ffde742ca",99254,"<record
> > xsi:schemaLocation=\"http://www.loc.gov/MARC21/slim
> > http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd\" xmlns:xsi=
> > \"http://www.w3.org/2001/XMLSchema-instance\" xmlns=
> > \"http://www.loc.gov/MARC21/slim\">\n  <leader 
> > .
> > .
> > .
> > <subfield code=\"c\">99254</subfield>\n  </datafield>\n</record>"]
> > THROWN:
> > {"payload":[],"debug":"osrfMethodException :  *** Call to
> > [open-ils.cat.biblio.record.
> > 
> > I've traced the problem down to the function 'sub entityize' in
> > Application::AppUtils.
> > 
> > In this function there is a call to:
> > 
> >      $string = decode_utf8($string);
> > 
> > The problem seems to be that the record (string) is already in utf8. If
> > you check the string with:
> > 
> >      is_utf8($string)
> > 
> > a true response will be returned. Should this call to decode_utf8() be
> > wrapped? For example:
> > 
> >         if (! is_utf8($string)) {
> >             $string = decode_utf8($string);
> >         }
> > 
> > It seems that the purpose of decode_utf8 is to convert the string into
> > Perl's internal utf8 format and to turn the utf8 flag on. If the flag
> > is already on, as determined by the is_utf8 call, it does not make
> > sense to decode_utf8 a string that is already utf8.
> > 
> > In addition, according to the Perl documentation:
> > 
> > is_utf8(STRING [, CHECK]) 
> > 
> > [INTERNAL] Tests whether the UTF8 flag is turned on in the STRING. If
> > CHECK is true, also checks the data in STRING for being well-formed
> > UTF-8. Returns true if successful, false otherwise.
> > 
> > 
> > So the is_utf8 call tells us whether the utf8 flag is on, and with
> > CHECK set to true it also makes sure we have a well-formed string.
> > 
> > Gosh, I hope this makes sense (because it fixes the problem we're
> > seeing) -- al
> > 
> > 
> 
> Hi Alan:
> 
> You forgot to mention which version of Evergreen you are running and
> what Linux distribution you're running on.
> 
> Also, on decode_utf8() vs. is_utf8(), Perl best practices
> (http://juerd.nl/site.plp/perluniadvice) suggest that you stay the hell
> away from is_utf8(). decode_utf8() is supposed to detect if the incoming
> string is already UTF8, and if it is, pass it back untouched. 
> 
> If you have the buggy version of Encode.pm, as Dan Wells pointed out,
> decode_utf8() is probably giving the string a bad touch and causing your
> problems.
> 
> Dan
> 
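
To illustrate Dan's point about is_utf8(), here is a minimal sketch of my
own (not code from Evergreen or AppUtils). It shows that the utf8 flag
describes Perl's internal storage, not whether a string has been decoded:
a string of raw UTF-8 bytes that has merely been upgraded reports the flag
as on, so a guard like "if (! is_utf8($string))" would skip decoding it.
The sample string "café" is an assumption chosen for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 is_utf8);

# Raw UTF-8 bytes for "café" (the e-acute is the two bytes C3 A9).
my $bytes = "caf\xC3\xA9";

# Properly decoded: 4 characters.
my $chars = decode_utf8($bytes);

# Upgraded only: the internal flag is turned on, but the string still
# holds the same 5 undecoded "characters".
my $upgraded = $bytes;
utf8::upgrade($upgraded);

printf "bytes:    length=%d flag=%d\n", length($bytes),    is_utf8($bytes)    ? 1 : 0;
printf "decoded:  length=%d flag=%d\n", length($chars),    is_utf8($chars)    ? 1 : 0;
printf "upgraded: length=%d flag=%d\n", length($upgraded), is_utf8($upgraded) ? 1 : 0;

# prints:
# bytes:    length=5 flag=0
# decoded:  length=4 flag=1
# upgraded: length=5 flag=1
```

The upgraded string has the flag on yet still compares equal to the
original bytes, which is why the flag alone is not a safe test for "already
decoded".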


-- 
Alan Rykhus
PALS, A Program of the Minnesota State Colleges and Universities 
(507)389-1975
alan.rykhus at mnsu.edu
"It's hard to lead a cavalry charge if you think you look funny on a
horse" ~ Adlai Stevenson


