[OPEN-ILS-DEV] PATCH: osrf_json.c (rewrite of jsonObjectToJSON[Raw])

Bill Erickson erickson at esilibrary.com
Tue Nov 18 17:32:00 EST 2008


On Tue, 18 Nov 2008 16:01:51 -0500, Scott McKellar <mck9 at swbell.net> wrote:

> --- On Tue, 11/18/08, Bill Erickson <erickson at esilibrary.com> wrote:
>
>> From: Bill Erickson <erickson at esilibrary.com>
> <snip>
>
>> I've run into one issue with these last 2 patches, in
>> particular, there
>> seems to be a subtle difference in uescape() and
>> buffer_append_uescape().
>> The majority of the JSON parsing succeeds, but parsing
>> certain
>> bibliographic data (MARC records) always results in NULL
>> when attempting
>> to encode the containing object (biblio_record entry
>> objects).    If I
>> replace the call to buffer_append_uescape() with a call to
>> uescape() in
>> osrf_json_object.c, then all is well.
>>
>> I would send you some sample data to test, but I've yet
>> to recreate the
>> issue outside of a running system.  For now, I'm adding
>> some logging to
>> see if I can track it down.  I just wanted to pass this on
>> in case any
>> lightbulbs went off.  I'll let you if I find
>> something...
>
> I'm not sure what you mean by "always results in NULL".  What is NULL?
> The return from the original uescape()?  That would indicate invalid
> UTF-8 characters in the input string.
>
> Maybe you mean that the buffer_append_uescape() seems to work, but the
> resulting JSON string later fails to parse correctly.
>
> In my test harness I translated things both ways and tested with strcmp()
> to verify that I got the same results.  However I didn't test with
> non-ASCII characters, so that's likely to be the weakest link.  It may
> be that I'm mangling the UTF-8.
>
> I'm more inclined to think that you have invalid UTF-8 characters in some
> of your MARC records.  You wouldn't know about it because uescape()
> silently discards an entire input string if any of it is invalid.  The
> new function translates as far as it can before it hits the invalid
> characters.  I *think* that the result would still be valid JSON, because
> we only translate the values of strings, which get enclosed by double
> quotes anyway.  However the semantics of that string might be invalid
> at some higher level.
>
> For diagnostic purposes, I suggest either or both of the following:
>
> 1. Check the return code from buffer_append_uescape(), and log a
> message if it's non-zero.
>
> 2. Tweak buffer_append_uescape() so that, if it detects invalid UTF-8,
> it restores the growing_buffer to its original length.  In effect, this
> change would make buffer_append_uescape() behave like uescape() in the
> face of invalid UTF-8.  I can send you a patch for this if you like, but
> I don't know if you would want it based on the original code or on the
> already patched version.
>
> In the long run we should probably think about how best to respond to
> invalid UTF-8.  For now we just want to identify the problem and fix it.
>
> Meanwhile you can get most of the speedup by reverting to the original
> uescape function but otherwise keeping the rewrite of jsonObjectToJSON().
>
> Scott McKellar
> http://home.swbell.net/mck9/ct/
>


OK, I tracked down 2 problems, the first makes sense, the second, not so  
much.

The first problem was the result of attempting to escape quotes (") and  
backslashes (\) in the final if/else block of buffer_append_uescape().   
Since " and \ are both >= ' ', they were not getting escaped.

The second was caused by NULL jsonObject's being passed from  
OSRF_LIST_GET_INDEX (osrf_json_object:337) into add_json_to_buffer().   
osrfList allows NULL entries, so it seems the old could should have been  
dying here, but for some reason only the new code suffers.

Slightly modified patch committed:  
http://svn.open-ils.org/trac/OpenSRF/changeset/1498

Thanks again, Scott.  Eyes on on the latest patch would be appreciated.

-b

-- 
Bill Erickson
| VP, Software Development & Integration
| Equinox Software, Inc. / The Evergreen Experts
| phone: 877-OPEN-ILS (673-6457)
| email: erickson at esilibrary.com
| web: http://esilibrary.com


More information about the Open-ils-dev mailing list