[OPEN-ILS-DEV] PATCH: osrf_json_utils.h (scientific notation)

Mike Rylander mrylander at gmail.com
Wed Dec 12 23:08:17 EST 2007


On Dec 11, 2007 9:18 PM, Scott McKellar <mck9 at swbell.net> wrote:
> This tiny patch enables the JSON parser to accept a capital 'E' as
> well as a lower case 'e' in a number expressed in scientific notation.
>
> If there is some reason to insist on lower case -- which I doubt --
> then disregard this patch.

I can think of no reason.  Applied.

>
> -----------
>
> As long as we're on the subject of numeric representations, I'm
> going to raise an issue that has been bothering me for a while.
>
> When we parse a number within a JSON string, we immediately convert it
> into a double and store it as such within the jsonObject.  Due to the
> inherent imprecision of floating point, this conversion may change
> the numeric value.
>

This has always bugged me too, so I'm all for a plan that fixes this issue.

[snip examples]

> For example, we might want to filter the JSON text so as to remove
> certain parts of it, leaving the rest intact.  In such a case we
> would want to preserve the original content unchanged.
>

This (not using the number in calculations) is nearly always the case
for Evergreen apps.

> Accordingly, I propose that numeric values be stored as character
> strings, in order to preserve the original values exactly.  In a
> context where we don't need to do any arithmetic on it, we can
> extract it in the original format.  Only when necessary would we
> convert it to a numeric value.  This way we would avoid losing
> information when we don't have to.
>

And only return a converted copy, keeping the original until it is
later changed with jsonObjectSetNumber().

> The object type would still be JSON_NUMBER, but the internal
> representation would be different.
>
> The strategy would be:
>
> 1. Identify all locations that access the value.n member of a
> jsonObject.
>

This only happens inside libopensrf, so that's good.

> 2. Rewrite them to call either jsonObjectToSimpleString() or
> jsonObjectGetNumber(), according to circumstance.
>

I think jsonObjectGetNumber() will be correct in all cases.

> 3. Modify osrf_json_object.c as needed to store the number as a
> character string.
>

So, we let jsonObjectGetNumber() continue to return a double, but
create that double at call time instead of just retrieving it from the
(now non-existant) value.n member.

In addition to the double version, we should be able to retrieve a
native int (as well as send a float with a decimal value of .0 instead
of forcing to an integer).  So, we need be able to see if we are
working with a float or an int.  To do that, we test for the existence
of '.', 'e', or 'E' in the string.  This test can be a function called
jsonObjectNumericType(), and can do the calculation on demand.

If it's an int, then we can still call jsonObjectGetNumber() and get
back a double, but there should also be a jsonObjectGetInt(),
returning an int, so that when we know a priori the type of the number
value (or we've tested it with jsonObjectNumericType() function) and
we know we're dealing with an integer, then we don't need to do the
silly (int)foo cast and we can just use atoi internally.

We'd also want a jsonObjectSetInt() analog to the
jsonObjectSetNumber() function.  SetNumber would always serialize to a
float, and SetInt would always serialize to a decimal number.

Thinking a little further, jsonObjectGetInt() could just be a wrapper
to jsonObjectGetNumber() that applies the (int) cast to the returned
double.

Thoughts?

> 4. Eliminate the value.n member altogether.  If we overlooked any
> spots that used it, we'll find them in the next build.
>

Yes.

> Admitted drawbacks:
>
> 1. Storing numbers as character strings would require us to allocate
> and free buffers, incurring additional overhead.
>

Luckily, we're already building a buffer while parsing the string, for
the purpose of converting to a double, so no extra cost there.

> 2. We may not want to take these measures until the legacy JSON
> code is gone, so that we wouldn't have to change things in both
> places.
>

I /think/ we can skip this, because all existing code uses the new
JSON format, except for the hybrid gateway apache module.  In that
case it uses the legacy JSON code simply for parsing and serializing
to legacy clients, so leaving that behavior as-is is really OK.
What's more is we're moving all client code to the new JSON format in
on go as far as the ILS code is concerned, and even so, the existing
behavior does not cause any problems now.

> Scott McKellar
> http://home.swbell.net/mck9/ct/



-- 
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker at esilibrary.com
 | web:  http://www.esilibrary.com


More information about the Open-ils-dev mailing list