[OPEN-ILS-DEV] PATCH: osrf_json_utils.h (scientific notation)

Sun Dec 16 08:58:04 EST 2007

Scott McKellar wrote:
> I've been working on this issue today and I'm about halfway there.
> I think.  I hope to post some patches tomorrow but I make no promises.
>
> There are a number of judgement calls about how to handle various
> situations.  I will discuss them in some detail when I post.
>
> Meanwhile I'd like to articulate a point of view that I think should
> be considered.  I'm not sure how much I agree with this point of view
> but it may be useful to raise the issues.
>
> Begin manifesto:
>
> JSON is structured text.  Period.  It defines a set of well-formed
> sequences of characters.  That's all it does.
>
> The semantics of that text are entirely the province of the
> application.  JSON doesn't care what tags you use or how you arrange
> them or what kinds of values may validly be assigned to them or what
> those values mean.  The application does that.
>
> JSON defines a number token as a sequence of characters fitting a
> particular pattern.  The mathematical significance of those character
> sequences is none of JSON's concern.  That's for the application
> to worry about.
>
> Implication: the JSON-related code should not translate between
> character strings and any of the various ways of representing numbers,
> be they ints, longs, doubles, or whatever.  The JSON-related code 
> should simply deliver character strings to the application, and 
> receive character strings from the application.  Let the application
> decide how to do that translation.
>
> If the JSON-related code tries to do the numerical translations, it
> will inevitably make decisions that may be appropriate for some
> applications -- even most applications -- but not for others.
>
> What sort of numeric representation do we want to use?  Ints?  Longs?
> Long longs?  Doubles?  Long doubles?  Do we support infinities and
> NaNs?  What limits do we apply?  What if the application wants to use
> some kind of high-precision library that supports thousands of digits?
>
> If we find ourselves converting a floating point type to an integral
> type, how do we handle the rounding?  Round up or down?  What about
> negatives -- toward or away from zero?  Or maybe banker's rounding?
> What do we do if the floating point number is too big (or too small)
> to be converted to an integral type?
>
> If we convert a number to a string, how many digits do we retain to
> the right of the decimal?  Do we include leading or trailing zeros?
> If it happens to be an integer, do we tack a decimal point on the
> end?  Do we use periods or commas for the decimal point?  Do we
> support hex and octal?  Or do we just take whatever we get from 
> snprintf() and its relatives?
>
> We can answer all these questions in ways that will work for Evergreen.
> However OSRF should not be constrained by the peculiarities of
> Evergreen, even if those peculiarities aren't very peculiar.
>
> Conclusion: OSRF should not include such functions as
> jsonObjectGetNumber() and jsonObjectSetNumber().  All interfaces to
> the JSON-related functions should encode numeric values solely 
> as character strings.
>
> Certainly it may be convenient to have something like
> jsonObjectGetNumber().  However any such convenience should be
> implemented within Evergreen, or whatever other application, built 
> upon the public interface of the JSON code.
>
> We won't get to that point immediately, and it will take a fair
> amount of work to get there.  Fortunately Evergreen reportedly does
> very little arithmetic on any data stored in jsonObjects, so the
> changes will probably be pretty manageable.  The result will be a
> cleaner, more generic API for OSRF.
>
> ...end of manifesto.
>
> As expressed above, this point of view may be a bit extreme.  In
> particular, it is appropriate for OSRF to handle some of the semantic
> issues, namely those surrounding UTF8.  It does not follow that it
> should handle the issues surrounding numeric representation.
>   

In principle, I agree with the manifesto.  The obvious problem is that 
the (potentially flawed) logic of converting a JSON number into an 
integer has to live somewhere.  Naturally, the application could better 
anticipate the type of data it's expecting, but if each application has 
to implement its own string-to-number logic, it seems like that's 
encouraging a lot of code duplication.

Just a thought: what if the JSON number functions reported any loss of 
precision?

int num;
int status = jsonToNumber(obj, &num);

Then, let the application decide how it wants to proceed after receiving 
a bad status. 
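
To make the idea a bit more concrete, here is a rough sketch of what a
status-reporting conversion might look like for the integer case.  It
assumes the number is handed over as its JSON text.  The function name
text_to_long() and the status codes are made up for illustration; none
of this exists in OSRF today.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes; not part of the OSRF API. */
enum num_status {
    NUM_OK = 0,        /* text converted exactly */
    NUM_OUT_OF_RANGE,  /* value does not fit in a long */
    NUM_NOT_INTEGER,   /* fraction, exponent, or other trailing characters */
    NUM_NOT_NUMERIC    /* no digits found, or a NULL argument */
};

/*
 * Convert numeric text to a long, writing to *out only when the
 * conversion is exact.  Returns one of the status codes above.
 *
 * Note: strtol() is more lenient than the JSON grammar (it accepts
 * leading whitespace and a leading '+'), so this assumes the text
 * has already been validated as a JSON number token.
 */
int text_to_long(const char *text, long *out)
{
    char *end = NULL;
    long value;

    if (text == NULL || out == NULL)
        return NUM_NOT_NUMERIC;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text)
        return NUM_NOT_NUMERIC;   /* nothing was consumed */
    if (errno == ERANGE)
        return NUM_OUT_OF_RANGE;  /* overflow or underflow */
    if (*end != '\0')
        return NUM_NOT_INTEGER;   /* e.g. "1.5" or "1e3" */

    *out = value;
    return NUM_OK;
}
```

Note that this sketch reports "1e3" as NUM_NOT_INTEGER even though its
value is a whole number.  Whether to accept exponents here would be one
more of the judgement calls Scott mentions.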

Is it always possible to know when a loss of precision has occurred?

This would still require a good bit of work, but it provides a common 
framework for applications to use.

-bill

-- 
Bill Erickson
| VP, Software Development & Integration
| Equinox Software, Inc. / The Evergreen Experts
| phone: 877-OPEN-ILS (673-6457)
| email: erickson at esilibrary.com
| web: http://esilibrary.com