[open-ils-commits] r13134 - in trunk/docs: . TechRef (kgs)
svn at svn.open-ils.org
svn at svn.open-ils.org
Mon May 11 21:16:49 EDT 2009
Author: kgs
Date: 2009-05-11 21:16:48 -0400 (Mon, 11 May 2009)
New Revision: 13134
Added:
trunk/docs/TechRef/
trunk/docs/TechRef/JSONGrammar.xml
trunk/docs/TechRef/TechRef.xml
Log:
Added: trunk/docs/TechRef/JSONGrammar.xml
===================================================================
--- trunk/docs/TechRef/JSONGrammar.xml (rev 0)
+++ trunk/docs/TechRef/JSONGrammar.xml 2009-05-12 01:16:48 UTC (rev 13134)
@@ -0,0 +1,1223 @@
+<?xml version="1.0" encoding="utf-8"?>
+
+<sect1 version="5.0" xmlns="http://docbook.org/ns/docbook"
+ xmlns:xi="http://www.w3.org/2003/XInclude"
+ xmlns:xlink="http://www.w3.org/1999/xlink">
+
+ <title>Grammar of JSON Queries</title>
+
+ <para>
+ <author>
+ <personname>
+ <firstname>Scott</firstname>
+ <surname>McKellar</surname>
+ </personname>
+ <affiliation>
+ <orgname>Equinox Software, Inc.</orgname>
+ </affiliation>
+ </author>
+ </para>
+
+
+ <sect2>
+ <title>Introduction</title>
+ <para> The format of this grammar approximates Extended Backus-Naur notation. However it is
+ intended as input to human beings, not to parser generators such as Lex or Yacc. Do not
+ expect formal rigor. Sometimes narrative text will explain things that are clumsy to
+ express in formal notation. More often, the text will restate or summarize the formal
+ productions. </para>
+ <para> Conventions: </para>
+ <orderedlist>
+ <listitem>
+ <para>The grammar is a series of productions.</para>
+ </listitem>
+ <listitem>
+ <para>A production consists of a name, followed by "::=", followed by a definition
+ for the name. The name identifies a grammatical construct that can appear on the
+ right side of another production.</para>
+ </listitem>
+ <listitem>
+ <para>Literals (including punctuation) are enclosed in 'single quotes', or in
+ "double quotes" if case is not significant.</para>
+ </listitem>
+ <listitem>
+ <para>A single quotation mark within a literal is escaped with a preceding
+ backslash: 'dog\'s tail'.</para>
+ </listitem>
+ <listitem>
+ <para>If a construct can be defined more than one way, then the alternatives may
+ appear in separate productions; or, they may appear in the same production,
+ separated by pipe symbols. The choice between these representations is of only
+ cosmetic significance.</para>
+ </listitem>
+ <listitem>
+ <para>A construct enclosed within square brackets is optional.</para>
+ </listitem>
+ <listitem>
+ <para>A construct enclosed within curly braces may be repeated zero or more
+ times.</para>
+ </listitem>
+ <listitem>
+ <para>JSON allows arbitrary white space between tokens. To avoid ugly clutter, this
+ grammar ignores the optional white space. </para>
+ </listitem>
+ <listitem>
+ <para>In many cases a production defines a JSON object, i.e. a list of name-value
+ pairs, separated by commas. Since the order of these name/value pairs is not
+ significant, the grammar will not try to show all the possible sequences. In
+ general it will present the required pairs first, if any, followed by any
+ optional elements.</para>
+ </listitem>
+ </orderedlist>
+
+ <para> Since both EBNF and JSON use curly braces and square brackets, pay close attention to
+ whether these characters are in single quotes. If they're in single quotes, they are
+ literal elements of the JSON notation. Otherwise they are elements of the EBNF notation.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Primitives</title>
+ <para> We'll start by defining some primitives, to get them out of the way. They're mostly
+ just what you would expect. </para>
+
+ <productionset>
+ <production xml:id="ebnf.string">
+ <lhs> string </lhs>
+ <rhs> '"' chars '"' </rhs>
+ </production>
+
+ <production xml:id="ebnf.chars">
+ <lhs> chars </lhs>
+ <rhs> any valid sequence of UTF-8 characters, with certain special characters
+ escaped according to JSON rules </rhs>
+ </production>
+
+ <production xml:id="ebnf.integer_literal">
+ <lhs> integer_literal </lhs>
+ <rhs> [ sign ] digit { digit } </rhs>
+ </production>
+
+ <production xml:id="ebnf.sign">
+ <lhs> sign </lhs>
+ <rhs> '+' | '-' </rhs>
+ </production>
+
+ <production xml:id="ebnf.digit">
+ <lhs> digit </lhs>
+ <rhs>digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'</rhs>
+ </production>
+
+ <production xml:id="ebnf.integer_string">
+ <lhs> integer_string </lhs>
+ <rhs> '"' integer_literal '"' </rhs>
+ </production>
+
+ <production xml:id="ebnf.integer">
+ <lhs> integer </lhs>
+ <rhs> integer_literal | integer_string </rhs>
+ </production>
+
+ <production xml:id="ebnf.number">
+ <lhs> number </lhs>
+ <rhs> any valid character sequence that is numeric according to JSON rules </rhs>
+ </production>
+
+ </productionset>
+
+ <para> When json_query requires an integral value, it will usually accept a quoted string
+ and convert it to an integer by brute force – to zero if necessary. Likewise it may
+ truncate a floating point number to an integral value. Scientific notation will be
+ accepted but may not give the intended results. </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.boolean">
+ <lhs> boolean </lhs>
+ <rhs> 'true' | 'false' | string | number </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The preferred way to encode a boolean is with the JSON reserved word true or false,
+ in lower case without quotation marks. The string <literal>true</literal>, in upper,
+ lower, or mixed case, is another way to encode true. Any other string evaluates to
+ false. </para>
+ <para> As an accommodation to perl, numbers may be used as booleans. A numeric value of 1
+ means true, and any other numeric value means false. </para>
+ <para> Any other valid JSON value, such as an array, will be accepted as a boolean but
+ interpreted as false. </para>
+ <para> The last couple of primitives aren't really very primitive, but we introduce them
+ here for convenience: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.class_name">
+ <lhs> class_name </lhs>
+ <rhs> string </rhs>
+ </production>
+
+ </productionset>
+
+ <para> A class_name is a special case of a string: the name of a class as defined by the
+ IDL. The class may refer either to a database table or to a source_definition, which is
+ a subquery. </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.field_name">
+ <lhs> field_name </lhs>
+ <rhs> string </rhs>
+ </production>
+
+ </productionset>
+
+ <para> A field_name is another special case of a string: the name of a non-virtual field as
+ defined by the IDL. A field_name is also a column name for the table corresponding to
+ the relevant class. </para>
+
+ </sect2>
+
+ <sect2>
+ <title>Query</title>
+
+ <para> The following production applies not only to the main query but also to most
+ subqueries. </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.query">
+ <lhs> query </lhs>
+ <rhs> '{'<sbr/> '"from"' ':' from_list<sbr/> [ ',' '"select"' ':' select_list
+ ]<sbr/> [ ',' '"where"' ':' where_condition ]<sbr/> [ ',' '"having"' ':'
+ where_condition ]<sbr/> [ ',' '"order_by"' ':' order_by_list ]<sbr/> [ ','
+ '"limit"' ':' integer ]<sbr/> [ ',' '"offset"' ':' integer ]<sbr/> [ ','
+ '"distinct"' ':' boolean ]<sbr/> [ ',' '"no_i18n"' ':' boolean ]<sbr/> '}'
+ </rhs>
+ </production>
+
+ </productionset>
+
+ <para> Except for the <literal>"distinct"</literal> and <literal>no_i18n</literal> entries,
+ each name/value pair represents a major clause of the SELECT statement. The name/value
+ pairs may appear in any order. </para>
+ <para> There is no name/value pair for the GROUP BY clause, because json_query generates it
+ automatically according to information encoded elsewhere. </para>
+ <para> The <literal>"distinct"</literal> entry, if present and true, tells json_query that
+ it may have to create a GROUP BY clause. If not present, it defaults to false. </para>
+ <para> The <literal>"no_i18n"</literal> entry, if present and true, tells json_query to
+ suppress internationalization. If not present, it defaults to false. (Note that
+ <literal>"no_i18n"</literal> contains the digit one, not the letter ell.) </para>
+ <para> The values for <literal>limit</literal> and <literal>offset</literal> provide the
+ arguments of the LIMIT and OFFSET clauses, respectively, of the SQL statement. Each
+ value should be non-negative, if present, or else the SQL won't work. </para>
+
+ </sect2>
+
+ <sect2><title>FROM Clause</title>
+ <para>
+ The object identified by <literal>“from”</literal> encodes the FROM clause of
+ the SQL. The associated value may be a string, an array, or a JSON object.
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.from_list_0">
+ <lhs> from_list </lhs>
+ <rhs> class_name </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ If <literal>from_list</literal> is a <literal>class_name</literal>, the
+ json_query inserts the corresponding table name or subquery into the FROM
+ clause, using the <literal>class_name</literal> as an alias for the table
+ or subquery. The class must be defined as non-virtual in the IDL.
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.from_list_1">
+ <lhs> from_list </lhs>
+ <rhs> '[' string { ',' parameter } ']' </rhs>
+ </production>
+
+ <production xml:id="ebnf.parameter">
+ <lhs> parameter </lhs>
+ <rhs> string | number | 'null' </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ If from_list is a JSON array, then it represents a table-like function from
+ which the SQL statement will select rows, using a SELECT clause consisting
+ of “SELECT *” (regardless of the select_list supplied by the method parameter).
+ </para>
+ <para>
+ The first entry in the array is the name of the function. It must be a string
+ naming a stored function. Each subsequent entry is a function parameter. If
+ it is a string or a number, json_query will insert it into a comma-separated
+ parameter list, enclosed in quotes, with any special characters escaped as needed.
+ If it is the JSON reserved word <literal>null</literal>, json_query will insert
+ it into the parameter list as a null value.
+ </para>
+ <para>
+ If <literal>from_list</literal> is a JSON object, it must contain exactly one entry.
+ The key of this entry must be the name of a non-virtual class defined in the IDL.
+ This class will be the top-level class of the FROM clause, the only one named
+ outside of a JOIN clause.
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.from_list_2">
+ <lhs> from_list </lhs>
+ <rhs> '{' class_name ':' join_list '}' </rhs>
+ </production>
+
+ <production xml:id="ebnf.join_list_0">
+ <lhs> join_list </lhs>
+ <rhs> class_name </rhs>
+ </production>
+
+ <production xml:id="ebnf.join_list_1">
+ <lhs> join_list </lhs>
+ <rhs> '{' join_def { ',' join_def } '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ If the associated data is a <literal>class_name</literal>, json_query will
+ construct an INNER JOIN clause joining the class to the top-level clause,
+ using the columns specified by the IDL for such a join.
+ </para>
+ <para>
+ Otherwise, the associated data must be a JSON object with one or more entries,
+ each entry defining a join:
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.join_def">
+ <lhs> join_def </lhs>
+ <rhs>
+ class_name ':'<sbr/>
+ '{'<sbr/>
+ [ '”type”' ':' string ]<sbr/>
+ [ '”field”' ':' field_name ]<sbr/>
+ [ '”fkey”' ':' field_name ]<sbr/>
+ [ '”filter”' ':' where_condition ]<sbr/>
+ [ '”filter_op”' ':' string ]<sbr/>
+ [ '”join”' ':' join_list ]<sbr/>
+ '}'
+ </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ The data portion of the <literal>“join_type”</literal> entry tells json_query
+ whether to use a left join, right join, full join, or inner join. The values
+ <literal>“left”</literal>, <literal>“right”</literal>, and <literal>“full”</literal>,
+ in upper, lower, or mixed case, have the obvious meanings. If the
+ <literal>“join_type”</literal> entry has any other value, or is not present,
+ json_query constructs an inner join.
+ </para>
+ <para>
+ The <literal>“field”</literal> and <literal>“fkey”</literal> attributes specify the
+ columns to be equated in the join condition. The <literal>“field”</literal>
+ attribute refers to the column in the joined table, i.e. the one named by the
+ <literal>join_def</literal>. The <literal>“fkey”</literal> attribute refers to the
+ corresponding column in the other table, i.e. the one named outside the
+ <literal>join_def</literal> – either the top-level table or a table named by some
+ other <literal>join_def</literal>.
+ </para>
+ <para>
+ It may be tempting to suppose that <literal>“fkey”</literal> stands for “foreign key”,
+ and therefore refers to a column in the child table that points to the key of a
+ parent table. Resist the temptation; the labels are arbitrary. The json_query
+ method doesn't care which table is the parent and which is the child.
+ </para>
+ <para>
+ These relationships are best explained with an example. The following
+ <literal>from_list</literal>:
+ </para>
+
+ <informalexample>
+ <programlisting language="JSON">
+ {
+ "aou": {
+ "asv": {
+ "type" : "left",
+ "fkey" : "id",
+ "field" : "owner"
+ }
+ }
+ }
+ </programlisting>
+ </informalexample>
+
+ <para>
+ ...turns into the following FROM clause:
+ </para>
+
+ <informalexample>
+ <programlisting language="SQL">
+ FROM
+ actor.org_unit AS "aou"
+ LEFT JOIN action.survey AS "asv"
+ ON ( "asv".owner = "aou".id )
+ </programlisting>
+ </informalexample>
+
+ <para>
+ Note in this example that <literal>“fkey”</literal> refers to a column of the
+ class <literal>“aou”</literal>, and <literal>“field”</literal> refers to a
+ column of the class <literal>“asv”</literal>.
+ </para>
+ <para>
+ If you specify only one of the two columns, json_query will try to identify the
+ other one from the IDL. However, if you specify only the column from the parent
+ table, this attempt will probably fail.
+ </para>
+ <para>
+ If you specify both columns, json_query will use the column names you specify,
+ without verifying them with a lookup in the IDL. By this means you can perform
+ a join using a linkage that the IDL doesn't define. Of course, if the columns
+ don't exist in the database, the query will fail when json_query tries to execute it.
+ </para>
+ <para>
+ Using the columns specified, either explicitly or implicitly, the json_query
+ method constructs a join condition. With raw SQL it is possible (though
+ rarely useful) to join two tables by an inequality. However the json_query
+ method always uses a simple equality condition.
+ </para>
+ <para>
+ Using a <literal>“filter”</literal> entry in the join_def, you can apply one
+ or more additional conditions to the JOIN clause, typically to restrict the
+ join to certain rows of the joined table. The data associated with the
+ <literal>“filter”</literal> key is the same sort of
+ <literal>where_condition</literal> that you use for a WHERE clause
+ (discussed below).
+ </para>
+ <para>
+ If the string associated with the <literal>“filter_op”</literal> entry is
+ <literal>“OR”</literal> in upper, lower, or mixed case, then the json_query
+ method uses OR to connect the standard join condition to any additional
+ conditions supplied by a <literal>“filter”</literal> entry.
+ </para>
+ <para>
+ (Note that if the <literal>where_condition</literal> supplies multiple
+ conditions, they will be connected by AND. You will probably want to move
+ them down a layer – enclose them in parentheses, in effect – to avoid a
+ confusing mixture of ANDs and ORs.)
+ </para>
+ <para>
+ If the <literal>“filter_op”</literal> entry carries any other value, or if
+ it is absent, then the json_query method uses AND. In the absence of a
+ <literal>“filter”</literal> entry, <literal>“filter_op”</literal> has no effect.
+ </para>
+ <para>
+ A <literal>“join”</literal> entry in a <literal>join_def</literal> specifies
+ another layer of join. The class named in the subjoin is joined to the class
+ named by the <literal>join_def</literal> to which it is subordinate. By this
+ means you can encode multiple joins in a hierarchy.
+ </para>
+ </sect2>
+
+ <sect2><title>SELECT Clause</title>
+ <para>
+ If a query does not contain an entry for <literal>“select”</literal>, json_query
+ will construct a default SELECT clause. The default includes every non-virtual
+ field from the top-level class of the FROM clause, as defined by the IDL. The
+ result is similar to SELECT *, except:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>The default includes only the fields defined in the IDL.</para>
+ </listitem>
+ <listitem>
+ <para>The columns will appear in the same order in which they appear in the IDL,
+ regardless of the order in which the database defines them.</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ There are other ways to specify a default SELECT list, as shown below.
+ </para>
+ <para>
+ If a <literal>"select"</literal> entry is present, the associated value must
+ be a JSON object, keyed on class names:
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.select_list">
+ <lhs> select_list </lhs>
+ <rhs> '{' class_name ':' field_list { ',' class_name ':' field_list } '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ The <literal>class_name</literal> must identify either the top-level class or
+ a class belonging to one of the joins. Otherwise json_query will silently
+ ignore the <literal>select_list</literal>.
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.field_list_0">
+ <lhs> field_list </lhs>
+ <rhs> 'null' | '”*”' </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ If a field_list is either the JSON reserved word <literal>null</literal>
+ (in lower case) or an asterisk in double quotes, json_query constructs a
+ default SELECT list – provided that the class is the top-level class of the
+ query. If the class belongs to a join somewhere, json_query ignores the
+ <literal>field_list</literal>.
+ </para>
+ <para>
+ More commonly, the <literal>field_list</literal> is a JSON array of zero or
+ more field specifications:
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.field_list_1">
+ <lhs> field_list </lhs>
+ <rhs> '[' [ field_spec { ',' field_spec } ] ']' </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ If the array is empty, json_query will construct a default SELECT list for
+ the class – again, provided that the class is the top-level class in the query.
+ </para>
+ <para>
+ In the simplest case, a field specification may name a non-virtual field
+ defined in the IDL:
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.field_spec_0">
+ <lhs> field_spec </lhs>
+ <rhs> field_name </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ In some cases json_query constructs a call to the
+ <literal>oils_i18n_xlate</literal> function to internationalize the value of the
+ selected column. Specifically, it does so if all the following are true:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>the settings file defines a locale;</para>
+ </listitem>
+ <listitem>
+ <para>in the field definition for the field in the IDL, the tag
+ <literal>“il8n”</literal> is present and true;</para>
+ </listitem>
+ <listitem>
+ <para>the query does <emphasis>not</emphasis> include the
+ <literal>"no_il8n"</literal> tag (or includes it with a value of false).</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ A field specification may be a JSON object:
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.field_spec_1">
+ <lhs> field_spec </lhs>
+ <rhs>
+ '{'<sbr/>
+ '”column”' ':' <sbr/>
+ [ ',' '”alias”' ':' string ]<sbr/>
+ [ ',' '”aggregate”' ':' boolean ]<sbr/>
+ [ ',' transform_spec ]<sbr/>
+ '}'
+ </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ The <literal>“column”</literal> entry provides the column name, which must
+ be defined as non-virtual in the IDL.
+ </para>
+ <para>
+ The <literal>“alias”</literal> entry provides a column alias. If no alias
+ is specified, json_query uses the column name as its own alias.
+ </para>
+ <para>
+ The <literal>“aggregate”</literal> entry has no effect on the SELECT clause
+ itself. Rather, it affects the construction of a GROUP BY class. If there
+ is an <literal>“aggregate”</literal> entry for any field, then json_query builds
+ a GROUP BY clause listing every column that is <emphasis>not</emphasis> tagged
+ for aggregation (or that carries an <literal>“aggregate”</literal> entry with
+ a value of false). If <emphasis>all</emphasis> columns are tagged for
+ aggregation, then json_query omits the GROUP BY clause.
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.transform_spec_0">
+ <lhs> transform_spec </lhs>
+ <rhs>
+ '”transform”' ':' string ]<sbr/>
+ [ ',' '”result_field” ':' string ]<sbr/>
+ [ ',' '”params” ':' param_list ]
+ </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ When a <literal>transform_spec</literal> is present, json_query selects the
+ return value of a function instead of selecting the column directly. The entry
+ for <literal>“transform”</literal> provides the name of the function, and the
+ column name (as specified by the <literal>“column”</literal> tag), qualified by
+ the class name, is the argument to the function. For example, you might use such
+ a function to format a date or time, or otherwise transform a column value.
+ You might also use an aggregate function such as SUM, COUNT, or MAX (possibly
+ together with the <literal>“aggregate”</literal> tag).
+ </para>
+ <para>
+ The <literal>“result_field”</literal> entry, when present, specifies a subcolumn
+ of the function's return value. The resulting SQL encloses the function call
+ in parentheses, and follows it with a period and the subcolumn name.
+ </para>
+ <para>
+ The <literal>“params”</literal> entry, if present, provides a possibly empty
+ array of additional parameter values, either strings, numbers, or nulls:
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.param_list">
+ <lhs> param_list </lhs>
+ <rhs> '[' [ parameter { ',' parameter } ] ']' </rhs>
+ </production>
+
+ </productionset>
+
+ <para>
+ Such parameter values are enclosed in single quotes, with any special characters
+ escaped as needed, and inserted after the column name as additional parameters
+ to the function. You might, for example, use an additional parameter to provide
+ a format string for a reformatting function.
+ </para>
+ </sect2>
+
+ <sect2><title>WHERE Clause</title>
+ <para> There are two types of <literal>where_condition</literal>: objects and arrays.
+ Of these, the object type is the more fundamental, and occurs at some level in every
+ <literal>where_condition</literal>. The array type is mainly a way of circumventing
+ a limitation of the object type. </para>
+ <para> The object type of <literal>where_condition</literal> is a comma-separated list
+ of one or more <literal>conditions</literal>: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.where_condition_0">
+ <lhs> where_condition </lhs>
+ <rhs> '{' condition { ',' condition } '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The generated SQL will include a code fragment for each <literal>condition</literal>,
+ joined by AND (or in some cases by OR, as described below). </para>
+ <para> As usual for entries in a JSON object, each <literal>condition</literal> consists
+ of a unique string to serve as a key, a colon, and an associated value. </para>
+ <para> The key string may be the name of a column belonging to the relevant table, or
+ it may be an operator string. In order to distinguish it from any possible column
+ name, an operator string always begins with a plus sign or minus sign. </para>
+ <para> JSON requires that every key string be unique within an object. This requirement
+ imposes some awkward limitations on a JSON query. For example, you might want to
+ express two conditions for the same column: id > 10 and id != 25. Since each of
+ those conditions would have the same key string, namely “id”, you can't put them
+ into the same JSON object. </para>
+ <para> The solution is to put such conflicting conditions in separate JSON objects, and
+ put the objects into an array: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.where_condition_1">
+ <lhs> where_condition </lhs>
+ <rhs> '[' where_condition { ',' where_condition } ']' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The resulting SQL encloses each subordinate set of <literal>conditions</literal>
+ in parentheses, and connects the sets with AND (or in some cases OR, as described
+ below). It's possible to put only a single <literal>where_condition</literal> in
+ the array; the result is to add a layer of parentheses around the condition. </para>
+ <para> There are two kinds of <literal>condition</literal> where the operator begins
+ with a plus sign. In the simpler case, the associated data is simply a column name:
+ </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.condition_0">
+ <lhs> condition </lhs>
+ <rhs> plus_class ':' field_name </rhs>
+ </production>
+
+ </productionset>
+
+ <para> A <literal>plus_class</literal> is a string that begins with a plus sign.
+ The rest of the string, after the plus sign, must be the class name for the table
+ to which the column belongs. </para>
+ <para> If the column is a boolean, then the resulting SQL uses it (as qualified by the
+ class name) as a stand-alone condition. </para>
+ <para> Otherwise, this kind of syntax provides a way to place a column on the right side
+ of a comparison operator. For example: </para>
+
+ <informalexample>
+ <programlisting language="JSON">
+ {
+ "from":"aou",
+ "select": { "aou":[ "id", "name" ] },
+ "where": {
+ "id": {
+ ">": { "+aou":"parent_ou" }
+ }
+ }
+ }
+ </programlisting>
+ </informalexample>
+
+ <para> The resulting SQL: </para>
+
+ <informalexample>
+ <programlisting language="SQL">
+ SELECT
+ "aou".id AS "id",
+ "aou".name AS "name"
+ FROM
+ actor.org_unit AS "aou"
+ WHERE
+ (
+ "aou".id > ( "aou".parent_ou )
+ );
+ </programlisting>
+ </informalexample>
+
+ <para> The other type of <literal>condition</literal> that uses a
+ <literal>plus_class</literal> applies a specified class name to a
+ <literal>where_condition</literal>: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.condition_1">
+ <lhs> condition </lhs>
+ <rhs> plus_class ':' where_condition </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The resulting SQL is enclosed in parentheses, and qualifies the columns with
+ the specified class name. This syntax provides a mechanism to shift the class
+ context – i.e. to refer to one class in a context that would otherwise refer to
+ a different class. </para>
+ <para> Ordinarily the class name must be a valid non-virtual class defined in the IDL,
+ and applicable to the associated <literal>where_condition</literal>. There is at
+ least one peculiar exception. The JSON fragment: </para>
+
+ <informalexample>
+ <programlisting language="JSON">
+ "+abc": { "+xyz":"frobozz" }
+ </programlisting>
+ </informalexample>
+
+ <para> ...is rendered as: </para>
+
+ <informalexample>
+ <programlisting language="SQL">
+ ( "xyz".frobozz )
+ </programlisting>
+ </informalexample>
+
+ <para> ...even though neither <literal>“abc”</literal>, nor <literal>“xyz”</literal>,
+ nor <literal>“frobozz”</literal> is defined in the IDL. The class name
+ <literal>“abc”</literal> isn't used at all because the <literal>“+xyz”</literal>
+ operator overrides it. Such a query won't fail until json_query tries
+ to execute it in the database. </para>
+ <para> The other operators that may occur at this level all begin with a minus sign,
+ and they all represent familiar SQL operators. For example, the
+ <literal>“-or”</literal> operator joins the conditions within a
+ <literal>where_condition</literal> by OR (instead of the default AND), and
+ encloses them all in parentheses: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.condition_2">
+ <lhs> condition </lhs>
+ <rhs> '”-or”' ':' where_condition </rhs>
+ </production>
+
+ </productionset>
+
+ <para> In fact the <literal>“-or”</literal> operator is the only way to get OR into
+ the WHERE clause. </para>
+ <para> The <literal>“-and”</literal> operator is similar, except that it uses AND: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.condition_3">
+ <lhs> condition </lhs>
+ <rhs> '”-and”' ':' where_condition </rhs>
+ </production>
+
+ </productionset>
+
+ <para> Arguably the <literal>“-and”</literal> operator is redundant, because you can
+ get the same effect by wrapping the subordinate <literal>where_condition</literal>
+ in a JSON array. Either technique merely adds a layer of parentheses, since AND
+ connects successive conditions by default. </para>
+ <para> The <literal>“-not”</literal> operator expands the subordinate
+ <literal>where_condition</literal> within parentheses, and prefaces the result
+ with NOT: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.condition_4">
+ <lhs> condition </lhs>
+ <rhs> '”-not”' ':' where_condition </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The <literal>“-exists”</literal> or <literal>“-not-exists”</literal> operator
+ constructs a subquery within an EXISTS or NOT EXISTS clause, respectively: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.condition_5">
+ <lhs> condition </lhs>
+ <rhs> '”-exists”' ':' query </rhs>
+ </production>
+
+ <production xml:id="ebnf.condition_6">
+ <lhs> condition </lhs>
+ <rhs> '”-not-exists”' ':' query </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The remaining kinds of <literal>condition</literal> all have a
+ <literal>field_name</literal> on the left and some kind of <literal>predicate</literal>
+ on the right. A <literal>predicate</literal> places a constraint on the value of
+ the column – or, in some cases, on the value of the column as transformed by some
+ function call: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.condition_7">
+ <lhs> condition </lhs>
+ <rhs> field_name ':' predicate </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The simplest such constraint is to require that the column have a specific value,
+ or be null: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_0">
+ <lhs> predicate </lhs>
+ <rhs> lit_value | 'null' </rhs>
+ </production>
+
+ <production xml:id="ebnf.lit_value">
+ <lhs> lit_value </lhs>
+ <rhs> string | number </rhs>
+ </production>
+
+ </productionset>
+
+ <para> You can also compare a column to a literal value using some kind of inequality.
+ However it's a bit more complicated because you have to specify what kind of comparison
+ to make: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.compare_op_0">
+ <lhs> predicate </lhs>
+ <rhs> '{' compare_op ':' lit_value '}' </rhs>
+ </production>
+
+ <production xml:id="ebnf.compare_op_1">
+ <lhs> compare_op </lhs>
+ <rhs> string </rhs>
+ </production>
+
+ </productionset>
+
+ <para> A <literal>compare_op</literal> is a string that defines a comparison operator.
+ Valid values include the following: </para>
+
+ <programlisting language="SQL">
+ = <> !=
+ < > <= >=
+ ~ ~* !~ !~*
+ like ilike
+ similar to
+ </programlisting>
+
+ <para> Strictly speaking, json_query accepts any <literal>compare_op</literal>
+ that doesn't contain semicolons or white space (or
+ <literal>“similar to”</literal> as a special exception). As a result, it
+ is possible – and potentially useful – to use a custom operator like
+ <literal>“>100*”</literal> in order to insert an expression that would
+ otherwise be difficult or impossible to create through a JSON query. The ban
+ on semicolons and white space prevents certain kinds of SQL injection. </para>
+ <para> Note that json_query does <emphasis>not</emphasis> accept two operators that
+ PostgreSQL <emphasis>does</emphasis> accept: <literal>“is distinct from”</literal>
+ and <literal>“is not distinct from”</literal>. </para>
+ <para> You can also compare a column to a null value: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_1">
+ <lhs> predicate </lhs>
+ <rhs> '{' compare_op ':' 'null' '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The equality operator <literal>“=”</literal> turns into IS NULL. Any other
+ operator turns into IS NOT NULL. </para>
+ <para> When a <literal>compare_op</literal> is paired with an array, it defines a
+ function call: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_2">
+ <lhs> predicate </lhs>
+ <rhs> '{' compare_op ':' '[' string { ',' parameter } ']' '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The first entry in the array is the function's name. Subsequent entries in
+ the array, if any, represent the parameters of the function call. They may be
+ strings, numbers, or nulls. In the generated SQL, the function call appears on
+ the right of the comparison. </para>
+ <para> The <literal>“between”</literal> operator creates a BETWEEN clause: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_3">
+ <lhs> predicate </lhs>
+ <rhs> '{' “between” ':' '[' lit_value ',' lit_value ']' '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> Although PostgreSQL allows a null value in a BETWEEN clause, json_query
+ requires literal non-null values. It isn't sensible to use null values in a
+ BETWEEN clause. A few experiments show that the results of the comparison are
+ peculiar and erratic. </para>
+ <para> There are two ways to create an IN list of allowed values. The simplest is
+ to put literal values into a JSON array: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_4">
+ <lhs> predicate </lhs>
+ <rhs> '[' lit_value { ',' lit_value } ']' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> As with BETWEEN clauses, json_query does not accept null values in an IN list,
+ even though PostgreSQL does allow them. Nulls are not sensible in this context
+ because they never match anything. </para>
+ <para> </para>
+ </sect2>
+
+ <sect2><title>Having Clause</title>
+ <para>For the HAVING clause, json_query accepts exactly the same syntax as it accepts for
+ the WHERE clause.</para>
+ <para> The other way to create an IN list is to use an explicit
+ <literal>“in”</literal> operator with an array of literal values. This format
+ also works for the <literal>“not in”</literal> operator: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_5">
+ <lhs> predicate </lhs>
+ <rhs> '{' in_operator ';' '[' lit_value [ ',' lit_value ] ']' '}' </rhs>
+ </production>
+
+ <production xml:id="ebnf.in_operator">
+ <lhs> in_operator </lhs>
+ <rhs> “in” | “not in” </rhs>
+ </production>
+
+ </productionset>
+
+ <para> Another kind of IN or NOT IN clause uses a subquery instead of a list of
+ values: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_6">
+ <lhs> predicate </lhs>
+ <rhs> '{' in_operator ':' query '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The remaining types of <literal>predicate</literal> can put a function call on
+ the left of the comparison, by using a <literal>transform_spec</literal> together
+ with a <literal>“value”</literal> tag. The <literal>transform_spec</literal> is
+ optional, and if you don't need it, the same SQL would in many cases be easier to
+ express by other means. </para>
+ <para> The <literal>transform_spec</literal> construct was described earlier in
+ connection with the SELECT clause, but here it is again: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.transform_spec_1">
+ <lhs> transform_spec </lhs>
+ <rhs>
+ '”transform”' ':' string ]<sbr/>
+ [ ',' '”result_field” ':' string ]<sbr/>
+ [ ',' '”params” ':' param_list ]
+ </rhs>
+ </production>
+
+ </productionset>
+
+ <para> As in the SELECT clause, the <literal>“transform”</literal> string names the
+ function. The first parameter is always the column identified by the field_name.
+ Additional parameters, if any, appear in the <literal>param_list</literal>. The
+ <literal>“result_field”</literal> string, if present, identifies one column of a
+ multicolumn return value. </para>
+ <para> Here's a second way to compare a value to a literal value (but not to a null
+ value): </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_7">
+ <lhs> predicate </lhs>
+ <rhs> '{' compare_op ':' '{' '”value”' ':' lit_value<sbr/>
+ [ transform_spec ] '}' '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> ...and a way to compare a value to a boolean expression: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_8">
+ <lhs> predicate </lhs>
+ <rhs> '{' compare_op ':' '{' '”value”' ':' '{'<sbr/>
+ condition { ',' condition } [ transform_spec ] '}' '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The final predicate is another way to put a function call on the right side
+ of the comparison: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.predicate_9">
+ <lhs> predicate </lhs>
+ <rhs> '{' compare_op ':' '{' '”value”' ':' '['<sbr/>
+ string { ',' parameter } ']' [ transform_spec ] '}' '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> This format is available for the sake of consistency, but offers no advantage
+ over the simpler version. </para>
+ </sect2>
+
+ <sect2><title>ORDER BY Clause</title>
+ <para> There are two ways to encode an ORDER BY clause: as an array, or as a list.
+ Either may be empty, in which case the generated SQL will not include an ORDER BY
+ clause: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.order_by_list_0">
+ <lhs> order_by_list </lhs>
+ <rhs> '[' ']' | '{' '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> If not empty, the array contains one or more objects, each defining a sort
+ field: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.order_by_list_1">
+ <lhs> order_by_list </lhs>
+ <rhs> '{' sort_field_def { ',' sort_field_def } '}' </rhs>
+ </production>
+
+ <production xml:id="ebnf.sort_field_def">
+ <lhs> sort_field_def </lhs>
+ <rhs> '{'<sbr/>
+ '”class”' ':' class_name<sbr/>
+ ',' '”field”' ':' field_name<sbr/>
+ [ ',' '”direction”' ':' lit_value ]<sbr/>
+ [ ',' transform_spec ]<sbr/>
+ '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> The <literal>“class”</literal> and <literal>“field”</literal> entries are
+ required, and of course the field must belong to the class. Furthermore, at
+ least one field from the class must appear in the SELECT clause. </para>
+ <para> The <literal>“direction”</literal> entry, if present, specifies whether the
+ sort will be ascending or descending for the column in question. If the associated
+ value begins with “D” or “d”, the sort will be descending; otherwise the sort will
+ be ascending. If the value is a number, it will be treated as a string that does not
+ start with “D” or “d”, resulting in an ascending sort. </para>
+ <para> In the absence of a <literal>“direction”</literal> entry, the sort will be
+ ascending. </para>
+ <para> The <literal>transform_spec</literal> works here the same way it works in the
+ SELECT clause and the WHERE clause, enabling you to pass the column through a
+ transforming function before the sort: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.transform_spec_2">
+ <lhs> transform_spec </lhs>
+ <rhs>
+ '”transform”' ':' string ]<sbr/>
+ [ ',' '”result_field” ':' string ]<sbr/>
+ [ ',' '”params” ':' param_list ]
+ </rhs>
+ </production>
+
+ </productionset>
+
+ <para> When the <literal>order_by_list</literal> is an object instead of an array,
+ the syntax is less verbose, but also less flexible. The keys for the object are
+ class names: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.order_by_list_2">
+ <lhs> order_by_list </lhs>
+ <rhs> '{' class_name ':' sort_class_def<sbr/>
+ { ',' class_name ':' sort_class_def } '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> Each class must be referenced in the SELECT clause. </para>
+ <para> As in the SELECT clause, all the fields for a given class must be grouped
+ together. You can't sort by a column from one table, then a column from a second
+ table, then another column from the first table. If you need this kind of sort,
+ you must encode the ORDER BY clause as an array instead of an object. </para>
+ <para> The data associated with a <literal>class_name</literal> may be either an array
+ or an object. If an array, it's simply a list of field names, and each field must
+ belong to the class: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.sort_class_def_0">
+ <lhs> sort_class_def </lhs>
+ <rhs> '[' field_name { ',' field_name } ']' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> With this syntax, the direction of sorting will always be ascending. </para>
+ <para> If the data is an object, the keys are field names, and as usual the fields
+ must belong to the class: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.sort_class_def_1">
+ <lhs> sort_class_def </lhs>
+ <rhs> '{' field_name ':' sort_class_subdef<sbr/>
+ { ',' field_name ':' sort_class_subdef } '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> Since the <literal>field_name</literal> is the key for the object, it cannot
+ appear more than once. As a result, some kinds of sorts are not possible with this
+ syntax. For example, one might want to sort by UPPER( family_name ), and then by
+ family_name with case unchanged, to make sure that “diBona” comes before “Dibona”.
+ For situations like this, you must encode the ORDER BY clause as an array rather
+ than an object. </para>
+ <para> The data associated with each <literal>field_name</literal> may take either of
+ two forms. In the simplest case, it's a literal value to specify the direction
+ of sorting: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.sort_class_subdef_0">
+ <lhs> sort_class_subdef </lhs>
+ <rhs> lit_value </rhs>
+ </production>
+
+ </productionset>
+
+ <para> If the literal is a string starting with “D” or “d”, json_query sorts the field
+ in descending order. Otherwise it sorts the field in ascending order. </para>
+ <para> In other cases, the <literal>field_name</literal> may be paired with an object
+ to specify more details: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.sort_class_subdef_1">
+ <lhs> sort_class_subdef </lhs>
+ <rhs> '{'<sbr/>
+ [ '”direction”' ':' lit_value ]<sbr/>
+ [ transform_spec ]<sbr/>
+ '}' </rhs>
+ </production>
+
+ </productionset>
+
+ <para> As before, the value tagged as <literal>“direction”</literal> specifies the
+ direction of the sort, depending on the first character. If not otherwise
+ specified, the sort direction defaults to ascending. </para>
+ <para> Also as before, the <literal>transform_spec</literal> may specify a function
+ through which to pass the column. </para>
+ <para> Since both the <literal>“direction”</literal> tag and the
+ <literal>transform_spec</literal> are optional, the object may be empty: </para>
+
+ <productionset>
+
+ <production xml:id="ebnf.sort_class_subdef_2">
+ <lhs> sort_class_subdef </lhs>
+ <rhs> '{' '}' </rhs>
+ </production>
+
+ </productionset>
+ </sect2>
+
+</sect1>
Added: trunk/docs/TechRef/TechRef.xml
===================================================================
--- trunk/docs/TechRef/TechRef.xml (rev 0)
+++ trunk/docs/TechRef/TechRef.xml 2009-05-12 01:16:48 UTC (rev 13134)
@@ -0,0 +1,12 @@
+<?xml version="1.0" encoding="utf-8"?>
+
+<chapter version="5.0" xmlns="http://docbook.org/ns/docbook"
+ xmlns:xi="http://www.w3.org/2003/XInclude"
+ xmlns:xlink="http://www.w3.org/1999/xlink">
+
+ <title>Evergreen Technical Reference</title>
+
+<xi:include href="JSONGrammar.xml"
+ xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+</chapter>
More information about the open-ils-commits
mailing list