[open-ils-commits] r13134 - in trunk/docs: . TechRef (kgs)

svn at svn.open-ils.org svn at svn.open-ils.org
Mon May 11 21:16:49 EDT 2009


Author: kgs
Date: 2009-05-11 21:16:48 -0400 (Mon, 11 May 2009)
New Revision: 13134

Added:
   trunk/docs/TechRef/
   trunk/docs/TechRef/JSONGrammar.xml
   trunk/docs/TechRef/TechRef.xml
Log:


Added: trunk/docs/TechRef/JSONGrammar.xml
===================================================================
--- trunk/docs/TechRef/JSONGrammar.xml	                        (rev 0)
+++ trunk/docs/TechRef/JSONGrammar.xml	2009-05-12 01:16:48 UTC (rev 13134)
@@ -0,0 +1,1223 @@
+<?xml version="1.0" encoding="utf-8"?>
+
+<sect1 version="5.0" xmlns="http://docbook.org/ns/docbook"
+	xmlns:xi="http://www.w3.org/2003/XInclude" 
+	xmlns:xlink="http://www.w3.org/1999/xlink">
+
+	<title>Grammar of JSON Queries</title>
+
+	<para>
+		<author>
+			<personname>
+				<firstname>Scott</firstname>
+				<surname>McKellar</surname>
+			</personname>
+			<affiliation>
+				<orgname>Equinox Software, Inc.</orgname>
+			</affiliation>
+		</author>
+	</para>
+
+
+	<sect2>
+		<title>Introduction</title>
+		<para> The format of this grammar approximates Extended Backus-Naur notation. However it is
+			intended as input to human beings, not to parser generators such as Lex or Yacc. Do not
+			expect formal rigor. Sometimes narrative text will explain things that are clumsy to
+			express in formal notation. More often, the text will restate or summarize the formal
+			productions. </para>
+		<para> Conventions: </para>
+		<orderedlist>
+			<listitem>
+				<para>The grammar is a series of productions.</para>
+			</listitem>
+			<listitem>
+				<para>A production consists of a name, followed by "::=", followed by a definition
+					for the name. The name identifies a grammatical construct that can appear on the
+					right side of another production.</para>
+			</listitem>
+			<listitem>
+				<para>Literals (including punctuation) are enclosed in 'single quotes', or in
+					"double quotes" if case is not significant.</para>
+			</listitem>
+			<listitem>
+				<para>A single quotation mark within a literal is escaped with a preceding
+					backslash: 'dog\'s tail'.</para>
+			</listitem>
+			<listitem>
+				<para>If a construct can be defined more than one way, then the alternatives may
+					appear in separate productions; or, they may appear in the same production,
+					separated by pipe symbols. The choice between these representations is of only
+					cosmetic significance.</para>
+			</listitem>
+			<listitem>
+				<para>A construct enclosed within square brackets is optional.</para>
+			</listitem>
+			<listitem>
+				<para>A construct enclosed within curly braces may be repeated zero or more
+					times.</para>
+			</listitem>
+			<listitem>
+				<para>JSON allows arbitrary white space between tokens. To avoid ugly clutter, this
+					grammar ignores the optional white space. </para>
+			</listitem>
+			<listitem>
+				<para>In many cases a production defines a JSON object, i.e. a list of name-value
+					pairs, separated by commas. Since the order of these name/value pairs is not
+					significant, the grammar will not try to show all the possible sequences. In
+					general it will present the required pairs first, if any, followed by any
+					optional elements.</para>
+			</listitem>
+		</orderedlist>
+
+		<para> Since both EBNF and JSON use curly braces and square brackets, pay close attention to
+			whether these characters are in single quotes. If they're in single quotes, they are
+			literal elements of the JSON notation. Otherwise they are elements of the EBNF notation.
+		</para>
+	</sect2>
+
+	<sect2>
+		<title>Primitives</title>
+		<para> We'll start by defining some primitives, to get them out of the way. They're mostly
+			just what you would expect. </para>
+
+		<productionset>
+			<production xml:id="ebnf.string">
+				<lhs> string </lhs>
+				<rhs> '"' chars '"' </rhs>
+			</production>
+
+			<production xml:id="ebnf.chars">
+				<lhs> chars </lhs>
+				<rhs> any valid sequence of UTF-8 characters, with certain special characters
+					escaped according to JSON rules </rhs>
+			</production>
+
+			<production xml:id="ebnf.integer_literal">
+				<lhs> integer_literal </lhs>
+				<rhs> [ sign ] digit { digit } </rhs>
+			</production>
+
+			<production xml:id="ebnf.sign">
+				<lhs> sign </lhs>
+				<rhs> '+' | '-' </rhs>
+			</production>
+
+			<production xml:id="ebnf.digit">
+				<lhs> digit </lhs>
+				<rhs>digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'</rhs>
+			</production>
+
+			<production xml:id="ebnf.integer_string">
+				<lhs> integer_string </lhs>
+				<rhs> '"' integer_literal '"' </rhs>
+			</production>
+
+			<production xml:id="ebnf.integer">
+				<lhs> integer </lhs>
+				<rhs> integer_literal | integer_string </rhs>
+			</production>
+
+			<production xml:id="ebnf.number">
+				<lhs> number </lhs>
+				<rhs> any valid character sequence that is numeric according to JSON rules </rhs>
+			</production>
+
+		</productionset>
+
+		<para> When json_query requires an integral value, it will usually accept a quoted string
+			and convert it to an integer by brute force – to zero if necessary. Likewise it may
+			truncate a floating point number to an integral value. Scientific notation will be
+			accepted but may not give the intended results. </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.boolean">
+				<lhs> boolean </lhs>
+				<rhs> 'true' | 'false' | string | number </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The preferred way to encode a boolean is with the JSON reserved word true or false,
+			in lower case without quotation marks. The string <literal>true</literal>, in upper,
+			lower, or mixed case, is another way to encode true. Any other string evaluates to
+			false. </para>
+		<para> As an accommodation to perl, numbers may be used as booleans. A numeric value of 1
+			means true, and any other numeric value means false. </para>
+		<para> Any other valid JSON value, such as an array, will be accepted as a boolean but
+			interpreted as false. </para>
+		<para> The last couple of primitives aren't really very primitive, but we introduce them
+			here for convenience: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.class_name">
+				<lhs> class_name </lhs>
+				<rhs> string </rhs>
+			</production>
+
+		</productionset>
+
+		<para> A class_name is a special case of a string: the name of a class as defined by the
+			IDL. The class may refer either to a database table or to a source_definition, which is
+			a subquery. </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.field_name">
+				<lhs> field_name </lhs>
+				<rhs> string </rhs>
+			</production>
+
+		</productionset>
+
+		<para> A field_name is another special case of a string: the name of a non-virtual field as
+			defined by the IDL. A field_name is also a column name for the table corresponding to
+			the relevant class. </para>
+
+	</sect2>
+
+	<sect2>
+		<title>Query</title>
+
+		<para> The following production applies not only to the main query but also to most
+			subqueries. </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.query">
+				<lhs> query </lhs>
+				<rhs> '{'<sbr/> '"from"' ':' from_list<sbr/> [ ',' '"select"' ':' select_list
+					]<sbr/> [ ',' '"where"' ':' where_condition ]<sbr/> [ ',' '"having"' ':'
+					where_condition ]<sbr/> [ ',' '"order_by"' ':' order_by_list ]<sbr/> [ ','
+					'"limit"' ':' integer ]<sbr/> [ ',' '"offset"' ':' integer ]<sbr/> [ ','
+					'"distinct"' ':' boolean ]<sbr/> [ ',' '"no_i18n"' ':' boolean ]<sbr/> '}'
+				</rhs>
+			</production>
+
+		</productionset>
+
+		<para> Except for the <literal>"distinct"</literal> and <literal>no_i18n</literal> entries,
+			each name/value pair represents a major clause of the SELECT statement. The name/value
+			pairs may appear in any order. </para>
+		<para> There is no name/value pair for the GROUP BY clause, because json_query generates it
+			automatically according to information encoded elsewhere. </para>
+		<para> The <literal>"distinct"</literal> entry, if present and true, tells json_query that
+			it may have to create a GROUP BY clause. If not present, it defaults to false. </para>
+		<para> The <literal>"no_i18n"</literal> entry, if present and true, tells json_query to
+			suppress internationalization. If not present, it defaults to false. (Note that
+				<literal>"no_i18n"</literal> contains the digit one, not the letter ell.) </para>
+		<para> The values for <literal>limit</literal> and <literal>offset</literal> provide the
+			arguments of the LIMIT and OFFSET clauses, respectively, of the SQL statement. Each
+			value should be non-negative, if present, or else the SQL won't work. </para>
+
+	</sect2>
+
+	<sect2><title>FROM Clause</title>
+		<para>
+			The object identified by <literal>“from”</literal> encodes the FROM clause of
+			the SQL.  The associated value may be a string, an array, or a JSON object.
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.from_list_0">
+				<lhs> from_list </lhs>
+				<rhs> class_name </rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			If <literal>from_list</literal> is a <literal>class_name</literal>, the
+			json_query inserts the corresponding table name or subquery into the FROM
+			clause, using the <literal>class_name</literal> as an alias for the table
+			or subquery.  The class must be defined as non-virtual in the IDL.
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.from_list_1">
+				<lhs> from_list </lhs>
+				<rhs> '['  string  {  ','  parameter  }  ']' </rhs>
+			</production>
+
+			<production xml:id="ebnf.parameter">
+				<lhs> parameter </lhs>
+				<rhs> string  |  number  |  'null' </rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			If from_list is a JSON array, then it represents a table-like function from
+			which the SQL statement will select rows, using a SELECT clause consisting
+			of “SELECT *” (regardless of the select_list supplied by the method parameter).
+		</para>
+		<para>
+			The first entry in the array is the name of the function.  It must be a string
+			naming a stored function.  Each subsequent entry is a function parameter.  If
+			it is a string or a number, json_query will insert it into a comma-separated
+			parameter list, enclosed in quotes, with any special characters escaped as needed.
+			If it is the JSON reserved word <literal>null</literal>, json_query will insert
+			it into the parameter list as a null value.
+		</para>
+		<para>
+			If <literal>from_list</literal> is a JSON object, it must contain exactly one entry.
+			The key of this entry must be the name of a non-virtual class defined in the IDL.
+			This class will be the top-level class of the FROM clause, the only one named
+			outside of a JOIN clause.
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.from_list_2">
+				<lhs> from_list </lhs>
+				<rhs> '{' class_name ':' join_list '}' </rhs>
+			</production>
+
+			<production xml:id="ebnf.join_list_0">
+				<lhs> join_list </lhs>
+				<rhs> class_name </rhs>
+			</production>
+
+			<production xml:id="ebnf.join_list_1">
+				<lhs> join_list </lhs>
+				<rhs> '{' join_def { ',' join_def } '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			If the associated data is a <literal>class_name</literal>, json_query will
+			construct an INNER JOIN clause joining the class to the top-level clause,
+			using the columns specified by the IDL for such a join.
+		</para>
+		<para>
+			Otherwise, the associated data must be a JSON object with one or more entries,
+			each entry defining a join:
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.join_def">
+				<lhs> join_def </lhs>
+				<rhs>
+					class_name  ':'<sbr/>
+					'{'<sbr/>
+					[  '”type”'      ':'  string      ]<sbr/>
+					[  '”field”'     ':'  field_name  ]<sbr/>
+					[  '”fkey”'      ':'  field_name  ]<sbr/>
+					[  '”filter”'    ':'  where_condition  ]<sbr/>
+					[  '”filter_op”' ':'  string      ]<sbr/>
+					[  '”join”'      ':'  join_list   ]<sbr/>
+					'}'
+				</rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			The data portion of the <literal>“join_type”</literal> entry tells json_query
+			whether to use a left join, right join, full join, or inner join.  The values
+			<literal>“left”</literal>, <literal>“right”</literal>, and <literal>“full”</literal>,
+			in upper, lower, or mixed case, have the obvious meanings.  If the
+			<literal>“join_type”</literal> entry has any other value, or is not present,
+			json_query constructs an inner join.
+		</para>
+		<para>
+			The <literal>“field”</literal> and <literal>“fkey”</literal> attributes specify the
+			columns to be equated in the join condition.  The <literal>“field”</literal>
+			attribute refers to the column in the joined table, i.e. the one named by the
+			<literal>join_def</literal>.  The <literal>“fkey”</literal> attribute refers to the
+			corresponding column in the other table, i.e. the one named outside the
+			<literal>join_def</literal> – either the top-level table or a table named by some
+			other <literal>join_def</literal>.
+		</para>
+		<para>
+			It may be tempting to suppose that <literal>“fkey”</literal> stands for “foreign key”,
+			and therefore refers to a column in the child table that points to the key of a
+			parent table.  Resist the temptation; the labels are arbitrary.  The json_query
+			method doesn't care which table is the parent and which is the child.
+		</para>
+		<para>
+			These relationships are best explained with an example.  The following
+			<literal>from_list</literal>:
+		</para>
+
+		<informalexample>
+			<programlisting language="JSON">
+	{
+	    "aou": {
+	        "asv": {
+	            "type" : "left",
+	            "fkey" : "id",
+	            "field" : "owner"
+	        }
+	    }
+	}
+			</programlisting>
+		</informalexample>
+
+		<para>
+			...turns into the following FROM clause:
+		</para>
+
+		<informalexample>
+			<programlisting language="SQL">
+	FROM
+	    actor.org_unit AS "aou"
+	        LEFT JOIN action.survey AS "asv"
+	            ON ( "asv".owner = "aou".id )
+			</programlisting>
+		</informalexample>
+
+		<para>
+			Note in this example that <literal>“fkey”</literal> refers to a column of the
+			class <literal>“aou”</literal>, and <literal>“field”</literal> refers to a
+			column of the class <literal>“asv”</literal>.
+		</para>
+		<para>
+			If you specify only one of the two columns, json_query will try to identify the
+			other one from the IDL. However, if you specify only the column from the parent
+			table, this attempt will probably fail.
+		</para>
+		<para>
+			If you specify both columns, json_query will use the column names you specify,
+			without verifying them with a lookup in the IDL.  By this means you can perform
+			a join using a linkage that the IDL doesn't define.  Of course, if the columns
+			don't exist in the database, the query will fail when json_query tries to execute it.
+		</para>
+		<para>
+			Using the columns specified, either explicitly or implicitly, the json_query
+			method constructs a join condition.  With raw SQL it is possible (though
+			rarely useful) to join two tables by an inequality.  However the json_query
+			method always uses a simple equality condition.
+		</para>
+		<para>
+			Using a <literal>“filter”</literal> entry in the join_def, you can apply one
+			or more additional conditions to the JOIN clause, typically to restrict the
+			join to certain rows of the joined table.  The data associated with the
+			<literal>“filter”</literal> key is the same sort of
+			<literal>where_condition</literal> that you use for a WHERE clause
+			(discussed below).
+		</para>
+		<para>
+			If the string associated with the <literal>“filter_op”</literal> entry is
+			<literal>“OR”</literal> in upper, lower, or mixed case, then the json_query
+			method uses OR to connect the standard join condition to any additional
+			conditions supplied by a <literal>“filter”</literal> entry.
+		</para>
+		<para>
+			(Note that if the <literal>where_condition</literal> supplies multiple
+			conditions, they will be connected by AND.  You will probably want to move
+			them down a layer – enclose them in parentheses, in effect – to avoid a
+			confusing mixture of ANDs and ORs.)
+		</para>
+		<para>
+			If the <literal>“filter_op”</literal> entry carries any other value, or if
+			it is absent, then the json_query method uses AND.  In the absence of a
+			<literal>“filter”</literal> entry, <literal>“filter_op”</literal> has no effect.
+		</para>
+		<para>
+			A <literal>“join”</literal> entry in a <literal>join_def</literal> specifies
+			another layer of join.  The class named in the subjoin is joined to the class
+			named by the <literal>join_def</literal> to which it is subordinate.  By this
+			means you can encode multiple joins in a hierarchy.
+		</para>
+	</sect2>
+
+	<sect2><title>SELECT Clause</title>
+		<para>
+			If a query does not contain an entry for <literal>“select”</literal>, json_query
+			will construct a default SELECT clause.  The default includes every non-virtual
+			field from the top-level class of the FROM clause, as defined by the IDL.  The
+			result is similar to SELECT *, except:
+		</para>
+
+		<itemizedlist>
+			<listitem>
+				<para>The default includes only the fields defined in the IDL.</para>
+			</listitem>
+			<listitem>
+				<para>The columns will appear in the same order in which they appear in the IDL,
+				regardless of the order in which the database defines them.</para>
+			</listitem>
+		</itemizedlist>
+		
+		<para>
+			There are other ways to specify a default SELECT list, as shown below.
+		</para>
+		<para>
+			If a <literal>"select"</literal> entry is present, the associated value must
+			be a JSON object, keyed on class names:
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.select_list">
+				<lhs> select_list </lhs>
+				<rhs> '{' class_name ':' field_list { ',' class_name ':' field_list } '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			The <literal>class_name</literal> must identify either the top-level class or
+			a class belonging to one of the joins.  Otherwise json_query will silently
+			ignore the <literal>select_list</literal>.
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.field_list_0">
+				<lhs> field_list </lhs>
+				<rhs> 'null'  |  '”*”' </rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			If a field_list is either the JSON reserved word <literal>null</literal>
+			(in lower case) or an asterisk in double quotes, json_query constructs a
+			default SELECT list – provided that the class is the top-level class of the
+			query.  If the class belongs to a join somewhere, json_query ignores the
+			<literal>field_list</literal>.
+		</para>
+		<para>
+			More commonly, the <literal>field_list</literal> is a JSON array of zero or
+			more field specifications:
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.field_list_1">
+				<lhs> field_list </lhs>
+				<rhs> '['  [  field_spec  {  ','  field_spec  }  ]  ']' </rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			If the array is empty, json_query will construct a default SELECT list for
+			the class – again, provided that the class is the top-level class in the query.
+		</para>
+		<para>
+			In the simplest case, a field specification may name a non-virtual field
+			defined in the IDL:
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.field_spec_0">
+				<lhs> field_spec </lhs>
+				<rhs> field_name </rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			In some cases json_query constructs a call to the
+			<literal>oils_i18n_xlate</literal> function to internationalize the value of the
+			selected column.  Specifically, it does so if all the following are true:
+		</para>
+
+		<itemizedlist>
+			<listitem>
+				<para>the settings file defines a locale;</para>
+			</listitem>
+			<listitem>
+				<para>in the field definition for the field in the IDL, the tag
+				<literal>“il8n”</literal> is present and true;</para>
+			</listitem>
+			<listitem>
+				<para>the query does <emphasis>not</emphasis> include the
+				<literal>"no_il8n"</literal> tag (or includes it with a value of false).</para>
+			</listitem>
+		</itemizedlist>
+		
+		<para>
+			A field specification may be a JSON object:
+		</para>
+		
+		<productionset>
+			
+			<production xml:id="ebnf.field_spec_1">
+				<lhs> field_spec </lhs>
+				<rhs>
+					'{'<sbr/>
+					'”column”'  ':'  <sbr/>
+					[ ',' '”alias”'  ':'  string  ]<sbr/>
+					[ ',' '”aggregate”'  ':'  boolean  ]<sbr/>
+					[ ',' transform_spec  ]<sbr/>
+					'}'
+				</rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			The <literal>“column”</literal> entry provides the column name, which must
+			be defined as non-virtual in the IDL.
+		</para>
+		<para>
+			The <literal>“alias”</literal> entry provides a column alias.  If no alias
+			is specified, json_query uses the column name as its own alias.
+		</para>
+		<para>
+			The <literal>“aggregate”</literal> entry has no effect on the SELECT clause
+			itself.  Rather, it affects the construction of a GROUP BY class.  If there
+			is an <literal>“aggregate”</literal> entry for any field, then json_query builds
+			a GROUP BY clause listing every column that is <emphasis>not</emphasis> tagged
+			for aggregation (or that carries an <literal>“aggregate”</literal> entry with
+			a value of false).  If <emphasis>all</emphasis> columns are tagged for
+			aggregation, then json_query omits the GROUP BY clause.
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.transform_spec_0">
+				<lhs> transform_spec </lhs>
+				<rhs>
+					'”transform”'  ':'  string  ]<sbr/>
+					[ ',' '”result_field”  ':'  string  ]<sbr/>
+					[ ',' '”params”  ':' param_list  ]
+				</rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			When a <literal>transform_spec</literal> is present, json_query selects the
+			return value of a function instead of selecting the column directly.  The entry
+			for <literal>“transform”</literal> provides the name of the function, and the
+			column name (as specified by the <literal>“column”</literal> tag), qualified by
+			the class name, is the argument to the function.  For example, you might use such
+			a function to format a date or time, or otherwise transform a column value.
+			You might also use an aggregate function such as SUM, COUNT, or MAX (possibly
+			together with the <literal>“aggregate”</literal> tag).
+		</para>
+		<para>
+			The <literal>“result_field”</literal> entry, when present, specifies a subcolumn
+			of the function's return value.  The resulting SQL encloses the function call
+			in parentheses, and follows it with a period and the subcolumn name.
+		</para>
+		<para>
+			The <literal>“params”</literal> entry, if present, provides a possibly empty
+			array of additional parameter values, either strings, numbers, or nulls:
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.param_list">
+				<lhs> param_list </lhs>
+				<rhs> '['  [  parameter  {  ','  parameter  }  ]  ']' </rhs>
+			</production>
+
+		</productionset>
+
+		<para>
+			Such parameter values are enclosed in single quotes, with any special characters
+			escaped as needed, and inserted after the column name as additional parameters
+			to the function.  You might, for example, use an additional parameter to provide
+			a format string for a reformatting function.
+		</para>
+	</sect2>
+
+	<sect2><title>WHERE Clause</title>
+		<para> There are two types of <literal>where_condition</literal>: objects and arrays.
+			Of these, the object type is the more fundamental, and occurs at some level in every
+			<literal>where_condition</literal>.  The array type is mainly a way of circumventing
+			a limitation of the object type. </para>
+ 		<para> The object type of <literal>where_condition</literal> is a comma-separated list
+			of one or more <literal>conditions</literal>: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.where_condition_0">
+				<lhs> where_condition </lhs>
+				<rhs> '{' condition { ',' condition } '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The generated SQL will include a code fragment for each <literal>condition</literal>,
+			joined by AND (or in some cases by OR, as described below). </para>
+		<para> As usual for entries in a JSON object, each <literal>condition</literal> consists
+			of a unique string to serve as a key, a colon, and an associated value. </para>
+		<para> The key string may be the name of a column belonging to the relevant table, or
+			it may be an operator string.  In order to distinguish it from any possible column
+			name, an operator string always begins with a plus sign or minus sign. </para>
+		<para> JSON requires that every key string be unique within an object.  This requirement
+			imposes some awkward limitations on a JSON query.  For example, you might want to
+			express two conditions for the same column: id &gt; 10 and id != 25.  Since each of
+			those conditions would have the same key string, namely “id”, you can't put them
+			into the same JSON object. </para>
+		<para> The solution is to put such conflicting conditions in separate JSON objects, and
+			put the objects into an array: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.where_condition_1">
+				<lhs> where_condition </lhs>
+				<rhs> '[' where_condition { ',' where_condition } ']' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The resulting SQL encloses each subordinate set of <literal>conditions</literal>
+			in parentheses, and connects the sets with AND (or in some cases OR, as described
+			below).  It's possible to put only a single <literal>where_condition</literal> in
+			the array; the result is to add a layer of parentheses around the condition. </para>
+		<para> There are two kinds of <literal>condition</literal> where the operator begins
+			with a plus sign.  In the simpler case, the associated data is simply a column name:
+		</para>
+
+		<productionset>
+
+			<production xml:id="ebnf.condition_0">
+				<lhs> condition </lhs>
+				<rhs> plus_class ':' field_name </rhs>
+			</production>
+
+		</productionset>
+
+		<para> A <literal>plus_class</literal> is a string that begins with a plus sign.
+			The rest of the string, after the plus sign, must be the class name for the table
+			to which the column belongs. </para>
+		<para> If the column is a boolean, then the resulting SQL uses it (as qualified by the
+			class name) as a stand-alone condition. </para>
+		<para> Otherwise, this kind of syntax provides a way to place a column on the right side
+			of a comparison operator.  For example: </para>
+
+		<informalexample>
+			<programlisting language="JSON">
+	{
+	    "from":"aou",
+	    "select": { "aou":[ "id", "name" ] },
+	    "where": {
+	        "id": {
+	            "&gt;": { "+aou":"parent_ou" }
+	        }
+	    }
+	}
+			</programlisting>
+		</informalexample>
+
+		<para> The resulting SQL: </para>
+
+		<informalexample>
+			<programlisting language="SQL">
+	SELECT
+	    "aou".id AS "id",
+	    "aou".name AS "name"
+	FROM
+	    actor.org_unit AS "aou"
+	WHERE
+	    (
+	        "aou".id &gt; (  "aou".parent_ou  )
+	    );
+			</programlisting>
+		</informalexample>
+
+		<para> The other type of <literal>condition</literal> that uses a
+			<literal>plus_class</literal> applies a specified class name to a
+			<literal>where_condition</literal>: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.condition_1">
+				<lhs> condition </lhs>
+				<rhs> plus_class ':' where_condition </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The resulting SQL is enclosed in parentheses, and qualifies the columns with
+			the specified class name.  This syntax provides a mechanism to shift the class
+			context – i.e. to refer to one class in a context that would otherwise refer to
+			a different class. </para>
+		<para> Ordinarily the class name must be a valid non-virtual class defined in the IDL,
+			and applicable to the associated <literal>where_condition</literal>.  There is at
+			least one peculiar exception.  The JSON fragment: </para>
+
+		<informalexample>
+			<programlisting language="JSON">
+	"+abc": { "+xyz":"frobozz" }
+			</programlisting>
+		</informalexample>
+
+		<para> ...is rendered as: </para>
+
+		<informalexample>
+			<programlisting language="SQL">
+	(  "xyz".frobozz  )
+			</programlisting>
+		</informalexample>
+
+		<para> ...even though neither <literal>“abc”</literal>, nor <literal>“xyz”</literal>,
+			nor <literal>“frobozz”</literal> is defined in the IDL.  The class name
+			<literal>“abc”</literal> isn't used at all because the <literal>“+xyz”</literal>
+			operator overrides it.  Such a query won't fail until json_query tries
+			to execute it in the database. </para>
+		<para> The other operators that may occur at this level all begin with a minus sign,
+			and they all represent familiar SQL operators.  For example, the
+			<literal>“-or”</literal> operator joins the conditions within a
+			<literal>where_condition</literal> by OR (instead of the default AND), and
+			encloses them all in parentheses: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.condition_2">
+				<lhs> condition </lhs>
+				<rhs> '”-or”' ':' where_condition </rhs>
+			</production>
+
+		</productionset>
+
+		<para> In fact the <literal>“-or”</literal> operator is the only way to get OR into
+			the WHERE clause. </para>
+		<para> The <literal>“-and”</literal> operator is similar, except that it uses AND: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.condition_3">
+				<lhs> condition </lhs>
+				<rhs> '”-and”' ':'  where_condition </rhs>
+			</production>
+
+		</productionset>
+
+		<para> Arguably the <literal>“-and”</literal> operator is redundant, because you can
+			get the same effect by wrapping the subordinate <literal>where_condition</literal>
+			in a JSON array.  Either technique merely adds a layer of parentheses, since AND
+			connects successive conditions by default. </para>
+		<para> The <literal>“-not”</literal> operator expands the subordinate
+			<literal>where_condition</literal> within parentheses, and prefaces the result
+			with NOT: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.condition_4">
+				<lhs> condition </lhs>
+				<rhs> '”-not”' ':'  where_condition </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The <literal>“-exists”</literal> or <literal>“-not-exists”</literal> operator
+			constructs a subquery within an EXISTS  or NOT EXISTS clause, respectively: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.condition_5">
+				<lhs> condition </lhs>
+				<rhs> '”-exists”' ':' query </rhs>
+			</production>
+
+			<production xml:id="ebnf.condition_6">
+				<lhs> condition </lhs>
+				<rhs> '”-not-exists”' ':' query </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The remaining kinds of <literal>condition</literal> all have a
+			<literal>field_name</literal> on the left and some kind of <literal>predicate</literal>
+			on the right.  A <literal>predicate</literal> places a constraint on the value of
+			the column – or, in some cases, on the value of the column as transformed by some
+			function call: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.condition_7">
+				<lhs> condition </lhs>
+				<rhs> field_name ':' predicate </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The simplest such constraint is to require that the column have a specific value,
+			or be null: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_0">
+				<lhs> predicate </lhs>
+				<rhs> lit_value | 'null' </rhs>
+			</production>
+
+			<production xml:id="ebnf.lit_value">
+				<lhs> lit_value </lhs>
+				<rhs> string | number </rhs>
+			</production>
+
+		</productionset>
+
+		<para> You can also compare a column to a literal value using some kind of inequality.
+			However it's a bit more complicated because you have to specify what kind of comparison
+			to make: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.compare_op_0">
+				<lhs> predicate </lhs>
+				<rhs> '{' compare_op ':' lit_value '}' </rhs>
+			</production>
+
+			<production xml:id="ebnf.compare_op_1">
+				<lhs> compare_op </lhs>
+				<rhs> string </rhs>
+			</production>
+
+		</productionset>
+
+		<para> A <literal>compare_op</literal> is a string that defines a comparison operator.
+			Valid values include the following: </para>
+
+		<programlisting language="SQL">
+	=    &lt;&gt;   !=
+	&lt;    &gt;    &lt;=   &gt;=
+	~    ~*   !~   !~*
+	like      ilike
+	similar to
+		</programlisting>
+
+		<para> Strictly speaking, json_query accepts any <literal>compare_op</literal>
+			that doesn't contain semicolons or white space (or
+			<literal>“similar to”</literal> as a special exception).  As a result, it
+			is possible – and potentially useful – to use a custom operator like
+			<literal>“&gt;100*”</literal> in order to insert an expression that would
+			otherwise be difficult or impossible to create through a JSON query.  The ban
+			on semicolons and white space prevents certain kinds of SQL injection. </para>
+		<para> Note that json_query does <emphasis>not</emphasis> accept two operators that
+			PostgreSQL <emphasis>does</emphasis> accept: <literal>“is distinct from”</literal>
+			and <literal>“is not distinct from”</literal>. </para>
+		<para> You can also compare a column to a null value: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_1">
+				<lhs> predicate </lhs>
+				<rhs> '{' compare_op ':' 'null' '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The equality operator <literal>“=”</literal> turns into IS NULL.  Any other
+			operator turns into IS NOT NULL. </para>
+		<para> When a <literal>compare_op</literal> is paired with an array, it defines a
+			function call: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_2">
+				<lhs> predicate </lhs>
+				<rhs> '{' compare_op ':' '[' string { ',' parameter } ']' '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The first entry in the array is the function's name.  Subsequent entries in
+			the array, if any, represent the parameters of the function call.  They may be
+			strings, numbers, or nulls.  In the generated SQL, the function call appears on
+			the right of the comparison. </para>
+		<para> The <literal>“between”</literal> operator creates a BETWEEN clause: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_3">
+				<lhs> predicate </lhs>
+				<rhs> '{'  “between”  ':'  '['  lit_value  ','  lit_value  ']'  '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> Although PostgreSQL allows a null value in a BETWEEN clause, json_query
+			requires literal non-null values.  It isn't sensible to use null values in a
+			BETWEEN clause.  A few experiments show that the results of the comparison are
+			peculiar and erratic. </para>
+		<para> There are two ways to create an IN list of allowed values.  The simplest is
+			to put literal values into a JSON array: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_4">
+				<lhs> predicate </lhs>
+				<rhs> '[' lit_value { ',' lit_value }  ']' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> As with BETWEEN clauses, json_query does not accept null values in an IN list,
+			even though PostgreSQL does allow them.  Nulls are not sensible in this context
+			because they never match anything. </para>
+		<para>  </para>
+	</sect2>
+
+	<sect2><title>Having Clause</title>
+		<para>For the HAVING clause, json_query accepts exactly the same syntax as it accepts for
+			the WHERE clause.</para>
+		<para> The other way to create an IN list is to use an explicit
+			<literal>“in”</literal> operator with an array of literal values.  This format
+			also works for the <literal>“not in”</literal> operator: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_5">
+				<lhs> predicate </lhs>
+				<rhs> '{' in_operator ';'  '[' lit_value [ ',' lit_value ]  ']'  '}' </rhs>
+			</production>
+
+			<production xml:id="ebnf.in_operator">
+				<lhs> in_operator </lhs>
+				<rhs> “in”  |  “not in” </rhs>
+			</production>
+
+		</productionset>
+
+		<para> Another kind of IN or NOT IN clause uses a subquery instead of a list of
+			values: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_6">
+				<lhs> predicate </lhs>
+				<rhs> '{' in_operator ':'  query  '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The remaining types of <literal>predicate</literal> can put a function call on
+			the left of the comparison, by using a <literal>transform_spec</literal> together
+			with a <literal>“value”</literal> tag.   The <literal>transform_spec</literal> is
+			optional, and if you don't need it, the same SQL would in many cases be easier to
+			express by other means. </para>
+		<para> The <literal>transform_spec</literal> construct was described earlier in
+			connection with the SELECT clause, but here it is again: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.transform_spec_1">
+				<lhs> transform_spec </lhs>
+				<rhs>
+					'”transform”'  ':'  string  ]<sbr/>
+					[ ',' '”result_field”  ':'  string  ]<sbr/>
+					[ ',' '”params”  ':' param_list  ]
+				</rhs>
+			</production>
+
+		</productionset>
+
+		<para> As in the SELECT clause, the <literal>“transform”</literal> string names the
+			function.  The first parameter is always the column identified by the field_name.
+			Additional parameters, if any, appear in the <literal>param_list</literal>.  The
+			<literal>“result_field”</literal> string, if present, identifies one column of a
+			multicolumn return value. </para>
+		<para> Here's a second way to compare a value to a literal value (but not to a null
+			value): </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_7">
+				<lhs> predicate </lhs>
+				<rhs> '{' compare_op ':' '{' '”value”' ':' lit_value<sbr/>
+					[ transform_spec ] '}' '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> ...and a way to compare a value to a boolean expression: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_8">
+				<lhs> predicate </lhs>
+				<rhs> '{' compare_op ':' '{' '”value”' ':' '{'<sbr/>
+					condition { ',' condition } [ transform_spec ] '}' '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The final predicate is another way to put a function call on the right side
+			of the comparison: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.predicate_9">
+				<lhs> predicate </lhs>
+				<rhs> '{' compare_op ':' '{' '”value”' ':' '['<sbr/>
+					string { ',' parameter } ']' [ transform_spec ] '}' '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> This format is available for the sake of consistency, but offers no advantage
+			over the simpler version. </para>
+	</sect2>
+
+	<sect2><title>ORDER BY Clause</title>
+		<para> There are two ways to encode an ORDER BY clause: as an array, or as a list.
+			Either may be empty, in which case the generated SQL will not include an ORDER BY
+			clause: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.order_by_list_0">
+				<lhs> order_by_list </lhs>
+				<rhs> '['  ']'  |  '{'  '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> If not empty, the array contains one or more objects, each defining a sort
+			field: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.order_by_list_1">
+				<lhs> order_by_list </lhs>
+				<rhs> '{' sort_field_def  {  ','  sort_field_def }  '}' </rhs>
+			</production>
+
+			<production xml:id="ebnf.sort_field_def">
+				<lhs> sort_field_def </lhs>
+				<rhs> '{'<sbr/>
+					'”class”'  ':'  class_name<sbr/>
+					','  '”field”'  ':'  field_name<sbr/>
+					[  ','  '”direction”'  ':'  lit_value  ]<sbr/>
+					[  ','  transform_spec  ]<sbr/>
+					'}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> The <literal>“class”</literal> and <literal>“field”</literal> entries are
+			required, and of course the field must belong to the class.  Furthermore, at
+			least one field from the class must appear in the SELECT clause. </para>
+		<para> The <literal>“direction”</literal> entry, if present, specifies whether the
+			sort will be ascending or descending for the column in question.  If the associated
+			value begins with “D” or “d”, the sort will be descending; otherwise the sort will
+			be ascending.  If the value is a number, it will be treated as a string that does not
+			start with “D” or “d”, resulting in an ascending sort. </para>
+		<para> In the absence of a <literal>“direction”</literal> entry, the sort will be
+			ascending. </para>
+		<para> The <literal>transform_spec</literal> works here the same way it works in the
+			SELECT clause and the WHERE clause, enabling you to pass the column through a
+			transforming function before the sort: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.transform_spec_2">
+				<lhs> transform_spec </lhs>
+				<rhs>
+					'”transform”'  ':'  string  ]<sbr/>
+					[ ',' '”result_field”  ':'  string  ]<sbr/>
+					[ ',' '”params”  ':' param_list  ]
+				</rhs>
+			</production>
+
+		</productionset>
+
+		<para> When the <literal>order_by_list</literal> is an object instead of an array,
+			the syntax is less verbose, but also less flexible.  The keys for the object are
+			class names: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.order_by_list_2">
+				<lhs> order_by_list </lhs>
+				<rhs> '{' class_name ':' sort_class_def<sbr/>
+					{ ',' class_name ':' sort_class_def } '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> Each class must be referenced in the SELECT clause. </para>
+		<para> As in the SELECT clause, all the fields for a given class must be grouped
+			together.  You can't sort by a column from one table, then a column from a second
+			table, then another column from the first table. If you need this kind of sort,
+			you must encode the ORDER BY clause as an array instead of an object. </para>
+		<para> The data associated with a <literal>class_name</literal> may be either an array
+			or an object.  If an array, it's simply a list of field names, and each field must
+			belong to the class: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.sort_class_def_0">
+				<lhs> sort_class_def </lhs>
+				<rhs> '['  field_name  { ','  field_name }  ']' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> With this syntax, the direction of sorting will always be ascending. </para>
+		<para> If the data is an object, the keys are field names, and as usual the fields
+			must belong to the class: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.sort_class_def_1">
+				<lhs> sort_class_def </lhs>
+				<rhs> '{'  field_name  ':' sort_class_subdef<sbr/>
+					{ ','  field_name  ':' sort_class_subdef  }  '}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> Since the <literal>field_name</literal> is the key for the object, it cannot
+			appear more than once.  As a result, some kinds of sorts are not possible with this
+			syntax.  For example, one might want to sort by UPPER( family_name ), and then by
+			family_name with case unchanged, to make sure that “diBona” comes before “Dibona”.
+			For situations like this, you must encode the ORDER BY clause as an array rather
+			than an object. </para>
+		<para> The data associated with each <literal>field_name</literal> may take either of
+			two forms.  In the simplest case, it's a literal value to specify the direction
+			of sorting: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.sort_class_subdef_0">
+				<lhs> sort_class_subdef </lhs>
+				<rhs> lit_value </rhs>
+			</production>
+
+		</productionset>
+
+		<para> If the literal is a string starting with “D” or “d”, json_query sorts the field
+			in descending order.  Otherwise it sorts the field in ascending order. </para>
+		<para> In other cases, the <literal>field_name</literal> may be paired with an object
+			to specify more details: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.sort_class_subdef_1">
+				<lhs> sort_class_subdef </lhs>
+				<rhs> '{'<sbr/>
+					[  '”direction”'  ':'  lit_value ]<sbr/>
+					[  transform_spec  ]<sbr/>
+					'}' </rhs>
+			</production>
+
+		</productionset>
+
+		<para> As before, the value tagged as <literal>“direction”</literal> specifies the
+			direction of the sort, depending on the first character.  If not otherwise
+			specified, the sort direction defaults to ascending. </para>
+		<para> Also as before, the <literal>transform_spec</literal> may specify a function
+			through which to pass the column. </para>
+		<para> Since both the <literal>“direction”</literal> tag and the
+			<literal>transform_spec</literal> are optional, the object may be empty: </para>
+
+		<productionset>
+
+			<production xml:id="ebnf.sort_class_subdef_2">
+				<lhs> sort_class_subdef </lhs>
+				<rhs> '{'  '}' </rhs>
+			</production>
+
+		</productionset>
+	</sect2>
+
+</sect1>

Added: trunk/docs/TechRef/TechRef.xml
===================================================================
--- trunk/docs/TechRef/TechRef.xml	                        (rev 0)
+++ trunk/docs/TechRef/TechRef.xml	2009-05-12 01:16:48 UTC (rev 13134)
@@ -0,0 +1,12 @@
+<?xml version="1.0" encoding="utf-8"?>
+
+<chapter version="5.0" xmlns="http://docbook.org/ns/docbook"
+  xmlns:xi="http://www.w3.org/2003/XInclude"
+  xmlns:xlink="http://www.w3.org/1999/xlink">
+  
+    <title>Evergreen Technical Reference</title>
+
+<xi:include  href="JSONGrammar.xml" 
+    xmlns:xi="http://www.w3.org/2001/XInclude" />
+  
+</chapter>



More information about the open-ils-commits mailing list