[open-ils-commits] [GIT] Evergreen ILS branch master updated. 41054e3873084f49e50bdfce297bd7f8329d88ba

Evergreen Git git at git.evergreen-ils.org
Thu Apr 9 22:02:31 EDT 2015


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "Evergreen ILS".

The branch, master has been updated
       via  41054e3873084f49e50bdfce297bd7f8329d88ba (commit)
       via  a3665b06eb318be90c98501bc7f32b9d18df88f8 (commit)
       via  b9a302955fd56396b4e92e3d95cb85d6a77d03de (commit)
       via  427cb93b5b5d2ce7ae2c9e41bfa54b1c13f3ea0a (commit)
      from  892fcc25dbd1fa3489524c0238456a922f7956a7 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 41054e3873084f49e50bdfce297bd7f8329d88ba
Author: Galen Charlton <gmc at esilibrary.com>
Date:   Thu Apr 9 20:50:24 2015 +0000

    LP#1435494: add release notes entry
    
    Signed-off-by: Galen Charlton <gmc at esilibrary.com>
    Signed-off-by: Ben Shum <bshum at biblio.org>

diff --git a/docs/RELEASE_NOTES_NEXT/Administration/set_resource_limits_for_reporter.txt b/docs/RELEASE_NOTES_NEXT/Administration/set_resource_limits_for_reporter.txt
new file mode 100644
index 0000000..9c04c95
--- /dev/null
+++ b/docs/RELEASE_NOTES_NEXT/Administration/set_resource_limits_for_reporter.txt
@@ -0,0 +1,32 @@
+Set resource limits for Clark Kent
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Several parameters are now available for the reporter daemon process
+(`clark-kent.pl`) to control resource usage.  These can be used to
+reduce the chances that a malformed report can cause indigestion
+on a database or reports server.  The new parameters, which can be
+set in `opensrf.xml` or as command-line switches for `clark-kent.pl`, are:
+
+* `//reporter/setup/statement_timeout` / `--statement-timeout`
+
+Number of minutes to allow a report's underlying SQL query
+to run before it gets cancelled.  Default value is
+60 minutes.  If a report's query gets cancelled, the
+error_text value will be set to a value that indicates that
+the allowed time was exceeded.
+
+* `//reporter/setup/max_rows_for_charts` / `--max-rows-for-charts`
+
+Number of rows permitted in the query's output before
+Clark Kent refuses to attempt to draw a graph. Default
+value is 1,000 rows.
+
+* `//reporter/setup/resultset_limit` / `--resultset-limit`
+
+If set, truncates the report's output to the specified
+number of hits.  Note that it will not be apparent
+to a staff user if the report's output has been
+truncated.  Default value is unlimited.
+
+The report concurrency (i.e., the number of reports that Clark
+Kent will run in parallel) can now also be controlled via
+the `opensrf.xml` setting `//reporter/setup/parallel`.

commit a3665b06eb318be90c98501bc7f32b9d18df88f8
Author: Galen Charlton <gmc at esilibrary.com>
Date:   Thu Apr 9 20:09:59 2015 +0000

    LP#1435494: suggest 1048575 as a default resultset_limit
    
    Per a suggestion by Thomas Berezansky; this magic number
    represents the number of rows supported by XLSX, less one
    for a header row.
    
    Signed-off-by: Galen Charlton <gmc at esilibrary.com>
    Signed-off-by: Ben Shum <bshum at biblio.org>
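
The arithmetic behind the suggested value can be checked with a short sketch. It assumes the XLSX worksheet limit of 2**20 (1,048,576) rows; the function name is hypothetical and does not appear in the patch.

```perl
use strict;
use warnings;

# Hypothetical helper: XLSX allows 2**20 rows per worksheet (assumption stated
# in the lead-in); one row is reserved for the header.
sub suggested_resultset_limit {
    my $xlsx_max_rows = 2**20;
    my $header_rows   = 1;
    return $xlsx_max_rows - $header_rows;
}

print suggested_resultset_limit(), "\n";
```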

diff --git a/Open-ILS/examples/opensrf.xml.example b/Open-ILS/examples/opensrf.xml.example
index 62ab1b8..dd7667a 100644
--- a/Open-ILS/examples/opensrf.xml.example
+++ b/Open-ILS/examples/opensrf.xml.example
@@ -201,7 +201,7 @@ vim:et:ts=4:sw=4:
 
                      A value of 0 means that no limit should be set.
                 -->
-                <resultset_limit>0</resultset_limit>
+                <resultset_limit>1048575</resultset_limit>
             </setup>
         </reporter>
 

commit b9a302955fd56396b4e92e3d95cb85d6a77d03de
Author: Galen Charlton <gmc at esilibrary.com>
Date:   Thu Apr 9 20:06:25 2015 +0000

    LP#1435494: do not encourage <resultset_limit></resultset_limit>
    
    An empty node in opensrf.xml gets parsed as an empty hashref,
    not an empty scalar, so we'll use <resultset_limit>0</resultset_limit>
    in the suggested opensrf.xml config.  This fixes an issue
    noticed by Ben Shum during testing where reports could fail with
    the following message:
    
      DBD::Pg::st execute failed: ERROR: syntax error at or near "0"
      LINE 43: ) limited_to_HASH(0x2a974f8)_hits LIMIT HASH(0x2a974f8)
                                 ^ at /openils/bin/clark-kent.pl line 243.
    
    Signed-off-by: Galen Charlton <gmc at esilibrary.com>
    Signed-off-by: Ben Shum <bshum at biblio.org>
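
A minimal sketch of the failure and the guard, assuming that an empty hashref is a fair stand-in for what the settings client returns for an empty node. The helper name normalize_limit is hypothetical; the fallback and the digits-only check mirror the two lines this commit adds to clark-kent.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the same fallback and validation steps the patch
# uses for resultset_limit.
sub normalize_limit {
    my ($cli_value, $config_value) = @_;
    my $limit = $cli_value // $config_value // 0;
    # A hashref stringifies as "HASH(0x...)" and so fails the digits-only check.
    $limit = 0 unless $limit =~ /^\d+$/;    # 0 means no limit
    return $limit;
}

# Assumed stand-in for a parsed <resultset_limit></resultset_limit> node.
my $empty_node = {};

# Without the guard, the reference is interpolated straight into the SQL text.
print "unguarded: LIMIT $empty_node\n";
print "guarded:   ", normalize_limit(undef, $empty_node), "\n";
```

With the guard in place the limit collapses to 0, and clark-kent.pl only passes the limit to the SQL builder when it is non-zero, so no LIMIT clause is generated at all.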

diff --git a/Open-ILS/examples/opensrf.xml.example b/Open-ILS/examples/opensrf.xml.example
index fdaf150..62ab1b8 100644
--- a/Open-ILS/examples/opensrf.xml.example
+++ b/Open-ILS/examples/opensrf.xml.example
@@ -198,8 +198,10 @@ vim:et:ts=4:sw=4:
                      has been limited in this fashion.  This setting can be
                      overridden by the -resultset-limit command-line switch of
                      clark-kent.pl.
+
+                     A value of 0 means that no limit should be set.
                 -->
-                <resultset_limit></resultset_limit>
+                <resultset_limit>0</resultset_limit>
             </setup>
         </reporter>
 
diff --git a/Open-ILS/src/reporter/clark-kent.pl b/Open-ILS/src/reporter/clark-kent.pl
index c2a6e9f..0e8a99b 100755
--- a/Open-ILS/src/reporter/clark-kent.pl
+++ b/Open-ILS/src/reporter/clark-kent.pl
@@ -109,7 +109,9 @@ my $max_rows_for_charts = $opt_max_rows_for_charts //
                           1000;
 $max_rows_for_charts = 1000 unless $max_rows_for_charts =~ /^\d+$/;
 my $resultset_limit     = $opt_resultset_limit //
-                          $sc->config_value( reporter => setup => 'resultset_limit' );
+                          $sc->config_value( reporter => setup => 'resultset_limit' ) //
+                          0;
+$resultset_limit = 0 unless $resultset_limit =~ /^\d+$/; # 0 means no limit
 
 my ($dbh,$running,$sth,@reports,$run, $current_time);
 

commit 427cb93b5b5d2ce7ae2c9e41bfa54b1c13f3ea0a
Author: Galen Charlton <gmc at esilibrary.com>
Date:   Fri Mar 20 20:33:39 2015 +0000

    LP#1435494: set limits on Clark Kent's resource usage
    
    Clark Kent can sometimes consume more RAM, swap space, or CPU
    than is reasonable or productive. For example:
    
    - a badly constructed query with multiple Cartesian joins may
      never terminate, potentially tying up a Clark child process,
      pegging a CPU on the database server, and/or causing significant
      scratch disk usage on the database server keeping a snapshot alive.
    - a query that returns a very large number of rows can cause a Clark
      child to bloat, and in extreme cases cause an OOM on the server
      running Clark.
    - a report that asks for a chart of an unreasonably large number of
      rows can peg a CPU on the Clark server as GD::Graph attempts to
      compute sub-pixel graph elements.
    
    In each of these cases, a requested report may never finish.
    
    This patch adds the ability to set some limits on Clark.  These
    limits can be set either in opensrf.xml for the settings service
    to distribute or via command-line switches to clark-kent.pl:
    
    //reporter/setup/statement_timeout / --statement-timeout
    
      Number of minutes to allow a report's underlying SQL query
      to run before it gets cancelled.  Default value is
      60 minutes.  If a report's query gets cancelled, the
      error_text value will be set to a value that indicates that
      the allowed time was exceeded.
    
    //reporter/setup/max_rows_for_charts / --max-rows-for-charts
    
      Number of rows permitted in the query's output before
      Clark Kent refuses to attempt to draw a graph. Default
      value is 1,000 rows.
    
    //reporter/setup/resultset_limit / --resultset-limit
    
      If set, truncates the report's output to the specified
      number of hits.  Note that it will not be apparent
      to a staff user if the report's output has been
      truncated.  Default value is unlimited.
    
    This patch also adds the ability for the concurrency
    to be set via an opensrf.xml setting (//reporter/setup/parallel).
    
    If both a command-line switch and an opensrf.xml setting
    are supplied, the value set in the command line takes
    precedence.
    
    Signed-off-by: Galen Charlton <gmc at esilibrary.com>
    Signed-off-by: Ben Shum <bshum at biblio.org>
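
The precedence rule and the timeout conversion can be sketched as follows. Both helper names are hypothetical; the defined-or chain and the digits-only check follow the pattern the patch uses in clark-kent.pl, and the conversion relies on PostgreSQL treating a unitless statement_timeout as milliseconds.

```perl
use strict;
use warnings;

# Hypothetical helper: the first defined value wins (switch, then opensrf.xml,
# then the built-in default); anything that is not a plain non-negative
# integer falls back to the default.
sub resolve_setting {
    my ($cli, $config, $default) = @_;
    my $value = $cli // $config // $default;
    $value = $default unless $value =~ /^\d+$/;
    return $value;
}

# Hypothetical helper: minutes to milliseconds.
sub timeout_ms {
    my ($minutes) = @_;
    return $minutes * 60 * 1000;
}

# A switch value of 30 overrides a configured value of 60.
my $timeout = resolve_setting(30, 60, 60);
printf "statement_timeout: %d min = %d ms\n", $timeout, timeout_ms($timeout);

# Neither switch nor config value supplied: the default of 60 minutes applies.
my $fallback = resolve_setting(undef, undef, 60);
printf "default: %d min = %d ms\n", $fallback, timeout_ms($fallback);
```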

diff --git a/Open-ILS/examples/opensrf.xml.example b/Open-ILS/examples/opensrf.xml.example
index 9f3209d..fdaf150 100644
--- a/Open-ILS/examples/opensrf.xml.example
+++ b/Open-ILS/examples/opensrf.xml.example
@@ -174,6 +174,32 @@ vim:et:ts=4:sw=4:
                     <success_template>LOCALSTATEDIR/data/report-success</success_template>
                     <fail_template>LOCALSTATEDIR/data/report-fail</fail_template>
                 </files>
+                <!-- Number of reports that can be processed simultaneously.  This
+                     value can be overridden by the -c/-concurrency command-line switch
+                     of clark-kent.pl.
+                -->
+                <parallel>1</parallel>
+                <!-- Maximum number of rows in the query results allowed before
+                     Clark will refuse to draw a pie, bar, or line chart.  This
+                     value can be overridden by the -max-rows-for-charts command-line
+                     switch of clark-kent.pl.
+                -->
+                <max_rows_for_charts>1000</max_rows_for_charts>
+                <!-- Maximum amount of time (in minutes) that an SQL query initiated
+                     by Clark Kent will be allowed to run before it is terminated.
+                     This value can be overridden by the -statement-timeout
+                     command-line switch of clark-kent.pl.
+                -->
+                <statement_timeout>60</statement_timeout>
+                <!-- Maximum number of results permitted.  If set to a numeric value,
+                     Clark will limit the number of rows returned by report queries
+                     to this value.  Note that it will not be apparent to a user
+                     running a report from the staff interface that their report
+                     has been limited in this fashion.  This setting can be
+                     overridden by the -resultset-limit command-line switch of
+                     clark-kent.pl.
+                -->
+                <resultset_limit></resultset_limit>
             </setup>
         </reporter>
 
diff --git a/Open-ILS/src/perlmods/lib/OpenILS/Reporter/SQLBuilder.pm b/Open-ILS/src/perlmods/lib/OpenILS/Reporter/SQLBuilder.pm
index 940972a..41d76ba 100644
--- a/Open-ILS/src/perlmods/lib/OpenILS/Reporter/SQLBuilder.pm
+++ b/Open-ILS/src/perlmods/lib/OpenILS/Reporter/SQLBuilder.pm
@@ -38,6 +38,13 @@ sub relative_time {
     return $self->builder->{_relative_time};
 }
 
+sub resultset_limit {
+    my $self = shift;
+    my $limit = shift;
+    $self->builder->{_resultset_limit} = $limit if (defined $limit);
+    return $self->builder->{_resultset_limit};
+}
+
 sub resolve_param {
     my $self = shift;
     my $val = shift;
@@ -237,6 +244,8 @@ sub toSQL {
 
     if ($self->is_subquery) {
         $sql = '(';
+    } elsif ($self->resultset_limit) {
+        $sql = 'SELECT * FROM (';
     }
 
     $sql .= "SELECT\t" . join(",\n\t", map { $_->toSQL } @{ $self->{_select} }) . "\n" if (@{ $self->{_select} });
@@ -251,6 +260,9 @@ sub toSQL {
 
     if ($self->is_subquery) {
         $sql .= ') '. $self->{_alias} . "\n";
+    } elsif ($self->resultset_limit) {
+        $sql .= ') limited_to_' . $self->resultset_limit .
+                '_hits LIMIT ' . $self->resultset_limit . "\n";
     }
 
     return $self->{_sql} = $sql;
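
As a rough illustration of the wrapping that the toSQL change above performs, the following stand-alone sketch builds the same outer query around an arbitrary inner statement. The function name and the sample table are hypothetical; in the patch this logic lives inside toSQL and is skipped for subqueries.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the wrapping done in toSQL.
sub wrap_with_limit {
    my ($inner_sql, $limit) = @_;
    return $inner_sql unless $limit;    # 0 or undef: leave the query untouched
    return 'SELECT * FROM (' . $inner_sql
         . ') limited_to_' . $limit . '_hits LIMIT ' . $limit . "\n";
}

# some_table is a placeholder, not a real Evergreen table.
print wrap_with_limit("SELECT id FROM some_table\n", 1000);
```

The alias embeds the limit itself, which is why a non-numeric value ends up in the generated SQL twice, once in the alias and once in the LIMIT clause.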
diff --git a/Open-ILS/src/reporter/clark-kent.pl b/Open-ILS/src/reporter/clark-kent.pl
index 8893fc5..c2a6e9f 100755
--- a/Open-ILS/src/reporter/clark-kent.pl
+++ b/Open-ILS/src/reporter/clark-kent.pl
@@ -29,12 +29,20 @@ use Email::Send;
 use open ':utf8';
 
 
-my ($count, $config, $sleep_interval, $lockfile, $daemon) = (1, 'SYSCONFDIR/opensrf_core.xml', 10, '/tmp/reporter-LOCK');
+my ($config, $sleep_interval, $lockfile, $daemon) = ('SYSCONFDIR/opensrf_core.xml', 10, '/tmp/reporter-LOCK');
+
+my $opt_count;
+my $opt_max_rows_for_charts;
+my $opt_statement_timeout;
+my $opt_resultset_limit;
 
 GetOptions(
 	"daemon"	=> \$daemon,
 	"sleep=i"	=> \$sleep_interval,
-	"concurrency=i"	=> \$count,
+	"concurrency=i"	=> \$opt_count,
+	"max-rows-for-charts=i" => \$opt_max_rows_for_charts,
+	"resultset-limit=i" => \$opt_resultset_limit,
+	"statement-timeout=i" => \$opt_statement_timeout,
 	"bootstrap|boostrap=s"	=> \$config,
 	"lockfile=s"	=> \$lockfile,
 );
@@ -88,6 +96,21 @@ my $base_uri         = $sc->config_value( reporter => setup => 'base_uri' );
 my $state_dsn = "dbi:" . $state_db{db_driver} . ":dbname=" . $state_db{db_name} .';host=' . $state_db{db_host} . ';port=' . $state_db{db_port};
 my $data_dsn  = "dbi:" .  $data_db{db_driver} . ":dbname=" .  $data_db{db_name} .';host=' .  $data_db{db_host} . ';port=' .  $data_db{db_port};
 
+my $count               = $opt_count //
+                          $sc->config_value( reporter => setup => 'parallel' ) //
+                          1;
+$count = 1 unless $count =~ /^\d+$/ && $count > 0;
+my $statement_timeout   = $opt_statement_timeout //
+                          $sc->config_value( reporter => setup => 'statement_timeout' ) //
+                          60;
+$statement_timeout = 60 unless $statement_timeout =~ /^\d+$/;
+my $max_rows_for_charts = $opt_max_rows_for_charts //
+                          $sc->config_value( reporter => setup => 'max_rows_for_charts' ) //
+                          1000;
+$max_rows_for_charts = 1000 unless $max_rows_for_charts =~ /^\d+$/;
+my $resultset_limit     = $opt_resultset_limit //
+                          $sc->config_value( reporter => setup => 'resultset_limit' );
+
 my ($dbh,$running,$sth,@reports,$run, $current_time);
 
 if ($daemon) {
@@ -167,6 +190,7 @@ while (my $r = $sth->fetchrow_hashref) {
 	$r->{resultset}->set_pivot_label($report_data->{__pivot_label}) if $report_data->{__pivot_label};
 	$r->{resultset}->set_pivot_default($report_data->{__pivot_default}) if $report_data->{__pivot_default};
 	$r->{resultset}->relative_time($r->{run_time});
+	$r->{resultset}->resultset_limit($resultset_limit) if $resultset_limit;
 	push @reports, $r;
 }
 
@@ -203,6 +227,7 @@ for my $r ( @reports ) {
 		  RaiseError => 1
 		}
 	);
+	$data_dbh->do('SET statement_timeout = ?', {}, ($statement_timeout * 60 * 1000));
 
 	try {
 		$state_dbh->do(<<'		SQL',{}, $r->{id});
@@ -544,28 +569,40 @@ sub build_html {
 
 	# Time for a pie chart
 	if ($r->{chart_pie}) {
-		my $pics = draw_pie($r, $file);
-		for my $pic (@$pics) {
-			print $index "<img src='report-data.html.$pic->{file}' alt='$pic->{name}'/>$br4";
+		if (scalar(@{$r->{data}}) > $max_rows_for_charts) {
+			print $index "<strong>Report output has too many rows to make a pie chart</strong>$br4";
+		} else {
+			my $pics = draw_pie($r, $file);
+			for my $pic (@$pics) {
+				print $index "<img src='report-data.html.$pic->{file}' alt='$pic->{name}'/>$br4";
+			}
 		}
 	}
 
 	print $index $br4;
 	# Time for a bar chart
 	if ($r->{chart_bar}) {
-		my $pics = draw_bars($r, $file);
-		for my $pic (@$pics) {
-			print $index "<img src='report-data.html.$pic->{file}' alt='$pic->{name}'/>$br4";
+		if (scalar(@{$r->{data}}) > $max_rows_for_charts) {
+			print $index "<strong>Report output has too many rows to make a bar chart</strong>$br4";
+		} else {
+			my $pics = draw_bars($r, $file);
+			for my $pic (@$pics) {
+				print $index "<img src='report-data.html.$pic->{file}' alt='$pic->{name}'/>$br4";
+			}
 		}
 	}
 
 	print $index $br4;
 	# Time for a line chart
 	if ($r->{chart_line}) {
-		my $pics = draw_lines($r, $file);
-		for my $pic (@$pics) {
-			print $index "<img src='report-data.html.$pic->{file}' alt='$pic->{name}'/>$br4";
-		}
+		if (scalar(@{$r->{data}}) > $max_rows_for_charts) {
+			print $index "<strong>Report output has too many rows to make a line chart</strong>$br4";
+		} else {
+			my $pics = draw_lines($r, $file);
+			for my $pic (@$pics) {
+				print $index "<img src='report-data.html.$pic->{file}' alt='$pic->{name}'/>$br4";
+			}
+	    }
 	}
 
 	# and that's it!

-----------------------------------------------------------------------

Summary of changes:
 Open-ILS/examples/opensrf.xml.example              |   28 +++++++++
 .../perlmods/lib/OpenILS/Reporter/SQLBuilder.pm    |   12 ++++
 Open-ILS/src/reporter/clark-kent.pl                |   63 ++++++++++++++++----
 .../set_resource_limits_for_reporter.txt           |   32 ++++++++++
 4 files changed, 123 insertions(+), 12 deletions(-)
 create mode 100644 docs/RELEASE_NOTES_NEXT/Administration/set_resource_limits_for_reporter.txt


hooks/post-receive
-- 
Evergreen ILS

