[OPEN-ILS-GENERAL] Reporting module appears stuck
jason at sigio.com
Sun Jun 2 08:34:00 EDT 2019
It sounds like you have dead reports that are preventing new reports from
starting. When a report dies or is killed, it isn't cleaned up, and
Clark will think that it is still running.
First, check if Clark is running and running any reports:
pgrep -af Clark
If you run that on the server where the reporter runs, you should get
output like this:
7180 Clark Kent, waiting for trouble
The number is the process ID, so yours will be different. If any reports are
running, there will be additional lines similar to the one above, each
showing some portion of the report's name:
7201 Clark Kent reporting: [Report Name]
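As a rough sketch of automating that check, the helper below counts the lines that carry the "reporting:" marker shown in the sample output. The function name is made up for illustration, and it assumes the process titles look exactly like the samples above.

```shell
# Count Clark processes that are actively running a report.
# Reads `pgrep -af Clark`-style lines on stdin and counts those carrying
# the "reporting:" marker from the sample output.
# (count_running_reports is a hypothetical helper name.)
count_running_reports() {
    # [C]lark keeps this grep's own command line from matching "Clark"
    # when pgrep runs in the same pipeline.
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c '[C]lark Kent reporting:' || true
}

# Usage on the reporter server:
#   pgrep -af Clark | count_running_reports
```

A result of 0 while the "waiting for trouble" line is present would suggest that Clark is up but idle.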
If no reports are currently running, then it is safe to proceed with the
cleanup described below. To check for dead reports, run the following query:
select * from reporter.currently_running;
There can be up to "parallel" number of rows in that view, and when
there are that many, Clark will not start new reports. ("Parallel" is
the reporter/setup/parallel setting from opensrf.xml.)
If you have any rows in that view, and no reports are currently running,
it is advisable to clear them out. You do that by setting the
complete_time on the listed reports. I have attached a SQL script that
I use for this purpose. It not only sets the complete_time, but also
sets the error_code and error_text to something semi-useful for our
environment. You might want to change that to suit your situation.
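Since the attachment does not survive in the list archive, here is a minimal sketch of what such a cleanup could look like. It is not the attached script. It assumes that complete_time, error_code and error_text live on reporter.schedule and that reporter.currently_running exposes the schedule id; the error values, the function name, and the psql user and database in the usage line are placeholders.

```shell
# Print SQL that marks every report listed in reporter.currently_running
# as finished. (dead_report_cleanup_sql is a hypothetical helper name.)
dead_report_cleanup_sql() {
    cat <<'SQL'
BEGIN;
-- Assumption: the view's id column is the reporter.schedule id.
UPDATE reporter.schedule
   SET complete_time = now(),
       error_code    = 1,                                      -- placeholder value
       error_text    = 'Report interrupted; cleared manually'  -- placeholder text
 WHERE id IN (SELECT id FROM reporter.currently_running);
COMMIT;
SQL
}

# Usage, only after confirming that no reports are running
# (the user and database names here are hypothetical):
#   dead_report_cleanup_sql | psql -U evergreen -d evergreen
```

Printing the SQL first makes it easy to review, or to adjust the error_code and error_text, before anything touches the database.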
On 6/1/19 6:12 PM, JonGeorg SageLibrary wrote:
> Greetings, I've run into an issue where the reporting module does not
> appear to want to restart.
> Reports are run on the log server against the replicated database server.
> Normally what I do is:
> * just restart it
> per http://docs.evergreen-ils.org/3.1/_starting_and_stopping_the_reporter_daemon.html as
> opensrf user
> I've also done the following:
> * Restarted all osrf services on the application and log servers along
> with ejabberd/memcached where applicable.
> * Killed all processes on the database server older than 2 minutes.
> * Re-ran replication of the production server to replicated database
> server. I did this just to rule out that there was not an issue with
> the replicated copy because we did have a fines issue that was
> related to the replication at one point.
> * I ran "SELECT now()-query_start,pid,state,application_name,waiting
> FROM pg_stat_activity;" but had to remove ",waiting" as it threw an error.
> o That returns a list of processes like open-ils.cstore,
> open-ils.pcrud, open-ils.reporter-store and the like. I
> attempted to kill the old reporter-store processes with the
> command "SELECT pg_cancel_backend(backend_pid);" and Clark
> stopped, and while it returned a value of true showing the
> process was dead, when I re-ran it, it appears to still be present.
> I don't see anything else
> under http://docs.evergreen-ils.org/reorg/3.1/command_line_admin/Evergreen_Documentation.pdf
> or https://wiki.evergreen-ils.org/doku.php?id=scratchpad:random_magic_spells.
> The only thing I haven't tried, but shouldn't need to, is to actually
> restart that server, but am waiting until there is someone physically
> present in case it does not properly restart on its own.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 363 bytes
Desc: not available