[OPEN-ILS-GENERAL] Reporting module appears stuck

Jason Stephenson jason at sigio.com
Sun Jun 2 15:07:57 EDT 2019


At this point, you are beyond my knowledge of the reporter.  However, it
sounds like you stopped two reports while they were running.  That's
generally not a good idea.
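
Before stopping Clark, it helps to confirm that no report is mid-run. As a rough sketch of the pgrep check described in the quoted message below, the shell function here classifies that output. The function name clark_status is made up for illustration, and it assumes only the two line shapes shown in this thread. The patterns are anchored on the leading PID so that the awk process itself, whose command line also contains "Clark Kent", is not counted.

```shell
# Classify `pgrep -af Clark` output read from stdin.
# Assumes the two line shapes quoted in this thread:
#   "<pid> Clark Kent, waiting for trouble"   (idle daemon)
#   "<pid> Clark Kent reporting: <name>"      (a report in progress)
# clark_status is a hypothetical helper, not part of Evergreen.
clark_status() {
    awk '
        /^[0-9]+ Clark Kent reporting:/          { running++ }
        /^[0-9]+ Clark Kent, waiting for trouble/ { daemon++ }
        END {
            if (running > 0)      print "busy " running
            else if (daemon > 0)  print "idle"
            else                  print "stopped"
        }
    '
}
```

On the reporter server this would be used as `pgrep -af Clark | clark_status`. Anything other than "idle" or "stopped" suggests waiting before killing processes or removing the lock.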

On 6/2/19 1:21 PM, JonGeorg SageLibrary wrote:
> Thank you so much for responding and that script is essentially what I
> was looking for, as I knew there had to be a way to view stuck reports. 
> 
> The first time I ran *pgrep -af Clark* it returned 2 report names.
> However, since killing those two processes and restarting Clark, I get
> nothing but the "Clark Kent, waiting for trouble" line. Yes, I removed
> the reporter-lock folder when restarting Clark and did everything as
> the opensrf user.
> 
> When I run *select * from reporter.currently_running* I get no output.
> Just to double check I ran it on both the production and replicated
> databases with the same result. However when I go to my version of the
> staff client under reports, I show a stuck report from yesterday still
> present in the queue.
> 
> Ideas?
> -Jon
> 
> On Sun, Jun 2, 2019 at 5:34 AM Jason Stephenson <jason at sigio.com> wrote:
> 
>     It sounds like you have dead reports that are preventing new reports from
>     starting.  When reports die or are killed, they aren't cleaned up, and
>     Clark will think that they are still running.
> 
>     First, check if Clark is running and running any reports:
> 
>         pgrep -af Clark
> 
>     If you run that on the server where the reporter runs, you should get
>     output like this:
> 
>         7180 Clark Kent, waiting for trouble
> 
>     The number is the process ID, so yours will be different.  If any reports
>     are running, there will be additional lines similar to the above, each
>     with some portion of the report's name:
> 
>         7201 Clark Kent reporting: [Report Name]
> 
>     If no reports are currently running, then it is safe to do the following
>     steps.
> 
>     To check for dead reports, run the following query:
> 
>         select * from reporter.currently_running
> 
>     There can be up to "parallel" number of rows in that view, and when
>     there are that many, Clark will not start new reports.  ("Parallel" is
>     the reporter/setup/parallel setting from opensrf.xml.)
> 
>     If you have any rows in that view, and no reports are currently running,
>     it is advisable to clear them out.  You do that by setting the
>     complete_time on the listed reports.  I have attached a SQL script that
>     I use for this purpose.  It not only sets the complete_time, but also
>     sets the error_code and error_text to something semi-useful for our
>     environment.  You might want to change that to suit your situation.
> 
>     HtH,
>     Jason
> 
>     On 6/1/19 6:12 PM, JonGeorg SageLibrary wrote:
>     > Greetings, I've run into an issue where the reporting module does not
>     > appear to want to restart. 
>     >
>     > Reports are run on the log server against the replicated database
>     > server.
>     > Normally what I do is: 
>     >
>     >   * just restart it per
>     >     http://docs.evergreen-ils.org/3.1/_starting_and_stopping_the_reporter_daemon.html
>     >     as opensrf user
>     >
>     > I've also done the following:
>     >
>     >   * Restarted all osrf services on the application and log servers
>     >     along with ejabberd/memcached where applicable.
>     >   * Killed all processes on the database server older than 2 minutes.
>     >   * Re-ran replication of the production server to replicated database
>     >     server. I did this just to rule out that there was not an issue
>     >     with the replicated copy because we did have a fines issue that
>     >     was related to the replication at one point.
>     >   * I ran "SELECT now()-query_start,pid,state,application_name,waiting
>     >     FROM pg_stat_activity;" but had to remove ",waiting" as it threw
>     >     an error.
>     >       o That returns a list of processes like open-ils.cstore,
>     >         open-ils.pcrud, open-ils.reporter-store and the like. I
>     >         attempted to kill the old reporter-store processes with the
>     >         command "SELECT pg_cancel_backend(backend_pid);" and Clark
>     >         stopped, and while it returned a value of true showing the
>     >         process was dead, when I re-ran it, it appears to still be
>     >         present.
>     >
>     > I don't see anything else under
>     > http://docs.evergreen-ils.org/reorg/3.1/command_line_admin/Evergreen_Documentation.pdf
>     > or
>     > https://wiki.evergreen-ils.org/doku.php?id=scratchpad:random_magic_spells.
>     >
>     > The only thing I haven't tried, but shouldn't need to, is to actually
>     > restart that server, but am waiting until there is someone physically
>     > present in case it does not properly restart on its own.
>     >
>     > -Jon
>     >
>     >
> 
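
The SQL script mentioned in the quoted message did not come through with this post. What follows is only a sketch of the cleanup it describes (setting complete_time, error_code and error_text on dead reports), not that script. It assumes that the rows behind reporter.currently_running live in reporter.schedule, and that the table has start_time, complete_time, error_code and error_text columns; check this against your own schema first. The function name, error code and error text are placeholders.

```shell
# Print SQL that marks started-but-unfinished report runs as failed.
# Assumptions (not confirmed in this thread): the underlying table is
# reporter.schedule, with start_time, complete_time, error_code and
# error_text columns. stale_report_sql is a hypothetical helper.
# Only run the result when pgrep shows no report in progress.
stale_report_sql() {
    cat <<'SQL'
BEGIN;
UPDATE reporter.schedule
   SET complete_time = now(),
       error_code = 1,
       error_text = 'Marked complete by hand after the reporter was interrupted'
 WHERE start_time IS NOT NULL
   AND complete_time IS NULL;
SELECT * FROM reporter.currently_running;  -- expected to return no rows now
COMMIT;
SQL
}
```

One way to use it would be `stale_report_sql | psql -U evergreen -d evergreen`, where the user and database names are guesses for a typical install. It would need to run against the production database, since a replica is normally read-only.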

