[OPEN-ILS-DEV] Circ History Sorting

Bill Erickson berickxx at gmail.com
Fri Aug 19 10:07:50 EDT 2016


Hi Josh,

It looks like a bug in the code.  The {substream => 1} bit is there to
prevent this type of Ejabberd overload, but it's not sitting in the right
spot.  Try moving the {substream => 1} chunk outside of the editor query
array.  Something like so:

+++ b/Open-ILS/src/perlmods/lib/OpenILS/WWW/EGCatLoader/Account.pm
@@ -1629,9 +1629,9 @@ sub fetch_user_circ_history {
             order_by => {auch => 'xact_start DESC'},
             $flesh ? %flesh_ops : (),
             %limits
-        },
-        {substream => 1}
-    ]);
+        }
+        ], {substream => 1}
+    );
     $e->rollback;

If that works, I'll post a branch.

-b


On Fri, Aug 19, 2016 at 9:29 AM, Josh Stompro <stomproj at exchange.larl.org>
wrote:

> Jason, thanks for the suggestions.  I checked the max_stanza_size  value
> and it matches the opensrf install docs recommendation of 2,000,000.  I'll
> try doubling that and see if it makes any difference.
>
> No change, I still see the same log messages and the same symptoms.
>
> I tried it with 50 items in the history, and the sort does work. 416426
> Bytes.  It looks like it takes 5 seconds for the "Returned 50 result(s)"
> message.  So 260 results might take 26 seconds?
>
> open-ils.cstore 2016-08-19 08:12:47 [INFO:18721:osrf_application.c:1059:14716108371914417]
> CALL: open-ils.cstore open-ils.cstore.direct.action.
> user_circ_history.search.atomic {"usr":136306},{"order_by":{"auch":"xact_start
> DESC"},"flesh":3,"flesh_fields":{"acn":["record"],"
> auch":["target_copy"],"acp":["call_number"]}},{"substream":1}
>
> open-ils.cstore 2016-08-19 08:12:47 [INFO:18721:osrf_app_session.c:1017:14716108371914417]
> [open-ils.cstore] sent 416426 bytes of data to opensrf at private.localhost/
> client_at_virt-egdev1.larl.org_19144
>
> open-ils.cstore 2016-08-19 08:12:47 [INFO:18721:osrf_stack.c:163:14716108371914417]
> Message processing duration 0.390298
>
> [2016-08-19 08:12:52] /usr/sbin/apache2 [INFO:19144:CStoreEditor.pm:139:14716108371914417]
> editor[1|136306] open-ils.cstore.direct.action.user_circ_history.search.atomic:
> returned 50 result(s)
>
> [2016-08-19 08:12:52] /usr/sbin/apache2 [INFO:19144:CStoreEditor.pm:139:14716108371914417]
> editor[1|136306] rolling back db session
>
> [2016-08-19 08:12:52] /usr/sbin/apache2 [INFO:19144:CStoreEditor.pm:139:14716108371914417]
> editor[1|136306] request en-US open-ils.cstore.transaction.rollback []
>
> Josh Stompro - LARL IT Director
>
>
> -----Original Message-----
> From: Open-ils-dev [mailto:open-ils-dev-bounces at list.georgialibraries.org]
> On Behalf Of Jason Stephenson
> Sent: Thursday, August 18, 2016 6:12 PM
> To: Evergreen Development Discussion List
> Subject: Re: [OPEN-ILS-DEV] Circ History Sorting
>
> Josh,
>
> The timeout bit has to do with the client of the cstore making a
> "connected" session and no request coming in. After 6 seconds, the drone
> will close the stateful session so it can be used by another client.
> This happens when something breaks the communication between client and
> drone, or if the client code forgets to disconnect after connecting.
> (I've fixed a bug or two related to this in the past.)
>
> If you look at the line above that, the cstore drone sent 1,582,282 bytes
> of data to the client. At least, it tried to do that.
>
> You should check the max_stanza_size in your ejabberd config. It is
> probably set at 1,000,000 and that response was, therefore, too big.
>
> This would explain beam and apache hitting 100% CPU for a while.
>
> So, my guess is the cstore response was too large given a typical ejabberd
> config. Ejabberd threw the response out and the client never found out, so
> it sat waiting on a response that was not coming. After 6 seconds, the
> drone closed the session opened by the client.
>
> You can try increasing the max_stanza_size in your ejabberd configuration.
>
> HtH,
> Jason
>
> On 08/18/2016 04:21 PM, Josh Stompro wrote:
> > Hello, I'm having trouble with sorting the columns in the my opac,
> > circ history screen.  I've have a 2.10.5 test system setup with all
> > our production data on a single VM.  Maybe the VM just doesn't have
> > enough resources.
> >
> >
> >
> > I have 226 items in my history when I look at action.usr_circ_history.
> >
> >
> >
> > Viewing and paging through the history works fine.  It looks like when
> > doing normal viewing and paging the  cstore query is used directly,
> > and it is limited to 15 items per page.
> >
> >
> >
> > When I click the title column to sort I see the following.  The web
> > page takes about 30 seconds to re-load and says that I have no data in
> > my history.  When looking at atop I see beam.smp go to 100% CPU for 60
> > seconds, and then and apache process at 100% CPU for 45 seconds.
> >
> >
> >
> > When I look at the logs I see the following.  It looks like cstore
> > waits
> > 6 seconds for a response and then gives up... but the system must still
> > be pulling in the data and then processing it, even though cstore gave
> up.
> >
> > [2016-08-17 15:52:25] /usr/sbin/apache2
> > [INFO:8529:CStoreEditor.pm:139:1471433103852926] editor[1|127625]
> > request en-US
> > open-ils.cstore.direct.action.user_circ_history.search.atomic
> > [{"usr":127625},{"flesh_fields":{"auch":["target_copy"],"acp":["call_n
> > umber"],"acn":["record"]},"order_by":{"auch":"xact_start
> > DESC"},"flesh":3},{"substream":1}]
> >
> > open-ils.cstore 2016-08-17 15:52:25
> > [INFO:23530:osrf_application.c:1059:1471433103852927] CALL:
> > open-ils.cstore
> > open-ils.cstore.direct.action.user_circ_history.search.atomic
> > {"usr":127625},{"order_by":{"auch":"xact_start
> > DESC"},"flesh":3,"flesh_fields":{"auch":["target_copy"],"acp":["call_n
> > umber"],"acn":["record"]}},{"substream":1}
> >
> > open-ils.cstore 2016-08-17 15:52:26
> > [INFO:23530:osrf_app_session.c:1017:1471433103852927]
> > [open-ils.cstore] sent 1582282 bytes of data to
> > opensrf at private.localhost/client_at_virt-egdev1.larl.org_8529
> >
> > open-ils.cstore 2016-08-17 15:52:26
> > [INFO:23530:osrf_stack.c:163:1471433103852927] Message processing
> > duration 0.761759
> >
> > open-ils.cstore 2016-08-17 15:52:32
> > [INFO:23530:osrf_prefork.c:496:1471433103852927] No request was
> > received in 6 seconds, exiting stateful session
> >
> > open-ils.cstore 2016-08-17 15:52:32
> > [INFO:23530:osrf_app_session.c:1017:1471433103852927]
> > [open-ils.cstore] sent 196 bytes of data to
> > opensrf at private.localhost/client_at_virt-egdev1.larl.org_8529
> >
> > [2016-08-17 15:53:25] /usr/sbin/apache2
> > [INFO:8529:CStoreEditor.pm:139:1471433103852927] editor[1|127625]
> > request returned no data :
> > open-ils.cstore.direct.action.user_circ_history.search.atomic
> >
> > [2016-08-17 15:53:25] /usr/sbin/apache2
> > [INFO:8529:CStoreEditor.pm:139:1471433103852927] editor[1|127625]
> > rolling back db session
> >
> > [2016-08-17 15:53:25] /usr/sbin/apache2
> > [INFO:8529:CStoreEditor.pm:139:1471433103852927] editor[1|127625]
> > request en-US open-ils.cstore.transaction.rollback []
> >
> >
> >
> > I was looking where the 6 second cstore timeout was set, and all I see
> > in opensrf.xml is the <keepalive> set to 6, is that the timeout?
> >
> >
> >
> > From looking at the code for this feature,
> >
> > http://git.evergreen-ils.org/?p=Evergreen.git;a=commitdiff;h=c8493cabf
> > 211ebfd14bfd179923f8bc6622334d9;hp=f1e09f95eb069494e93da74111c91a5788e
> > e369d
> >
> >
> >
> > It looks like when sorting is enabled, the full list of history items
> > is pulled and the TT code in circ_history.tt2 does the sorting and
> paging.
> > So on each page load the entire list of history items needs to be
> > pulled and processed.  So it goes from having to deal with 15 items
> > max, to
> > 250-5000 items.  I just checked our history table and see that the
> > most history someone has is 5000 items.  Is there any chance that this
> > code can deal with 5000 items in a reasonable time period?  It looks
> > like it is fleshing the record, so including the full marcxml data for
> > each of those 5000 items, mainly to pull out the title non filing
> > info, that gets to be a big chunk of data to move around for each page
> load.
> >
> >
> >
> > As it is, If I go through and click on each of the sort options while
> > I'm waiting for the page to load, which I can see an impatient patron
> > doing to try to elicit a response from the page, I start a new request
> > for each click.  This causes my test vm to become unresponsive to all
> > other catalog requests, with beam.smp sitting at 300% Cpu usage, so an
> > easy way for a user to DOS the system intentionally or unintentionally.
> >
> >
> >
> > What would be a good strategy for handling this more efficiently?
> > Push everything to the database with a custom query that generates the
> > title sort field and negates the need to do processing in
> circ_history.tt2?
> > Are there any materialized views that generate a title_sort field
> > already?  Could this data be cached in memcached?
> >
> >
> >
> > I can see that if I'm really trying to look at my history, I might
> > make a bunch of requests in a short time period, various sorts, and
> > paging through the results, so making this more efficient would help.
> >
> >
> >
> > Josh
> >
> >
> >
> >
> >
> > Lake Agassiz Regional Library - Moorhead MN larl.org
> >
> > Josh Stompro     | Office 218.233.3757 EXT-139
> >
> > LARL IT Director | Cell 218.790.2110
> >
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://libmail.georgialibraries.org/pipermail/open-ils-dev/attachments/20160819/dae1984e/attachment.html>


More information about the Open-ils-dev mailing list