[OPEN-ILS-DEV] AC timeout change => [GIT] Evergreen ILS branch master updated. e08a5cbfaa14fcd386f63a5f3e29539e6be3f544

Bill Erickson erickson at esilibrary.com
Tue Jun 14 16:15:02 EDT 2011


On 6/14/11 9:17 AM, Dan Scott wrote:
> On Tue, Jun 14, 2011 at 08:41:59AM -0400, Bill Erickson wrote:
>>
>> Hi Dan,
>>
>> I'd like to suggest we not make this change or at least make the
>> default significantly lower.  With a 30-second timeout and a slow or
>> crippled added content provider, it would not take long for the
>> Apache processes to be gobbled up, leaving EG unusable.
>
> Hmm. I guess as you say below that depends on load and the added content
> provider; we've been running with timeout set to 45 seconds and using
> the new OpenLibrary Read API where some requests do take a long time to
> resolve (30 seconds for an ISBN with many editions is not unusual, at
> least in this early stage before they've optimized their own service). I
> thought that with caching integrated into added content, the idea was
> that the initial request would be costly but subsequent requests would
> be cached - therefore spreading out the pain.

Yes, in some environments a high timeout works fine.  I think the right
value is very site-specific.  And, yes, that is the goal of caching.  It
helps a lot, but obviously it doesn't remove the need to make network calls.
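
To illustrate the point about caching: a minimal Python sketch, not Evergreen's actual added content code. The class name, the `fetch` callable, and the TTL are all hypothetical. It assumes the supplied `fetch` function honors the timeout it is given and raises `TimeoutError` when the provider is too slow. Cache hits skip the network, but every miss (and every repeat of a request that timed out, since failures are not cached here) still makes a network call.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedFetcher:
    """Hypothetical added-content fetcher: short timeout plus a simple cache."""

    def __init__(self, fetch: Callable[[str, float], bytes],
                 timeout: float = 1.0, ttl: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch        # assumed to raise TimeoutError when too slow
        self._timeout = timeout    # seconds a worker may wait on the provider
        self._ttl = ttl            # seconds a cached entry stays valid
        self._clock = clock
        self._cache: Dict[str, Tuple[float, bytes]] = {}
        self.network_calls = 0     # counts calls that actually hit the provider

    def get(self, key: str) -> Optional[bytes]:
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # cache hit: no network call
        self.network_calls += 1
        try:
            data = self._fetch(key, self._timeout)
        except TimeoutError:
            # Content is lost for this request, but the caller is released
            # once the fetch gives up. Nothing is cached on failure.
            return None
        self._cache[key] = (now, data)
        return data
```

With a slow provider, repeated requests for an uncached key each wait out the full timeout, which is why the timeout value still matters even with caching in place.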

>
>> My preference would be to keep it at 1 w/ the understanding that
>> users can raise the value if they want to take that risk.  If that's
>> too aggressive for a default, I could maybe see using 2 or 3
>> seconds. Anything higher is unsafe, IMO.  Of course, it depends on
>> the environment.
>
> Keeping it at 1 would be the status quo, and status quo was that I was
> seeing plenty of timeouts at that setting both when we had Syndetics as
> our AC provider and when we switched to OpenLibrary. 2 or 3 would
> definitely be better.

Right, I understand and agree with all of this.  I am suggesting that 
(by default) added content suffer in favor of avoiding denial of 
service.  Since I don't think there is a timeout that will work for 
everyone, my preference is to default to the safest (or reasonably safe) 
option, even if it means losing some content.

> When I raised the default timeout value on IRC a
> week or two back, the general reaction was that 1 seemed low.

I'm sorry I missed that conversation.

>
> If you see AC caching presenting a possible denial of service issue, then
> maybe we should just eliminate the caching entirely, or overhaul it so
> that it draws from a different pool of Apache processes than the main
> Evergreen processes?

I don't see caching as a problem.  The problem (as you explain below) is
Apache processes getting gobbled up.  An overhaul that lets AC calls pull
from a different set of Apache servers would solve the problem.  It
would have to be a true overhaul of added content delivery, though,
given AJAX cross-domain restrictions.

> From what you've described, it sounds like as it
> currently is architected on a single-server system,

Single- or multi-server, since AC requests are spread across all of the
Apache servers, regardless of the number of bricks.

> ...a sufficient
> number of concurrent AC requests would exhaust the available Apache
> processes no matter what the timeout value is set to; it's less likely
> to happen at 1, but still a denial of service waiting to happen.

Agreed.  It's similar to the suggestion in the Evergreen install
instructions to set the KeepAliveTimeout value to 1 (instead of the
default 25).  It's for the same reason.  We're sacrificing speed for
reduced likelihood of DoS.
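
The trade-off can be put in rough numbers. This is a back-of-the-envelope sketch in Python under loudly stated assumptions: the provider is stalled so that every AC request holds its Apache process for the full timeout, and the process limit of 150 is an illustrative figure, not a value taken from this thread. By Little's law, the average number of tied-up processes is the request rate multiplied by the time each request holds a process.

```python
def busy_workers(stalled_requests_per_sec: float, timeout_sec: float) -> float:
    """Average processes tied up waiting on a stalled provider (rate * hold time)."""
    return stalled_requests_per_sec * timeout_sec


def saturating_rate(max_workers: int, timeout_sec: float) -> float:
    """AC request rate (per second) at which all processes end up waiting."""
    return max_workers / timeout_sec


MAX_WORKERS = 150  # assumed Apache process limit, for illustration only

for timeout in (1, 3, 30):
    rate = saturating_rate(MAX_WORKERS, timeout)
    print(f"timeout={timeout:>2}s -> exhausted at {rate:g} stalled AC requests/sec")
```

Under these assumptions, a 30-second timeout is exhausted at 5 stalled requests per second, while a 1-second timeout needs 150 per second. Any timeout can be overwhelmed by enough concurrent requests; a lower one only raises the threshold.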

-b


-- 
Bill Erickson
| VP, Software Development & Integration
| Equinox Software, Inc. / Your Library's Guide to Open Source
| phone: 877-OPEN-ILS (673-6457)
| email: erickson at esilibrary.com
| web: http://esilibrary.com

Equinox is going to New Orleans! Please visit us at booth 550
at ALA Annual to learn more about Koha and Evergreen.

