[Evergreen-dev] Problematic bot traffic

Shula Link slink at gchrl.org
Thu Feb 13 16:49:41 EST 2025


It's not just Evergreen sites. I had to block all traffic from Hong Kong to
our system website after we had a greater than 10x increase in visitors
overnight. I tried blocking by IP, but the addresses kept changing, so it
ended up being easier to block everything from there.

Shula Link (she/her)
Systems Services Librarian
Greater Clarks Hill Regional Library
slink at columbiacountyga.gov | slink at gchrl.org
706-447-6702


On Thu, Feb 13, 2025 at 4:46 PM Blake Graham-Henderson via Evergreen-dev <
evergreen-dev at list.evergreen-ils.org> wrote:

> All,
>
> I almost replied with the arstechnica article that Josh linked when the
> thread was started. But I decided not to put it out there until I had set up
> a test system to see if I could get that code working. A tarpit, I think,
> serves them right. And, of course, the whole issue is destined to receive
> the fate of spam and spam filters forever and ever.
>
> It was a serendipitously timed article. Its existence at this moment in
> time signals to me that this isn't a "just us" problem. It's the entire
> planet.
>
> -Blake-
> Conducting Magic
> Will consume any data format
> MOBIUS
>
>
> On 2/13/2025 3:10 PM, Josh Stompro via Evergreen-dev wrote:
>
> Jeff, thanks for bringing this up on the list.
>
> We are seeing a lot of requests like
>  "GET /eg/opac/mylist/delete?anchor=record_184821&record=184821" from
> never-before-seen IPs, which make 1-12 requests and then stop.
>
> They usually seem to have a random, out-of-date Chrome version
> in the user agent string.
> Chrome/88.0.4324.192
> Chrome/86.0.4240.75
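
The pattern described above (requests to the mylist delete URL carrying an old Chrome version) could be flagged in access logs roughly as follows. This is a minimal sketch, not anything from the thread: the cutoff MIN_CHROME_MAJOR, the function name is_suspicious, and the assumption that each log line contains both the request and the user agent string are all hypothetical.

```python
import re

# Hypothetical cutoff: Chrome major versions below this count as "out of date".
MIN_CHROME_MAJOR = 100

# Matches a Chrome version token such as "Chrome/88.0.4324.192".
CHROME_RE = re.compile(r"Chrome/(\d+)\.\d+\.\d+\.\d+")
# Matches the request path reported in the thread.
MYLIST_RE = re.compile(r"GET /eg/opac/mylist/delete\?")


def is_suspicious(log_line: str) -> bool:
    """Return True if the line requests the mylist delete URL with an old Chrome UA."""
    if not MYLIST_RE.search(log_line):
        return False
    match = CHROME_RE.search(log_line)
    # Lines without a Chrome token are not flagged by this check.
    return bool(match) and int(match.group(1)) < MIN_CHROME_MAJOR
```

Real browsers on old hardware would also match, so a check like this is probably better suited to collecting candidate IPs for review than to blocking outright.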
>
> I've been trying to slow down the bots by collecting logs, grabbing all
> the obvious patterns, and blocking netblocks for non-US ranges.  ipinfo.io
> offers a free country & ASN database download that I've been using to look
> up the ranges and countries. (https://ipinfo.io/products/free-ip-database)
> I would be happy to share a link to our current blocklist, which has 10K
> non-US ranges.
>
> I've also been reporting the non US bot activity to
> https://www.abuseipdb.com/ just to bring some visibility to these bad
> bots.  I noticed initially that many of the IPs that we were getting hit
> from didn't seem to be listed on any blocklists already, so I figured some
> reporting might help.  I'm kind of curious if Evergreen sites are getting
> hit from the same IPs; if so, an Evergreen-specific blocklist would be useful.
> If you look up your bot IPs on abuseipdb.com you can see if I've already
> reported any of them.
>
> I've also been making use of blocklists from https://iplists.firehol.org/,
> such as:
> https://iplists.firehol.org/files/cleantalk_30d.ipset
> https://iplists.firehol.org/files/botscout_7d.ipset
> https://iplists.firehol.org/files/firehol_abusers_1d.netset
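
Checking an address against lists like these can be sketched with Python's standard ipaddress module. The sketch assumes the list text has one IP or CIDR range per line with "#" comment lines, which should be verified against the actual files; the function names parse_netset and is_blocked are hypothetical.

```python
import ipaddress


def parse_netset(text: str) -> list:
    """Parse list text (assumed: one IP or CIDR per line, '#' comments) into networks."""
    networks = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # A bare IP becomes a single-address network (/32 or /128).
        networks.append(ipaddress.ip_network(line, strict=False))
    return networks


def is_blocked(ip: str, networks: list) -> bool:
    """Return True if the address falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

A linear scan is fine for trying this out, but with tens of thousands of ranges a proxy-level map or kernel ipset is likely the better place to do the lookup.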
>
> We are using HAProxy so I did some looking into the CrowdSec HAProxy
> Bouncer (https://docs.crowdsec.net/u/bouncers/haproxy/) but I'm not sure
> that would help since these IPs don't seem to be on blocklists.  But I may
> just not quite understand how CrowdSec is supposed to work.
>
> HAProxy Enterprise has a ReCaptcha module that I think would allow us to
> feed any non-US connections that haven't connected before through a
> recaptcha, but the price for HAProxy Enterprise is out of our budget.
> https://www.haproxy.com/blog/announcing-haproxy-enterprise-3-0#new-captcha-and-saml-modules
>
> There is also a fairly up to date project for adding Captchas through
> haproxy at
> https://github.com/ndbiaw/haproxy-protection. It looks promising: it
> requires new connections to perform a JavaScript proof of work
> calculation before allowing access, which could be a good transparent
> way of handling it.
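
For readers unfamiliar with the proof of work idea mentioned above, here is a generic hash-based sketch. It is not the algorithm used by the haproxy-protection project; the challenge string, the difficulty parameter, and the function names are assumptions made for illustration. The client has to search for a nonce, while the server only needs one hash to check it.

```python
import hashlib
from itertools import count


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one SHA-256 hash checks the client's answer."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    # The answer is valid if the hex digest starts with `difficulty` zeros.
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force search for a nonce that satisfies verify()."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

Each extra hex digit of difficulty makes the client's search about 16 times longer on average while the server's check stays at a single hash, which is what makes the cost fall on the party sending many requests.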
>
> We were taken out by ChatGPT bots back in December, whose netblocks were
> a bit easier to block since they were not as spread out.  I
> recently saw this article about how some people are fighting back against
> bots that ignore robots.txt,
> https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/
>
> Josh
>
> On Mon, Jan 27, 2025 at 6:33 PM Jeff Davis via Evergreen-dev <
> evergreen-dev at list.evergreen-ils.org> wrote:
>
>> Hi folks,
>>
>> Our Evergreen environment has been experiencing a higher-than-usual
>> volume of unwanted bot traffic in recent months. Much of this traffic looks
>> like webcrawlers hitting Evergreen-specific URLs from an enormous number of
>> different IP addresses. Judging from discussion in IRC last week, it sounds
>> like other EG admins have been seeing the same thing. Does anyone have any
>> recommendations for managing this traffic and mitigating its impact?
>>
>> Some solutions that have been suggested/implemented so far:
>> - Geoblocking entire countries.
>> - Using Cloudflare's proxy service. There's some trickiness in getting
>> this to work with Evergreen.
>> - Putting certain OPAC pages behind a captcha.
>> - Deploying publicly-available blocklists of "bad bot"
>> IPs/useragents/etc. (good but limited, and not EG-specific).
>> - Teaching EG to identify and deal with bot traffic itself (but arguably
>> this should happen before the traffic hits Evergreen).
>>
>> My organization is currently evaluating CrowdSec as another possible
>> solution. Any opinions on any of these approaches?
>> --
>> Jeff Davis
>> BC Libraries Cooperative
>> _______________________________________________
>> Evergreen-dev mailing list
>> Evergreen-dev at list.evergreen-ils.org
>> http://list.evergreen-ils.org/cgi-bin/mailman/listinfo/evergreen-dev
>>
>