[Evergreen-general] Dealing with significant traffic increase caused by AI bots

Jane Sandberg sandbergja at gmail.com
Fri Apr 19 07:05:36 EDT 2024


Hi Linda,

It's not for Evergreen, but my colleague recently blocked claudebot using
fail2ban on our load balancer
<https://github.com/pulibrary/princeton_ansible/commit/6f9009249a168442391d90e2b75028d40a8a9e91>.
Essentially, fail2ban is configured to watch Nginx's access log, and if
more than 10 claudebot requests from a single IP appear within one
minute, it automatically blocks all requests from that IP for the next
24 hours.  I would think that something similar could work with
Apache's access log.
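
To make the rule concrete, here is a minimal Python sketch of the same
sliding-window logic. It is not the fail2ban configuration from the commit
above and does not use fail2ban at all; the class name BotBanTracker, its
methods, and the sample user-agent string are made up for illustration.
Timestamps are plain seconds, as if already parsed from the access log.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60            # "within the past minute"
MAX_REQUESTS = 10              # more than 10 matching requests triggers a ban
BAN_SECONDS = 24 * 60 * 60     # the ban lasts 24 hours


class BotBanTracker:
    """Hypothetical stand-in for the fail2ban rule; not fail2ban's own code."""

    def __init__(self, needle="claudebot"):
        self.needle = needle
        self.hits = defaultdict(deque)   # ip -> timestamps of recent matching requests
        self.banned_until = {}           # ip -> time at which the ban expires

    def is_banned(self, ip, now):
        return self.banned_until.get(ip, 0) > now

    def observe(self, ip, user_agent, now):
        """Record one access-log entry; return True if the IP is banned afterwards."""
        if self.is_banned(ip, now):
            return True
        if self.needle not in user_agent.lower():
            return False
        window = self.hits[ip]
        window.append(now)
        # Drop matching requests that have fallen out of the one-minute window.
        while window and window[0] <= now - WINDOW_SECONDS:
            window.popleft()
        if len(window) > MAX_REQUESTS:
            self.banned_until[ip] = now + BAN_SECONDS
            window.clear()
            return True
        return False
```

Feeding it eleven matching requests from one IP inside a minute bans that
IP on the eleventh; requests spaced ten seconds apart never cross the
threshold. In a real deployment fail2ban does the log parsing and the
firewall update itself, so this only illustrates the counting.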

Good luck with the bots!

  -Jane

On Fri, Apr 19, 2024 at 3:42 a.m., Linda Jansová via Evergreen-general (
evergreen-general at list.evergreen-ils.org) wrote:

> Dear all,
>
> Have any of you encountered extensive crawling by Bytespider and
> Bytedance (see e.g.
> https://wordpress.org/support/topic/psa-bytedance-and-bytespider-bots-recommend-blocking/),
> Claudebot, or other AI bots?
>
> If so, do you have a secret recipe for blocking the crawler from
> accessing the site?
>
> Thank you very much for sharing your experience!
>
> Linda
>
> _______________________________________________
> Evergreen-general mailing list
> Evergreen-general at list.evergreen-ils.org
> http://list.evergreen-ils.org/cgi-bin/mailman/listinfo/evergreen-general
>