[OPEN-ILS-GENERAL] Apache leaking sockets/FD
Josh Stompro
stomproj at exchange.larl.org
Wed Jul 22 21:14:41 EDT 2015
Greetings, I've been trying to figure out why my two front-end Evergreen application servers keep hitting resource limits related to TCP sockets (the numtcpsock OpenVZ beancounter).
I'm running EG 2.8.2, OpenSRF 2.4.1, and Debian Jessie in an OpenVZ container on Proxmox VE 3.4.
Nothing looks out of the ordinary in the output of 'ss -s' or 'netstat -a', but the numtcpsock counter keeps climbing until it reports 5000+ open TCP sockets.
I think I've narrowed it down to Apache, since restarting Apache brings the numtcpsock count back in line with what 'ss -s' reports.
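The comparison between the beancounter and 'ss -s' could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and it assumes the usual layouts, where /proc/user_beancounters has the columns "resource held maxheld barrier limit failcnt" (with a "<uid>:" prefix on the first row of each container) and 'ss -s' prints a summary line starting with "TCP:" followed by a total.

```python
import re


def held_value(beancounters_text, resource="numtcpsock"):
    """Return the 'held' column for a resource, or None if it is not listed.

    Assumes /proc/user_beancounters-style text (column layout is an assumption).
    """
    for line in beancounters_text.splitlines():
        fields = line.split()
        # The first resource row of a container carries a "<uid>:" prefix.
        if fields and fields[0].endswith(":"):
            fields = fields[1:]
        if len(fields) >= 2 and fields[0] == resource:
            return int(fields[1])
    return None


def ss_tcp_total(ss_summary_text):
    """Return the number that follows 'TCP:' in `ss -s` output, or None."""
    match = re.search(r"^TCP:\s+(\d+)", ss_summary_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def socket_gap(beancounters_text, ss_summary_text):
    """Sockets counted by the beancounter but not shown by ss (may be None)."""
    held = held_value(beancounters_text)
    seen = ss_tcp_total(ss_summary_text)
    if held is None or seen is None:
        return None
    return held - seen
```

Feeding it the contents of /proc/user_beancounters and the output of 'ss -s' at intervals would show whether the gap grows steadily or jumps at particular times.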
If I look at all the open FDs of an Apache process, I see a bunch of the following, so I think some socket connections are being opened but not closed properly.
(lsof -p <pid>)
/usr/sbin 11821 opensrf 171u sock 0,6 0t0 61135031 can't identify protocol
/usr/sbin 11821 opensrf 172u sock 0,6 0t0 61135034 can't identify protocol
/usr/sbin 11821 opensrf 173u sock 0,6 0t0 61135037 can't identify protocol
/usr/sbin 11821 opensrf 174u sock 0,6 0t0 61321969 can't identify protocol
/usr/sbin 11821 opensrf 175u sock 0,6 0t0 61321972 can't identify protocol
/usr/sbin 11821 opensrf 176u sock 0,6 0t0 61321975 can't identify protocol
/usr/sbin 11821 opensrf 177u sock 0,6 0t0 61321978 can't identify protocol
/usr/sbin 11821 opensrf 178u sock 0,6 0t0 61321981 can't identify protocol
/usr/sbin 11821 opensrf 179u sock 0,6 0t0 61458539 can't identify protocol
/usr/sbin 11821 opensrf 180u sock 0,6 0t0 61458542 can't identify protocol
/usr/sbin 11821 opensrf 181u sock 0,6 0t0 61458545 can't identify protocol
/usr/sbin 11821 opensrf 182u sock 0,6 0t0 61458548 can't identify protocol
/usr/sbin 11821 opensrf 183u sock 0,6 0t0 61458551 can't identify protocol
/usr/sbin 11821 opensrf 184u sock 0,6 0t0 62085495 can't identify protocol
/usr/sbin 11821 opensrf 185u sock 0,6 0t0 62085498 can't identify protocol
/usr/sbin 11821 opensrf 186u sock 0,6 0t0 62085501 can't identify protocol
/usr/sbin 11821 opensrf 187u sock 0,6 0t0 62085504 can't identify protocol
/usr/sbin 11821 opensrf 188u sock 0,6 0t0 62085507 can't identify protocol
/usr/sbin 11821 opensrf 189u sock 0,6 0t0 63801157 can't identify protocol
/usr/sbin 11821 opensrf 190u sock 0,6 0t0 63801160 can't identify protocol
/usr/sbin 11821 opensrf 191u sock 0,6 0t0 63801163 can't identify protocol
/usr/sbin 11821 opensrf 192u sock 0,6 0t0 63801166 can't identify protocol
/usr/sbin 11821 opensrf 193u sock 0,6 0t0 63801169 can't identify protocol
/usr/sbin 11821 opensrf 194u sock 0,6 0t0 63961716 can't identify protocol
/usr/sbin 11821 opensrf 195u sock 0,6 0t0 63961719 can't identify protocol
/usr/sbin 11821 opensrf 196u sock 0,6 0t0 63961722 can't identify protocol
/usr/sbin 11821 opensrf 197u sock 0,6 0t0 63961725 can't identify protocol
/usr/sbin 11821 opensrf 198u sock 0,6 0t0 63961728 can't identify protocol
/usr/sbin 11821 opensrf 199u sock 0,6 0t0 64808966 can't identify protocol
/usr/sbin 11821 opensrf 200u sock 0,6 0t0 64808971 can't identify protocol
/usr/sbin 11821 opensrf 201u sock 0,6 0t0 64808974 can't identify protocol
/usr/sbin 11821 opensrf 202u sock 0,6 0t0 64808977 can't identify protocol
/usr/sbin 11821 opensrf 203u sock 0,6 0t0 64808980 can't identify protocol
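To see which Apache children hold the most of these entries, the lsof output can be tallied per PID. The following is a minimal sketch, not a tested tool: the function name is hypothetical, and it assumes lsof's default column order (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) and a command name without spaces, as in the listing above.

```python
from collections import Counter


def count_unidentified_sockets(lsof_text):
    """Count 'can't identify protocol' socket entries per PID in lsof output."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a full entry.
        if len(fields) < 9 or not fields[1].isdigit():
            continue
        name = " ".join(fields[8:])
        if fields[4] == "sock" and name == "can't identify protocol":
            counts[int(fields[1])] += 1
    return counts
```

Running it over the combined 'lsof -p <pid>' output of every Apache process and comparing the totals a few minutes apart would show whether the leak is spread across all children or concentrated in a few.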
I'm not sure how to track down the problem. I'll try using strace to see what connections are being created, but I'm not quite sure what to look for.
If anyone has run into this before, please let me know.
Josh