[OPEN-ILS-GENERAL] Apache leaking sockets/FD

Josh Stompro stomproj at exchange.larl.org
Wed Jul 22 21:14:41 EDT 2015


Greetings. I've been trying to figure out why my two front-end Evergreen application servers keep hitting the OpenVZ resource limit on TCP sockets (the numtcpsock beancounter).

I'm running EG 2.8.2, OpenSRF 2.4.1, and Debian Jessie in an OpenVZ container on Proxmox VE 3.4.

Nothing looks out of the ordinary in the output of 'ss -s' or 'netstat -a', but the numtcpsock counter keeps climbing until it reports 5000+ open TCP sockets.

I think I've narrowed it down to Apache, since restarting Apache brings the numtcpsock count back in line with what 'ss -s' reports.
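
For anyone wanting to watch the same gap over time, the comparison can be scripted. This is only a rough Python sketch: it assumes the usual /proc/user_beancounters column layout (resource, held, maxheld, barrier, limit, failcnt) and that 'ss -s' prints a summary line starting with "TCP:". The function names are made up for illustration.

```python
import re

def read_beancounter(text, resource="numtcpsock"):
    """Return held/maxheld/barrier/limit/failcnt for one beancounter row.

    `text` is the content of /proc/user_beancounters (assumed layout).
    Returns None if the resource is not listed.
    """
    fields = ("held", "maxheld", "barrier", "limit", "failcnt")
    for line in text.splitlines():
        parts = line.split()
        # The first row of each container block is prefixed with "<ctid>:".
        if parts and parts[0].endswith(":"):
            parts = parts[1:]
        if len(parts) == 6 and parts[0] == resource:
            return dict(zip(fields, (int(p) for p in parts[1:])))
    return None

def ss_tcp_total(text):
    """Return the total from the 'TCP:' summary line of 'ss -s', or None."""
    m = re.search(r"^TCP:\s+(\d+)", text, re.MULTILINE)
    return int(m.group(1)) if m else None

# Possible use inside the container (as root):
#   import subprocess
#   ubc = open("/proc/user_beancounters").read()
#   ss = subprocess.check_output(["ss", "-s"], universal_newlines=True)
#   print(read_beancounter(ubc)["held"] - ss_tcp_total(ss))
```

A gap that keeps growing between Apache restarts would be consistent with sockets that the container is still being charged for but that no longer appear in the normal socket tables.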

If I look at all the open FDs of an Apache process, I see a bunch of the following, so I think some socket connections are being opened but not closed properly.

(lsof -p <pid>)

/usr/sbin 11821 opensrf  171u  sock                0,6      0t0 61135031 can't identify protocol
/usr/sbin 11821 opensrf  172u  sock                0,6      0t0 61135034 can't identify protocol
/usr/sbin 11821 opensrf  173u  sock                0,6      0t0 61135037 can't identify protocol
/usr/sbin 11821 opensrf  174u  sock                0,6      0t0 61321969 can't identify protocol
/usr/sbin 11821 opensrf  175u  sock                0,6      0t0 61321972 can't identify protocol
/usr/sbin 11821 opensrf  176u  sock                0,6      0t0 61321975 can't identify protocol
/usr/sbin 11821 opensrf  177u  sock                0,6      0t0 61321978 can't identify protocol
/usr/sbin 11821 opensrf  178u  sock                0,6      0t0 61321981 can't identify protocol
/usr/sbin 11821 opensrf  179u  sock                0,6      0t0 61458539 can't identify protocol
/usr/sbin 11821 opensrf  180u  sock                0,6      0t0 61458542 can't identify protocol
/usr/sbin 11821 opensrf  181u  sock                0,6      0t0 61458545 can't identify protocol
/usr/sbin 11821 opensrf  182u  sock                0,6      0t0 61458548 can't identify protocol
/usr/sbin 11821 opensrf  183u  sock                0,6      0t0 61458551 can't identify protocol
/usr/sbin 11821 opensrf  184u  sock                0,6      0t0 62085495 can't identify protocol
/usr/sbin 11821 opensrf  185u  sock                0,6      0t0 62085498 can't identify protocol
/usr/sbin 11821 opensrf  186u  sock                0,6      0t0 62085501 can't identify protocol
/usr/sbin 11821 opensrf  187u  sock                0,6      0t0 62085504 can't identify protocol
/usr/sbin 11821 opensrf  188u  sock                0,6      0t0 62085507 can't identify protocol
/usr/sbin 11821 opensrf  189u  sock                0,6      0t0 63801157 can't identify protocol
/usr/sbin 11821 opensrf  190u  sock                0,6      0t0 63801160 can't identify protocol
/usr/sbin 11821 opensrf  191u  sock                0,6      0t0 63801163 can't identify protocol
/usr/sbin 11821 opensrf  192u  sock                0,6      0t0 63801166 can't identify protocol
/usr/sbin 11821 opensrf  193u  sock                0,6      0t0 63801169 can't identify protocol
/usr/sbin 11821 opensrf  194u  sock                0,6      0t0 63961716 can't identify protocol
/usr/sbin 11821 opensrf  195u  sock                0,6      0t0 63961719 can't identify protocol
/usr/sbin 11821 opensrf  196u  sock                0,6      0t0 63961722 can't identify protocol
/usr/sbin 11821 opensrf  197u  sock                0,6      0t0 63961725 can't identify protocol
/usr/sbin 11821 opensrf  198u  sock                0,6      0t0 63961728 can't identify protocol
/usr/sbin 11821 opensrf  199u  sock                0,6      0t0 64808966 can't identify protocol
/usr/sbin 11821 opensrf  200u  sock                0,6      0t0 64808971 can't identify protocol
/usr/sbin 11821 opensrf  201u  sock                0,6      0t0 64808974 can't identify protocol
/usr/sbin 11821 opensrf  202u  sock                0,6      0t0 64808977 can't identify protocol
/usr/sbin 11821 opensrf  203u  sock                0,6      0t0 64808980 can't identify protocol
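
To see which Apache children hold the most of these, the lsof output can be tallied per PID. A small Python sketch follows; it assumes lsof's default nine-column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) with no spaces in the command name, and the function name is hypothetical.

```python
from collections import Counter

def count_unidentified_sockets(lsof_text):
    """Count 'sock ... can't identify protocol' rows per PID in lsof output."""
    counts = Counter()
    for line in lsof_text.splitlines():
        # Split into at most 9 columns so NAME keeps its internal spaces.
        parts = line.split(None, 8)
        if (len(parts) == 9
                and parts[4] == "sock"
                and parts[8].strip() == "can't identify protocol"):
            counts[int(parts[1])] += 1
    return counts

# Possible use (as root, so every process is visible):
#   import subprocess
#   out = subprocess.check_output(["lsof", "-n"], universal_newlines=True)
#   for pid, n in count_unidentified_sockets(out).most_common(10):
#       print(pid, n)
```

Running it periodically and watching whether the per-PID counts only ever grow would help confirm a leak rather than normal churn.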

I'm not sure how to track down the problem. I'll try using strace to see what connections are being created, but I'm not quite sure what to look for.
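
One thing to look for in a trace is a socket FD that gets created but never closed. The Python sketch below pairs socket()/accept() results with later close() calls in strace output. It is a rough aid under several assumptions: the trace was captured with something like 'strace -f -e trace=socket,accept,accept4,close -o /tmp/apache.trace -p <pid>' (the output path is only an example), no timestamp options were used, each traced PID is a separate process (as with a prefork Apache), and interleaved "<unfinished ...>" lines are simply skipped. The function name is hypothetical.

```python
import re

# Optional "[pid N] " or "N " prefix, then call(args) = return value.
_LINE = re.compile(
    r"^(?:\[pid\s+(\d+)\]\s+|(\d+)\s+)?(\w+)\((.*)\)\s+=\s+(-?\d+)"
)

def unclosed_sockets(strace_text):
    """Return {(pid, fd): creating line} for sockets never seen closed."""
    open_fds = {}
    for line in strace_text.splitlines():
        m = _LINE.match(line)
        if not m:
            continue  # unfinished/resumed lines, signals, exit notices
        pid = m.group(1) or m.group(2) or "-"
        call, args, ret = m.group(3), m.group(4), int(m.group(5))
        if call in ("socket", "accept", "accept4") and ret >= 0:
            # The return value is the new file descriptor.
            open_fds[(pid, ret)] = line.strip()
        elif call == "close" and ret == 0:
            fd = args.strip()
            if fd.isdigit():
                open_fds.pop((pid, int(fd)), None)
    return open_fds

# Possible use:
#   for (pid, fd), line in unclosed_sockets(open("/tmp/apache.trace").read()).items():
#       print(pid, fd, line)
```

Whatever is left in the result when the trace ends is only a candidate: long-lived connections that are still legitimately open will show up too, so the lines worth chasing are the ones whose FD numbers match the unidentified sockets in lsof.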

If anyone has run into this before, please let me know.
Josh