I have a production MongoDB 2.4 that's giving this error in the logs:
SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:37934]
The client's logs report a connection timeout.
mongod --version
reports v2.4.10 (I'm on Debian jessie; MongoDB is installed from the jessie repositories) and the net.ipv4.tcp_keepalive_time
sysctl is set to 300 as recommended by SocketException in Mongo logs. Also, the error is still present when triggering a database query directly after restarting MongoDB and the application, which to me suggests that this is not SERVER-5632 (mentioned in the above question).
/proc/pid/limits
reports unlimited for everything except stack size, core file size, processes, locked memory, pending signals, msgqueue size, nice and realtime priority, and open files. This last one in particular has a soft limit of 1024 and a hard limit of 4096, which isn't what the MongoDB docs recommend but shouldn't be a problem anyway since ls /proc/pid/fd | wc -l
reports only 29 open fds.
free -g
reports 3 GB of free memory, +/- buffers and cache. Rebooting the entire server seems to help somewhat for a short time, but then it gets just as bad again. Here's a snippet from the log:
Wed Mar 1 20:14:17.466 [conn1] authenticate db: pumpio { authenticate: 1, user: "pump.io", nonce: "7ac439ffd4c01a6d", key: "abe7862c7eb1f9765ebaa007d7c20ec2" }
Wed Mar 1 20:14:26.445 [conn1] command pumpio.$cmd command: { authenticate: 1, user: "pump.io", nonce: "7ac439ffd4c01a6d", key: "abe7862c7eb1f9765ebaa007d7c20ec2" } ntoreturn:1 keyUpdates:0 locks(micros) W:8978364 r:412 reslen:74 8979ms
Wed Mar 1 20:14:41.809 [initandlisten] connection accepted from 127.0.0.1:58216 #2 (2 connections now open)
Wed Mar 1 20:14:46.302 [initandlisten] connection accepted from 127.0.0.1:58218 #3 (3 connections now open)
Wed Mar 1 20:14:49.212 [initandlisten] connection accepted from 127.0.0.1:58220 #4 (4 connections now open)
Wed Mar 1 20:14:49.212 [initandlisten] connection accepted from 127.0.0.1:58222 #5 (5 connections now open)
Wed Mar 1 20:14:49.212 [initandlisten] connection accepted from 127.0.0.1:58224 #6 (6 connections now open)
Wed Mar 1 20:14:49.213 [initandlisten] connection accepted from 127.0.0.1:58226 #7 (7 connections now open)
Wed Mar 1 20:14:49.213 [initandlisten] connection accepted from 127.0.0.1:58228 #8 (8 connections now open)
Wed Mar 1 20:14:49.213 [initandlisten] connection accepted from 127.0.0.1:58230 #9 (9 connections now open)
Wed Mar 1 20:14:49.213 [initandlisten] connection accepted from 127.0.0.1:58232 #10 (10 connections now open)
Wed Mar 1 20:14:49.213 [initandlisten] connection accepted from 127.0.0.1:58234 #11 (11 connections now open)
Wed Mar 1 20:14:49.213 [initandlisten] connection accepted from 127.0.0.1:58236 #12 (12 connections now open)
Wed Mar 1 20:14:50.685 [conn1] query pumpio.session query: { _id: "9n_7gseqgT8i07tQbnBj7gNgURdQNmS8" } ntoreturn:1 idhack:1 keyUpdates:0 locks(micros) r:13911898 reslen:202 17129ms
Wed Mar 1 20:14:50.695 [conn2] authenticate db: pumpio { authenticate: 1, user: "pump.io", nonce: "7e918deb40e23f92", key: "0b4b2d78fa404e594dc1d066c9402cc4" }
Wed Mar 1 20:14:52.960 [conn1] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:58214]
Wed Mar 1 20:14:52.960 [conn7] assertion 16550 not authorized for query on reportr.system.indexes ns:reportr.system.indexes query:{ ns: "reportr.alerts" }
Wed Mar 1 20:14:52.960 [conn12] assertion 16550 not authorized for query on reportr.system.indexes ns:reportr.system.indexes query:{ ns: "reportr.alerts" }
Wed Mar 1 20:14:52.960 [conn5] assertion 16550 not authorized for query on reportr.system.indexes ns:reportr.system.indexes query:{ ns: "reportr.events" }
Wed Mar 1 20:14:52.960 [conn10] assertion 16550 not authorized for query on reportr.system.indexes ns:reportr.system.indexes query:{ ns: "reportr.events" }
Wed Mar 1 20:14:58.713 [initandlisten] connection accepted from 127.0.0.1:58238 #13 (12 connections now open)
Wed Mar 1 20:14:58.715 [conn13] authenticate db: pumpio { authenticate: 1, user: "pump.io", nonce: "fe508e5612e933fa", key: "492c9d11f582c045941da7f7eb16b8bd" }
Wed Mar 1 20:15:07.599 [conn2] update pumpio.session query: { _id: "i_vJ67QJA56LsQvLMF8GuiosMHUj7--P" } update: { cookie: { originalMaxAge: null, expires: null, httpOnly: true, path: "/" }, _id: "i_vJ67QJA56LsQvLMF8GuiosMHUj7--P" } idhack:1 nupdated:1 upsert:1 keyUpdates:0 locks(micros) w:16896964 16897ms
Wed Mar 1 20:15:07.600 [conn2] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:58216]
Wed Mar 1 20:15:07.600 [conn13] command pumpio.$cmd command: { authenticate: 1, user: "pump.io", nonce: "fe508e5612e933fa", key: "492c9d11f582c045941da7f7eb16b8bd" } ntoreturn:1 keyUpdates:0 locks(micros) r:1225 reslen:74 8885ms
Wed Mar 1 20:15:08.050 [conn13] update pumpio.session query: { _id: "deiIhAPc1vaXYZ_JzU72slghiCMOu1Zf" } update: { cookie: { originalMaxAge: null, expires: null, httpOnly: true, path: "/" }, _id: "deiIhAPc1vaXYZ_JzU72slghiCMOu1Zf" } idhack:1 nupdated:1 upsert:1 keyUpdates:0 locks(micros) w:109 447ms
Wed Mar 1 20:15:15.592 [TTLMonitor] query local.system.indexes query: { expireAfterSeconds: { $exists: true } } ntoreturn:0 ntoskip:0 nscanned:0 keyUpdates:0 locks(micros) r:142680 nreturned:0 reslen:20 142ms
Wed Mar 1 20:15:30.773 [conn13] query pumpio.recentdialbackrequests query: { _id: 0 } ntoreturn:1 idhack:1 keyUpdates:0 locks(micros) r:2751358 reslen:95317 2751ms
Wed Mar 1 20:16:14.855 [conn13] query pumpio.comment query: { _uuid: "oudHqMliREq5HSXd-7N62Q" } ntoreturn:1000 ntoskip:0 nscanned:1 keyUpdates:0 locks(micros) r:3247907 nreturned:1 reslen:900 4951ms
Wed Mar 1 20:16:14.855 [conn13] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [127.0.0.1:58238]
This is right after I rebooted the MongoDB server and the application server. Both of these are on the same host and the application connects to localhost
to issue database queries.
Any idea what's going on?
0 Answers