Description
Hello!
I use a large number of independent connections, each pushing ~10 lines of data to InfluxDB. At my data rate, I have about 2500 connections in TIME_WAIT state at any given time.
This is an issue for me because our systems have a pretty small amount of resources, so I started looking for options to reduce the number of TIME_WAIT connections.
I started with the "Connection: close" HTTP option to terminate connections immediately, because I never re-use them.
I wrote my own implementation (pretty complicated code, based on https://github.com/vinniefalco/Beast/blob/master/example/http-crawl/http_crawl.cpp) and ran into an issue on the InfluxDB side.
That code is too complicated to isolate, so I moved to a simple example with curl.
Curl example:
curl -d "networks_traffic,network=185.86.148.0/22 bits_incoming=0,bits_outgoing=0,packets_incoming=0,packets_outgoing=0 1498413453000000000" -v "http://127.0.0.1:8086/write?db=fastnetmon" -sL -H 'Connection: close'
curl debug below:
* Hostname was NOT found in DNS cache
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8086 (#0)
> POST /write?db=fastnetmon HTTP/1.1
> User-Agent: curl/7.35.0
> Host: 127.0.0.1:8086
> Accept: */*
> Connection: close
> Content-Length: 130
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 130 out of 130 bytes
< HTTP/1.1 204 No Content
< Content-Type: application/json
< Request-Id: cfd9829a-59da-11e7-9a97-000000000000
< X-Influxdb-Version: 1.2.4
< Date: Sun, 25 Jun 2017 19:16:38 GMT
< Connection: close
<
* Closing connection 0
According to this output, the connection was closed immediately on the client side (cURL).
Then I captured the traffic with tcpdump:
20:16:38.296716 IP localhost.44252 > localhost.8086: Flags [S], seq 1370079652, win 43690, options [mss 65495,sackOK,TS val 1326659 ecr 0,nop,wscale 9], length 0
20:16:38.296739 IP localhost.8086 > localhost.44252: Flags [S.], seq 2874189319, ack 1370079653, win 43690, options [mss 65495,sackOK,TS val 1326659 ecr 1326659,nop,wscale 9], length 0
20:16:38.296758 IP localhost.44252 > localhost.8086: Flags [.], ack 1, win 86, options [nop,nop,TS val 1326659 ecr 1326659], length 0
20:16:38.296904 IP localhost.44252 > localhost.8086: Flags [P.], seq 1:318, ack 1, win 86, options [nop,nop,TS val 1326659 ecr 1326659], length 317
20:16:38.296915 IP localhost.8086 > localhost.44252: Flags [.], ack 318, win 88, options [nop,nop,TS val 1326659 ecr 1326659], length 0
20:16:38.301835 IP localhost.8086 > localhost.44252: Flags [P.], seq 1:193, ack 318, win 88, options [nop,nop,TS val 1326660 ecr 1326659], length 192
20:16:38.301849 IP localhost.44252 > localhost.8086: Flags [.], ack 193, win 88, options [nop,nop,TS val 1326660 ecr 1326660], length 0
20:16:38.301889 IP localhost.8086 > localhost.44252: Flags [F.], seq 193, ack 318, win 88, options [nop,nop,TS val 1326660 ecr 1326660], length 0
20:16:38.302299 IP localhost.44252 > localhost.8086: Flags [F.], seq 318, ack 194, win 88, options [nop,nop,TS val 1326661 ecr 1326660], length 0
20:16:38.302318 IP localhost.8086 > localhost.44252: Flags [.], ack 319, win 88, options [nop,nop,TS val 1326661 ecr 1326661], length 0
Well, as I understand it, both sides sent FIN packets to close the conversation between them.
But a few seconds after these log messages, I still see the connection in TIME_WAIT on my system:
sudo netstat -apnt|grep 44252
tcp 0 0 127.0.0.1:8086 127.0.0.1:44252 TIME_WAIT -
And that is incorrect behaviour. If we closed the connection from both sides, it should disappear from netstat immediately.
Could you confirm that InfluxDB properly supports "Connection: close" from the client side?
Thank you for your response!
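For reference, here is roughly the shape of what my sender does, as a minimal Go sketch (my actual client is the Beast-based C++ code linked above; the loop and values here are illustrative only):

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// One line-protocol point per request body (illustrative).
	line := "networks_traffic,network=185.86.148.0/22 bits_incoming=0,bits_outgoing=0,packets_incoming=0,packets_outgoing=0 1498413453000000000"
	for i := 0; i < 10; i++ {
		req, err := http.NewRequest("POST", "http://127.0.0.1:8086/write?db=fastnetmon", strings.NewReader(line))
		if err != nil {
			panic(err)
		}
		// Equivalent to curl's -H 'Connection: close': ask for the
		// connection to be torn down right after the response.
		req.Close = true
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			fmt.Println("write failed:", err)
			continue
		}
		resp.Body.Close()
		fmt.Println("status:", resp.StatusCode) // expect 204 No Content
	}
}
```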
Activity
pavel-odintsov commented on Jun 25, 2017
Also, I tried HTTP 1.0, which has no pipelining at all and should not wait for subsequent requests:
tcpdump:
Still have TIME_WAIT:
lwhile commented on Jun 26, 2017
I have the same problem. My client queries data from InfluxDB every 2s, and I found InfluxDB taking up 6k+ TCP connections in TIME_WAIT status.
jsternberg commented on Jun 29, 2017
I don't think there's anything we can do about this from the looks of it: https://serverfault.com/questions/478691/avoid-time-wait-connections
For writes, you should try to pool connections or use something like UDP. But, UDP won't give you a guarantee that your data actually reaches the server. For queries (which automatically close the connection each time), we likely need to start using HTTP/2.
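For the write path, pooling might look like the following minimal Go sketch (the endpoint and pool sizes are illustrative assumptions, not InfluxDB APIs): one shared http.Client whose transport keeps idle connections alive, so repeated writes reuse sockets instead of opening a new TCP connection per batch.

```go
package main

import (
	"io"
	"net/http"
	"strings"
	"time"
)

// One shared client for all writes: its transport pools keep-alive
// connections, so each batch does not open (and later tear down,
// through TIME_WAIT) a fresh TCP connection.
var client = &http.Client{
	Timeout: 5 * time.Second,
	Transport: &http.Transport{
		MaxIdleConns:        10, // pool sizes are illustrative
		MaxIdleConnsPerHost: 10,
		IdleConnTimeout:     90 * time.Second,
	},
}

func writeBatch(lines string) error {
	resp, err := client.Post(
		"http://127.0.0.1:8086/write?db=fastnetmon",
		"application/x-www-form-urlencoded",
		strings.NewReader(lines),
	)
	if err != nil {
		return err
	}
	// Draining and closing the body is what returns the connection
	// to the pool for reuse.
	io.Copy(io.Discard, resp.Body)
	return resp.Body.Close()
}

func main() {
	if err := writeBatch("networks_traffic,network=185.86.148.0/22 bits_incoming=0 1498413453000000000"); err != nil {
		panic(err)
	}
}
```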
pavel-odintsov commented on Jun 30, 2017
Yes, I can't use UDP because I send pretty big batches of data :(
lwhile commented on Jul 3, 2017
@jsternberg
Does InfluxDB have a method to reuse connections now? In other words, I want InfluxDB to not add "close" to the HTTP Connection header. My server is now opening too many connections, which sit in TIME_WAIT status.
https://github.com/influxdata/influxdb/blob/master/services/httpd/handler.go#L429
jsternberg commented on Jul 3, 2017
For /write, yes. The same way you could with any other HTTP connection. For /query, no. Unfortunately, the portion of code that detects if a connection has been abandoned by the client also makes it impossible to reuse the connection.
lwhile commented on Jul 4, 2017
@jsternberg
But InfluxDB always adds the connection token "close" to the header when a query finishes. This causes every query connection to be abandoned even though I did not close it on the client side.
https://github.com/influxdata/influxdb/blob/master/services/httpd/handler.go#L429
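For illustration only (the actual InfluxDB code is behind the handler.go link above), this is the shape of the behavior being described in Go's net/http: once a handler sets a "Connection: close" response header, the server closes the TCP connection after writing the response, so the client cannot reuse it regardless of its own request headers.

```go
package main

import "net/http"

// Sketch of a handler that forces the connection closed after each
// response; the handler name and body are placeholders, not InfluxDB's.
func handleQuery(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Connection", "close")
	w.Write([]byte("{}")) // placeholder for real query results
}

func main() {
	http.HandleFunc("/query", handleQuery)
	http.ListenAndServe(":8086", nil)
}
```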
jsternberg commented on Jul 4, 2017
Yes. I'm looking into this to see if anything has changed since I originally added that code. There's a possibility we can remove the Connection: close being set on /query, but that's preliminary until I determine how CloseNotify() works when using pipelining. The docs state you're not supposed to, and the behavior is a bit undefined.
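For context, a minimal sketch (not InfluxDB's actual code) of the detection mechanism under discussion: Go's http.CloseNotifier lets a handler learn mid-request that the client has gone away, and its documentation warns against relying on it with pipelined requests, which is the undefined behavior mentioned above.

```go
package main

import "net/http"

func handleQuery(w http.ResponseWriter, r *http.Request) {
	done := make(chan struct{})
	defer close(done)

	if cn, ok := w.(http.CloseNotifier); ok {
		go func() {
			select {
			case <-cn.CloseNotify():
				// Client disconnected: a real handler would cancel
				// the in-flight query here.
			case <-done:
				// Handler finished normally.
			}
		}()
	}
	// ... execute the query and stream results to w ...
}

func main() {
	http.HandleFunc("/query", handleQuery)
	http.ListenAndServe(":8086", nil)
}
```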
GatewayJ commented on Jul 29, 2019
Hasn't this problem been solved yet? Or is there an alternative?