I made an application that uses the Nim asynchttpserver library, but sometimes, when there are a lot of requests, the server dies with the following error: Exception message: Too many open files.
I did a test with wrk: when the requests are made directly to the server, without nginx as a reverse proxy in between, the server doesn't crash. But when I use the reverse proxy, the server goes down!
I opened issue #18161 on GitHub, but so far I haven't had any comments on how to fix it.
Does anyone have an idea how to fix this issue?
I get it! Nginx is itself async and creates file descriptors for its own async calls.
Using asynchttpserver behind nginx therefore results in lots of open file descriptors, and the server crashes...
I made some changes to the nginx configuration. With keepalive_timeout set to 0, or to any other value, the server still dies. The same is true for the keepalive directive in the upstream block.
upstream test_backend {
    server localhost:9000;
    keepalive 15;
}

server {
    listen 8080;
    server_name test;

    location / {
        proxy_pass http://test_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        # keepalive_timeout 0;
    }
}
After it crashes, netstat shows a lot of lines like these:
...
tcp 0 0 localhost:opsmessaging localhost:36396 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:55988 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:51270 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:53180 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:57476 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:37794 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:33832 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:50984 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:40112 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:57562 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:45750 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:58576 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:33854 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:39440 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:48368 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:53416 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:33364 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:38736 TIME_WAIT
...
I did another test, this time with an app written in Go, and in that case the issue also happens, but the server doesn't die (see #18161).
Unlike the Nim app, though, the Go app detects when there are too many open files and backs off, retrying the accept after a short delay.
...
2021/06/05 11:16:57 http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 5ms
2021/06/05 11:16:57 http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 10ms
2021/06/05 11:16:57 http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 5ms
2021/06/05 11:16:57 http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 5ms
...
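Roughly, the same back-off idea could be sketched in Nim like this (just a sketch on my part, assuming the "too many open files" failure surfaces as an OSError from acceptRequest; I haven't verified that):

import asynchttpserver, asyncdispatch

proc main {.async.} =
  var server = newAsyncHttpServer()
  proc cb(req: Request) {.async.} =
    await req.respond(Http200, "Hello World")

  server.listen Port(9000)
  while true:
    try:
      await server.acceptRequest(cb)
    except OSError:
      # Assumption: EMFILE ("too many open files") ends up here as an OSError.
      # Back off for a few milliseconds and retry, like the Go server does.
      await sleepAsync(5)

asyncCheck main()
runForever()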
It would be really cool to implement something like that in Nim! It's not entirely clear to me, though, what the problem you raised is.
Why are there "Too many open files" errors with Nginx? Because when you use a proxy, you double the number of open sockets (files) required, so you hit the limits faster.
Why does the program crash? Because there is an unhandled exception :)
OK, maybe the correct answer is this: to solve this issue, which is typical for every high-load HTTP server, you have to raise the limits of your OS.
Check current limit: ulimit -n
Set new limit: ulimit -S -n <new limit value>.
This change is not persistent; to make it persistent, you have to set fs.file-max in /etc/sysctl.conf.
There are more ways to achieve this by fine-tuning the Nginx configuration (worker_rlimit_nofile, worker_connections, worker_processes), but that depends on your preferences and is more suitable for production servers. An easy fix for development is to increase the limit with the ulimit command mentioned above.
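To check that the raised limit is what the Nim process actually sees, a tiny sketch can print it from inside the server (assuming maxDescriptors and activeDescriptors come from asyncdispatch, as the examples below suggest):

import asyncdispatch

# Print the descriptor budget of the running process, to confirm
# that the raised ulimit really applies to the server process.
echo "descriptor limit: ", maxDescriptors()
echo "descriptors already in use: ", activeDescriptors()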
By modifying the example in this way:
import asynchttpserver, asyncdispatch

proc main {.async.} =
  var server = newAsyncHttpServer()
  proc cb(req: Request) {.async.} =
    let headers = {"Date": "Tue, 29 Apr 2014 23:40:08 GMT",
                   "Content-type": "text/plain; charset=utf-8"}
    echo($activeDescriptors(), "/", $maxDescriptors())
    await req.respond(Http200, "Hello World", headers.newHttpHeaders())

  server.listen Port(9000)
  while true:
    if server.shouldAcceptRequest(6):
      await server.acceptRequest(cb)
    else:
      poll()

asyncCheck main()
runForever()
I was able to prevent the server from crashing. It looks like the default value of shouldAcceptRequest's assumedDescriptorsPerRequest argument, which is 5, is incorrect (at least in our case). By setting it to 6, the server never crashes.
I do not know how to properly guess the value of assumedDescriptorsPerRequest.
Sorry for spamming.
I investigated the correct assumedDescriptorsPerRequest value a bit. It looks like a single file descriptor is needed to handle a single request (tested with a simple curl call). But this does not take into account the fact that the idle process already consumes some file descriptors.
For me (macOS) it was 9:
lsof -a -p 31299
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
server 31299 romka cwd DIR 1,4 128 22182574 /Users/romka/t/nim-server-error
server 31299 romka txt REG 1,4 527336 22190701 /Users/romka/t/nim-server-error/server
server 31299 romka txt REG 1,4 2547760 1152921500312767057 /usr/lib/dyld
server 31299 romka 0u CHR 16,0 0t5236958 3121 /dev/ttys000
server 31299 romka 1u CHR 16,0 0t5236958 3121 /dev/ttys000
server 31299 romka 2u CHR 16,0 0t5236958 3121 /dev/ttys000
server 31299 romka 3u IPv4 0x50c0746d84061b8d 0t0 TCP *:cslistener (LISTEN)
server 31299 romka 4u KQUEUE count=0, state=0xa
server 31299 romka 5u IPv4 0x50c0746d84061165 0t0 TCP *:* (CLOSED)
I'm not sure why 6 works for assumedDescriptorsPerRequest; possibly not all lines in the lsof output are real file descriptors. But a good strategy for setting assumedDescriptorsPerRequest looks like this: first investigate what the idle process needs, then set a safe value on top of that. The value will differ depending on dozens of factors, e.g. connections to a DB, logging to files, etc.
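If that strategy is right, a rough sketch of it could look like the following; the concrete numbers are assumptions for illustration (9 from the idle lsof output above, 1 per request from the curl test):

import asynchttpserver, asyncdispatch

const
  idleFds = 9        # what the idle process needs, per the lsof investigation above
  fdsPerRequest = 1  # one descriptor per accepted connection, per the curl test

proc main {.async.} =
  var server = newAsyncHttpServer()
  proc cb(req: Request) {.async.} =
    await req.respond(Http200, "Hello World")

  server.listen Port(9000)
  while true:
    # Only accept when there is headroom for the idle baseline plus one request.
    if server.shouldAcceptRequest(assumedDescriptorsPerRequest = idleFds + fdsPerRequest):
      await server.acceptRequest(cb)
    else:
      poll()

asyncCheck main()
runForever()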
Thank you @ZadaZorg for investigating the correct value of the assumedDescriptorsPerRequest argument. I have tested values from -1 to 10 and the result is always the same: assumedDescriptorsPerRequest doesn't work :-( My OS is Arch Linux; I don't know if that matters.
import asynchttpserver, asyncdispatch

proc main {.async.} =
  var server = newAsyncHttpServer()
  proc cb(req: Request) {.async.} =
    let headers = {"Date": "Tue, 29 Apr 2014 23:40:08 GMT",
                   "Content-type": "text/plain; charset=utf-8"}
    await req.respond(Http200, "Hello World", headers.newHttpHeaders())

  server.listen Port(9000)
  while true:
    echo "Max Descriptors: ", maxDescriptors()
    echo "Active Descriptors: ", activeDescriptors()
    if server.shouldAcceptRequest(assumedDescriptorsPerRequest = 1):
      echo "Accept Request: ", activeDescriptors()
      await server.acceptRequest(cb)
    else:
      poll()

asyncCheck main()
runForever()
Result:
...
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2