I made an application that uses the Nim asynchttpserver library, but sometimes, when there are a lot of requests, the server dies with the following error: Exception message: Too many open files.
I did a test with wrk: when the requests are made directly to the server, without nginx as a reverse proxy in between, the server doesn't crash. But when I use the reverse proxy, the server goes down!
I opened issue #18161 on GitHub, but so far I haven't had any comments on how to fix it.
Does anyone have an idea how to fix this issue?
I get it! Nginx is itself async and creates file descriptors for its own async calls.
Using asynchttpserver behind nginx therefore results in lots of open file descriptors, and the server crashes...
I made some changes to the nginx configuration. With keepalive_timeout set to 0, or to any other value, the server still dies. The same is true for the keepalive directive in the upstream block.
upstream test_backend {
    server localhost:9000;
    keepalive 15;
}

server {
    listen 8080;
    server_name test;

    location / {
        proxy_pass http://test_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        # keepalive_timeout 0;
    }
}
After it crashes, netstat shows a lot of lines like these:
...
tcp 0 0 localhost:opsmessaging localhost:36396 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:55988 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:51270 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:53180 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:57476 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:37794 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:33832 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:50984 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:40112 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:57562 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:45750 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:58576 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:33854 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:39440 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:48368 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:53416 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:33364 TIME_WAIT
tcp 0 0 localhost:opsmessaging localhost:38736 TIME_WAIT
...
I did another test, this time with an app written in Go, and in that case the issue also happens, but the server doesn't die (see #18161).
Unlike the Nim app, though, the Go app detects when there are too many open files and backs off, retrying the accept after a short delay.
...
2021/06/05 11:16:57 http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 5ms
2021/06/05 11:16:57 http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 10ms
2021/06/05 11:16:57 http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 5ms
2021/06/05 11:16:57 http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 5ms
...
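Roughly, the same back-off idea could be sketched in Nim like this (just a sketch on my part, assuming the "too many open files" failure surfaces as an OSError from acceptRequest; I haven't verified that):

import asynchttpserver, asyncdispatch

proc main {.async.} =
  var server = newAsyncHttpServer()
  proc cb(req: Request) {.async.} =
    await req.respond(Http200, "Hello World")

  server.listen Port(9000)
  while true:
    try:
      await server.acceptRequest(cb)
    except OSError:
      # Assumption: EMFILE ("too many open files") ends up here as an OSError.
      # Back off for a few milliseconds and retry, like the Go server does.
      await sleepAsync(5)

asyncCheck main()
runForever()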
It would be really cool to implement something like that in Nim! It's not entirely clear to me, though, what the problem you raised is.
Why are there "Too many open files" errors with Nginx? Because when you use a proxy, you double the number of open sockets (files) required, so you hit the limits faster.
Why does the program crash? Because there is an unhandled exception :)
OK, maybe the correct answer is this: to solve this issue, which is typical for every high-load HTTP server, you have to raise the limits of your OS.
Check current limit: ulimit -n
Set new limit: ulimit -S -n <new limit value>.
This change is not persistent; to make it persistent, you have to set fs.file-max in /etc/sysctl.conf.
There are more ways to achieve this by fine-tuning the Nginx configuration (worker_rlimit_nofile, worker_connections, worker_processes), but that depends on your preferences and is more suitable for production servers. An easy fix for development is to increase the limit with the ulimit command mentioned above.
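To check that the raised limit is what the Nim process actually sees, a tiny sketch can print it from inside the server (assuming maxDescriptors and activeDescriptors come from asyncdispatch, as the examples below suggest):

import asyncdispatch

# Print the descriptor budget of the running process, to confirm
# that the raised ulimit really applies to the server process.
echo "descriptor limit: ", maxDescriptors()
echo "descriptors already in use: ", activeDescriptors()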
By modifying the example in this way:
import asynchttpserver, asyncdispatch

proc main {.async.} =
  var server = newAsyncHttpServer()
  proc cb(req: Request) {.async.} =
    let headers = {"Date": "Tue, 29 Apr 2014 23:40:08 GMT",
                   "Content-type": "text/plain; charset=utf-8"}
    echo($activeDescriptors(), "/", $maxDescriptors())
    await req.respond(Http200, "Hello World", headers.newHttpHeaders())

  server.listen Port(9000)
  while true:
    if server.shouldAcceptRequest(6):
      await server.acceptRequest(cb)
    else:
      poll()

asyncCheck main()
runForever()
I was able to prevent the server from crashing. It looks like the default value of shouldAcceptRequest's assumedDescriptorsPerRequest argument, which is 5, is incorrect (at least in our case). By setting it to 6, the server never crashes.
I do not know how to properly guess the value of assumedDescriptorsPerRequest.
Sorry for spamming.
I investigated the correct assumedDescriptorsPerRequest value a bit. It looks like a single file descriptor is needed to handle a single request (tested with a simple curl call). But this does not take into account the fact that the idle process already consumes some file descriptors.
For me (macOS) it was 9:
lsof -a -p 31299
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
server 31299 romka cwd DIR 1,4 128 22182574 /Users/romka/t/nim-server-error
server 31299 romka txt REG 1,4 527336 22190701 /Users/romka/t/nim-server-error/server
server 31299 romka txt REG 1,4 2547760 1152921500312767057 /usr/lib/dyld
server 31299 romka 0u CHR 16,0 0t5236958 3121 /dev/ttys000
server 31299 romka 1u CHR 16,0 0t5236958 3121 /dev/ttys000
server 31299 romka 2u CHR 16,0 0t5236958 3121 /dev/ttys000
server 31299 romka 3u IPv4 0x50c0746d84061b8d 0t0 TCP *:cslistener (LISTEN)
server 31299 romka 4u KQUEUE count=0, state=0xa
server 31299 romka 5u IPv4 0x50c0746d84061165 0t0 TCP *:* (CLOSED)
I'm not sure why 6 works for assumedDescriptorsPerRequest; possibly not all lines in the lsof output are real file descriptors. But a good strategy for setting assumedDescriptorsPerRequest looks like this: first investigate what the idle process needs, then set a safe value on top of that. The value will differ depending on dozens of factors, e.g. connections to a DB, logging to files, etc.
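If that strategy is right, a rough sketch of it could look like the following; the concrete numbers are assumptions for illustration (9 from the idle lsof output above, 1 per request from the curl test):

import asynchttpserver, asyncdispatch

const
  idleFds = 9        # what the idle process needs, per the lsof investigation above
  fdsPerRequest = 1  # one descriptor per accepted connection, per the curl test

proc main {.async.} =
  var server = newAsyncHttpServer()
  proc cb(req: Request) {.async.} =
    await req.respond(Http200, "Hello World")

  server.listen Port(9000)
  while true:
    # Only accept when there is headroom for the idle baseline plus one request.
    if server.shouldAcceptRequest(assumedDescriptorsPerRequest = idleFds + fdsPerRequest):
      await server.acceptRequest(cb)
    else:
      poll()

asyncCheck main()
runForever()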
Thank you @ZadaZorg for investigating the correct value of the assumedDescriptorsPerRequest argument. I have tested values from -1 to 10 and the result is always the same: assumedDescriptorsPerRequest doesn't work :-( My OS is Arch Linux; I don't know if that matters.
import asynchttpserver, asyncdispatch

proc main {.async.} =
  var server = newAsyncHttpServer()
  proc cb(req: Request) {.async.} =
    let headers = {"Date": "Tue, 29 Apr 2014 23:40:08 GMT",
                   "Content-type": "text/plain; charset=utf-8"}
    await req.respond(Http200, "Hello World", headers.newHttpHeaders())

  server.listen Port(9000)
  while true:
    echo "Max Descriptors: ", maxDescriptors()
    echo "Active Descriptors: ", activeDescriptors()
    if server.shouldAcceptRequest(assumedDescriptorsPerRequest = 1):
      echo "Accept Request: ", activeDescriptors()
      await server.acceptRequest(cb)
    else:
      poll()

asyncCheck main()
runForever()
Result:
...
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2
Max Descriptors: 1023
Active Descriptors: 2
Accept Request: 2