
What are the optimal settings for Nginx to handle LOTS of requests at the same time?

My server runs Nginx and PHP 7.3 on Ubuntu 20.04 LTS. The application running on it is built with Laravel 7.

This is my current config:

location ~ \.php$ {
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    fastcgi_pass unix:/var/run/php/php7.3-fpm.sock;
    fastcgi_index index.php;
    fastcgi_buffer_size 14096k;
    fastcgi_buffers 512 14096k;
    fastcgi_busy_buffers_size 14096k;
    fastcgi_connect_timeout 60;
    fastcgi_send_timeout 300;
    fastcgi_read_timeout 300;
}

I found the fastcgi parameters via Google and tweaked the numbers to some high values.

The application does the following:

  • 1500+ users are online
  • they get a multiple-choice question pushed directly via Pusher
  • they all answer the question almost at once = 1 Ajax request to the server per answer
  • every time an answer is given, the results are fetched from the server for each user

All four steps happen within a couple of seconds.

The server does not peak in CPU or memory while this is going on; the only thing that happens is that some users get a 502 error.

It looks like an Nginx configuration issue on the server.

These are the server stats at the moment it happened:

  • System: 25%, CPU: 22%, Disk IO: 0% - available 8 processor cores
  • RAM: 1.79GB - available 3GB

As a side note, I disabled the VerifyCsrfToken middleware in Laravel for the routes being called, to prevent extra server load.

What am I missing? Do I also have to change some PHP-FPM settings? If so, which ones, and where do I change them?

This is what the Nginx error log for the domain tells me:

2020/04/25 13:58:14 [error] 7210#7210: *21537 connect() to unix:/var/run/php/php7.3-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 54.221.15.18, server: website.url, request: "GET /loader HTTP/1.1", upstream: "fastcgi://unix:/var/run/php/php7.3-fpm.sock:", host: "website.url"

Settings in www.conf:

pm.max_children = 100
pm.start_servers = 25
pm.min_spare_servers = 25
pm.max_spare_servers = 50
pm.max_requests = 9000
;pm.process_idle_timeout = 10s;
;pm.status_path = /status
user1469734
  • "What am I missing?" To test how much memory is in use for exactly single request. And to divide whole server available RAM with that number. That way you will get exact theoretic amount of requests at once. – Tpojka Apr 24 '20 at 11:30
  • added the information i found out about this – user1469734 Apr 24 '20 at 11:37
  • Thing is that one request-response cycle in Laravel takes amount of available RAM. On your machine it is 3 GB but you need to consider RAM used for other parts of system like for OS functioning, other processes and applications mandatory for server running. So let's assume you have 2 GB of RAM available for your laravel application. Maybe it is more, but let's take 2 GB in calculation. Even empty laravel application takes ~10 MB of RAM for each req/res cycle. It would be 200 parallel connection/processes. – Tpojka Apr 24 '20 at 13:24
  • 1
    Let's say one cycle finishes in 400 ms that says we have 2 and a half requests per second. 200x2.5 = 500 requests per seconds available (theoretically, if all is ok). But, if in some peak time of the day you have more than 500 requests that are staggered this sterile way (exactly maximum 200 requests started in same moment) it is why it breaks. RAM on machine is not infinite. – Tpojka Apr 24 '20 at 13:29
  • And when out of ram, a 502 timeout or Nginx occurs? So upgrading ram solves it? @Tpojka – user1469734 Apr 25 '20 at 06:23
  • Can't tell for sure without deeper analysis. Best would be to check how much of RAM is used in each of those four steps. But I rather wanted to tell you that RAM is limited resource. Test, get results, optimize, test, get results, upgrade server, test, get results. That's what would I do in your case. – Tpojka Apr 25 '20 at 09:09
  • Tips how to test this? – user1469734 Apr 25 '20 at 13:05
  • Google for "how to test how much RAM is used by php request" and check first few links. There should also be the links to PHP functions that can be used for that. – Tpojka Apr 25 '20 at 13:10
  • used loader.io and found that php-fpm socked fails.. – user1469734 Apr 25 '20 at 14:07
  • I haven't used it. I wouldn't trust much third party services since those don't have access to machine processes. best you can do is to test it yourself. At least I'd do that if I were you. – Tpojka Apr 25 '20 at 14:34
  • What does your php error log say? – miknik Apr 28 '20 at 17:53
  • @user1469734 can you share your `/etc/php/7.3/fpm/pool.d/www.conf`? especially `pm.*` values. If you didn't change the config file, the small default values are probably the culprit. – Razor Apr 29 '20 at 07:37
  • @Razor added it to the original post – user1469734 Apr 29 '20 at 09:41
  • @user1469734 Run `ps -ylC php-fpm7.3 --sort:rss` to get an approximate idea of how much each child process consumes (RSS column). Assuming an average Laravel app will take 40MB and 1GB is used by other resources: `pm.max_children = (3GB - 1GB) / 40MB = 50` (adapt the other directives accordingly; see the sketch after these comments). You should also reduce `pm.max_requests` to mitigate memory leaks, set it to 500. You said your memory usage is not peaking, can you confirm the swap is not used? Now I'm just spitballing: set `worker_processes auto;` in `nginx.conf`, and check that `listen = /var/run/php/php7.3-fpm.sock` in the `www.conf` file – Razor Apr 29 '20 at 22:21
  • @Razor RSS is approx 100844, what does that mean? Where exactly do I place ```worker_processes``` in nginx.conf? Inside ```server{}``` isn't working – user1469734 Apr 30 '20 at 07:18
  • @user1469734 It means each child process is using about 100MB of RAM, which is quite a bit higher than average. If your app isn't too complex, you're probably loading a lot of data into RAM, most likely filtering database results in PHP rather than in SQL. In any case, enable xdebug and start profiling (KCachegrind, blackfire...). `worker_processes` should be set outside any block directive - e.g. see https://www.nginx.com/resources/wiki/start/topics/examples/full/ – Razor Apr 30 '20 at 07:52
  • Could you [increase max open file limits](https://gist.github.com/luckydev/b2a6ebe793aeacf50ff15331fb3b519d), restart the nginx service, restart the php-fpm service, and tell us what shows up in the _nginx error logs of the domain_ and the system load? – ExploitFate May 02 '20 at 21:42
  • @ExploitFate that didn't help either – user1469734 May 03 '20 at 20:28
  • Hi, did you solve your problem? I am also facing the same issue. Could you please help me, what did you change? – Ravi Prakash Yadav Aug 16 '20 at 05:10
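
Taking Tpojka's suggestion to measure per-request memory together with Razor's sizing formula, here is a rough, untested sketch. The numbers simply plug the ~100 MB RSS reported above into that formula on a 3 GB machine; they are illustrations, not recommendations.

// e.g. log peak memory at the end of a representative request
// (in a terminating middleware, or at the bottom of public/index.php)
error_log(sprintf('peak memory: %.1f MB', memory_get_peak_usage(true) / 1048576));

; /etc/php/7.3/fpm/pool.d/www.conf -- illustrative values only:
; (3 GB total - ~1 GB for the OS and other services) / ~100 MB per child ≈ 20
pm = dynamic
pm.max_children = 20
pm.start_servers = 8
pm.min_spare_servers = 4
pm.max_spare_servers = 8
pm.max_requests = 500

# /etc/nginx/nginx.conf -- top level, outside any http{} or server{} block
worker_processes auto;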

2 Answers


(11: Resource temporarily unavailable)

That's EAGAIN/EWOULDBLOCK: it means nginx did accept the client connection, but it could not connect to PHP-FPM's UNIX socket without blocking (waiting). Most likely (without digging into nginx's source code) nginx tried connecting to that socket several times, failed each time, and gave up, which is what your clients then see as a 502.

There are a few ways to solve this (rough config sketches for both options follow the list):

  1. increase the listen.backlog value in your PHP-FPM pool config, along with the corresponding net.ipv4.tcp_max_syn_backlog, net.ipv6.tcp_max_syn_backlog, and net.core.netdev_max_backlog values in sysctl.
  2. create multiple PHP-FPM pools, then use an nginx upstream block to spread requests across those pools.
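
A rough sketch of both options, with placeholder values to adapt (the second pool's name and socket path below are made up for illustration):

; option 1 -- /etc/php/7.3/fpm/pool.d/www.conf: raise the pool's accept queue
listen.backlog = 1024

# option 1 -- sysctl (e.g. /etc/sysctl.d/90-backlog.conf, applied with `sysctl --system`);
# for a UNIX socket the effective listen backlog is also capped by net.core.somaxconn
net.core.somaxconn = 1024
net.ipv4.tcp_max_syn_backlog = 1024

; option 2 -- /etc/php/7.3/fpm/pool.d/www2.conf: a hypothetical second pool,
; essentially a copy of www.conf with its own name and socket
[www2]
user = www-data
group = www-data
listen = /var/run/php/php7.3-fpm-www2.sock
listen.owner = www-data
listen.group = www-data
pm = dynamic
pm.max_children = 20
pm.start_servers = 8
pm.min_spare_servers = 4
pm.max_spare_servers = 8

# option 2 -- nginx: an upstream over both sockets (the upstream block goes in the
# http{} context, the location stays inside your server{} block)
upstream php_fpm {
    server unix:/var/run/php/php7.3-fpm.sock;
    server unix:/var/run/php/php7.3-fpm-www2.sock;
}

location ~ \.php$ {
    # keep your existing fastcgi_* directives here
    fastcgi_pass php_fpm;
}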
mforsetti

Edit /etc/security/limits.conf, enter:

# vi /etc/security/limits.conf

Set soft and hard limits, either for all users or just for the nginx user, as follows:

nginx       soft    nofile   10000
nginx       hard    nofile   30000
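
If that alone changes nothing, note that on Ubuntu 20.04 nginx is usually started by systemd, and PAM limits from /etc/security/limits.conf generally do not apply to systemd services. A hedged sketch of how to verify, and of two alternatives (the 30000 below just mirrors the hard limit above):

# check which limit the running nginx master actually got
# (assumes the default pid file /run/nginx.pid)
grep 'open files' /proc/$(cat /run/nginx.pid)/limits

# nginx can raise the limit for its own workers (nginx.conf, top level / main context):
worker_rlimit_nofile 30000;

# or raise it for the whole service via systemd: `sudo systemctl edit nginx`, then add:
# [Service]
# LimitNOFILE=30000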
Omer YILMAZ
  • Not helping. Still the same error in the Nginx error log: ```2020/05/03 20:18:08 [error] 3729#3729: *110606 connect() to unix:/var/run/php/php7.3-fpm.sock failed (11: Resource temporarily unavailable) while connecting to upstream, client: 300.24.116.18, server: website.url, request: "GET /loader HTTP/1.1", upstream: "fastcgi://unix:/var/run/php/php7.3-fpm.sock:", host: "website.url"``` – user1469734 May 03 '20 at 20:19