
I have a LAMP server on which I run a PHP script that makes a SELECT query on a table containing about 1 million rows.

Here is my script (PHP 8.2 and MariaDB 10.5.18):

$db = new PDO("mysql:host=$dbhost;dbname=$dbname", $dbuser, $dbpass);

// Fetch the whole table into one PHP array
$req = $db->prepare('SELECT * FROM '.$f_dataset);
$req->execute();
$fetch = $req->fetchAll(PDO::FETCH_ASSOC);
$req->closeCursor();

My problem is that each execution of this script consumes about 500 MB of RAM on my server, and this memory is not released when the script finishes. With only 2 GB of RAM, after 3 executions the Apache2 task gets killed, which forces me to restart the Apache server each time.

Is there a solution to this? Is there a piece of code that would free the memory used?

I tried using unset($fetch) and gc_collect_cycles(), but neither works, and I haven't found anyone who has had the same problem.
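For reference, here is the kind of row-by-row variant I could try instead of fetchAll() (a sketch, not yet tested on my setup). Disabling PDO's driver-side result buffering keeps the result set on the MySQL server, and fetch() hands back one row at a time, so peak PHP memory should stay around one row's worth instead of the whole table:

```php
<?php
// Sketch: stream rows instead of materializing them all in one array.
// $dbhost, $dbname, $dbuser, $dbpass, $f_dataset are defined as in the
// script above.
$db = new PDO("mysql:host=$dbhost;dbname=$dbname", $dbuser, $dbpass, [
    // Without buffering, rows are pulled from the server on demand.
    PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false,
]);

$req = $db->prepare('SELECT * FROM '.$f_dataset);
$req->execute();
while ($row = $req->fetch(PDO::FETCH_ASSOC)) {
    // process one candle at a time; nothing accumulates in memory
}
$req->closeCursor();
```

Note that with an unbuffered query the connection is busy until the cursor is closed, so other queries on the same connection have to wait.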


EDIT

After several of you, skeptical about my problem, posted responses asking for evidence and for additional information, here is what else I can tell you:

I am currently developing a trading-strategy testing tool whose parameters I set manually via an HTML form. The form is then processed by a PHP script that first performs calculations to reproduce technical indicators (using the Trader library for some of them, and reimplementing others) from the parameters returned by the form.

In a second step, after reproducing the technical indicators and storing their values in my database, the PHP script simulates a buy or sell order according to the values of the stock price I am interested in and the values of the technical indicators calculated just before.

To do this, my database contains, for example, two tables. The first stores the information of the 1-minute candles (opening price, closing price, max price, min price, volume, ...), one candle per row; the second stores the value of a technical indicator corresponding to a candle, i.e. to a row of the first table. The reason I need to make calculations, and therefore to fetch my 1 million candles, is that my table contains 1 million 1-minute candles on which I want to test my strategy. I could do this just as well with 500 candles or with 10 million.

My problem, for now, is only with the candle retrieval; there are not even any calculations yet. I shared my script above, which is very short, and there is absolutely nothing else in it except the definitions of my variables $dbname, $dbhost, etc. So look no further, you have absolutely everything here.

When I run this script from my browser and watch the RAM load during execution, I see an Apache process consume up to 697 MB of RAM. So far, nothing abnormal: the table I am retrieving candles from is a little over 100 MB. The real problem is that once the script has finished, the RAM load stays the same. If I run the script a second time, the RAM load is 1400 MB. And this continues until all the RAM is used up and my Apache server crashes.

So my question is simple: do you know a way to free this RAM after my script has finished?
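To check whether the memory is really held by PHP or only by the Apache process, here is a DB-free test along the same lines (a sketch: it just builds a large array of fake candle rows and frees it). If memory_get_usage() drops back after unset() while the process's resident size in the OS stays high, the memory has been released inside PHP but simply not handed back to the operating system, which is normal allocator behaviour rather than a leak:

```php
<?php
// DB-free reproduction sketch: allocate ~1 million small rows, free them,
// and compare PHP's own view of memory before and after.
$rows = [];
for ($i = 0; $i < 1000000; $i++) {
    $rows[] = ['open' => 1.0, 'close' => 2.0, 'high' => 3.0, 'low' => 0.5];
}
printf("allocated:   %.1f MB (peak %.1f MB)\n",
    memory_get_usage(true) / 1048576,
    memory_get_peak_usage(true) / 1048576);

unset($rows);
gc_collect_cycles();
printf("after unset: %.1f MB\n", memory_get_usage(true) / 1048576);
```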

geopra
  • Do you explicitly close the DB connection? Do you use connection pooling? If so - what happens if you turn it off? – obe Jan 28 '23 at 13:14
  • Yes, I also would like to know how you open and close the database connection. Any other code of interest in the script? You should have the whole script you use in your question, which doesn't need to be very long, given what it has to do. Which version of PHP and MySQL? – KIKO Software Jan 28 '23 at 13:21
  • I'm using PDO, and apart from the $req->closeCursor() I'm not doing anything because the rest of my script (not yet developed) will still need to connect to my database. Regarding the connection pool, it's already off by default in my php.ini file. – geopra Jan 28 '23 at 13:22
  • You could try a [gc_collect_cycles()](https://www.php.net/manual/en/function.gc-collect-cycles.php) at the end of your script. This will [free up any memory directly](https://stackoverflow.com/questions/19715048/what-gc-collect-cycles-function-is-useful-for), which might help. It will not help if the memory is used by MySQL itself. I have to say, if your script needs 500 MB, and you only have 2 GB there's a mismatch between what you want to do and the hardware you have. – KIKO Software Jan 28 '23 at 13:28
  • @KIKOSoftware I already tried the gc_collect_cycles(), but it does not change anything. It's a small VPS which is more than enough for the development of my project, but increasing the RAM of my server will not solve my problem, since my real concern is that at the end of the execution of my script, the memory is not freed, if it was, 1GB would be enough. – geopra Jan 28 '23 at 13:33
  • OK, I'm out of ideas. So, are you really requesting 500 MB of data? That's a huge amount. Do you really need that? What happens if you request just 1 MB many-many times? Same problem? – KIKO Software Jan 28 '23 at 13:39
  • In fact I am realizing that this is really strange. The table contains 1 million rows and is only 109MB. I just do a simple SELECT * on this table, and monitoring my memory usage, when the script is launched, the /usr/sbin/apache2 -k start task is started. The peak RAM usage is 1.18 GB and at the end of the execution, it goes down to 694 MB but the task still exists and the RAM used does not go down. If I run the script a second time, a new task /usr/sbin/apache2 -k start is created and the same thing happens. I end up with 2 processes of 694 MB without any of them being automatically killed. – geopra Jan 28 '23 at 13:46
  • Maybe a memory leak, but I don't know how that's possible when I look at my simple code... there's nothing difficult in it. – geopra Jan 28 '23 at 13:47
  • Your code is quite simple indeed, which is good. Anything else you're running or something special, that's not normally part of a LAMP server? I do think your MySQL version is old, which contrasts with your PHP version, but I doubt that this is the cause. You could try an upgrade. – KIKO Software Jan 28 '23 at 13:53
  • I misspoke, sorry, I got confused with one of my other servers... I use MariaDB version 10.5.18, and I have no other notable processes running on my server – geopra Jan 28 '23 at 14:01
  • Could you perhaps explain why you need to fetch over a million records in one go and then store them all in a single array? It's easier to come with alternatives when you know what the end goal is. – M. Eriksson Jan 28 '23 at 14:08
  • MariaDB 10.5 is not that old. – KIKO Software Jan 28 '23 at 14:10
  • @M.Eriksson There is no particular interest in fetching so much data at once; I need to do some statistical calculations on this data, but I will probably use a while ($row = $req->fetch()) loop to reduce the RAM load. The problem is that whatever the alternative, the memory used by my script is not freed at the end of its execution. And I will need to run this script about once a minute, so if the problem persists, there will always come a time when the memory limit is reached. – geopra Jan 28 '23 at 14:21
  • I’m wondering if you are running into a timeout issue. Maybe Apache is killing the browser HTTP connection but PHP is still running in the background and doesn’t know it isn’t talking to anyone anymore. I don’t know if `connection_aborted` would be helpful in this situation or not. How long have you waited to see if the memory has been released? – Chris Haas Jan 28 '23 at 15:18
  • @Chris Haas That's also what I thought, so I created a counter to simply count the lines fetched, and so if the counter gives me the right number of lines at the end it means that the script has been executed in full and has no reason to run afterwards. And I got the right number of lines. As for the waiting time before the release of the memory, I waited almost 2 hours at least – geopra Jan 28 '23 at 15:24
  • Hmmm, weird. Maybe try removing the database from the equation temporarily, and just make something that loops a million times and randomly generates similar data – Chris Haas Jan 28 '23 at 17:42
  • "I will need to run this script about once a minute" - If that's true, this should probably be a background task, probably a CLI thing, and Apache shouldn't be involved in the problem. – Chris Haas Jan 28 '23 at 18:54
  • Of course, except that there are two problems with your suggestion. The first is that despite the possibility of setting up a cron task, I also need to be able to run it manually from my browser. The second is that even if I go through the CLI, it will not solve the problem I encountered with Apache; the goal is not only to find an alternative, but also to understand why my script misbehaves when I see no objective reason preventing it from doing what it has to do correctly. – geopra Jan 28 '23 at 23:42
  • @geopra, I hope my comment didn't come across in a way I didn't intend, but if there was anything that was offending, I apologize. I understand the desire to figure out why it is doing that when it shouldn't, and that is very frustrating. I will stand by my CLI recommendation, however. It doesn't explain the Apache problem, it just removes it. Even if you have a traditional web page, switching to a task-based queue system feels more appropriate for this application. You can initiate the request in a browser, the background task runs, and the browser polls for completion. – Chris Haas Jan 30 '23 at 00:25

2 Answers


What you describe is improbable, and you don't say how you made these measurements. If your assertions are valid, there are a couple of ways to solve the memory issue; however, this is an XY problem. There is no good reason to read a million rows into a web-page script.

symcbean
  • I agree. I would still like to see [a complete code example](https://stackoverflow.com/help/minimal-reproducible-example), instead of just some snippets, and more information about the database table, so we can try it ourselves. However, I don't think these few lines of text constitute an answer to the question. – KIKO Software Jan 28 '23 at 18:29
  • No, you didn't. – symcbean Jan 28 '23 at 23:43
  • Oh yes, I did ;) – geopra Jan 28 '23 at 23:50

After several hours of research and discussion, it seems that this problem of unreleased memory has no direct solution. It is simply a limitation of Apache in my case: a server process is not able to free the memory it has used unless it is restarted.

I have, however, found a workaround in the Apache configuration: allowing at most one request per server process instead of the default.

This way, the process my script runs on is killed at the end of the run and is automatically replaced by a fresh one.
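Concretely, with the prefork MPM this amounts to a single directive (shown here as I applied it; the file location is the Debian/Ubuntu convention and may differ on other distributions):

```apache
# /etc/apache2/mods-available/mpm_prefork.conf
# Recycle each worker process after it has served one request, so the
# memory it accumulated is returned to the OS when the process exits.
MaxConnectionsPerChild 1
```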

geopra