
When I try to print the output of Simple HTML DOM, I get a 500 error. I tried the following methods, but the error was the same each time.

  • Method 1

    $html = file_get_html("http://www.google.com");

    print_r($html);

    After reading responses to other questions, I checked whether allow_url_fopen was enabled, and it was.

  • Method 2

    $html = file_get_contents("http://www.google.com");

    print_r($html);

    This works, but when I parse the result with the following code, I get the 500 error again.

    $object = new simple_html_dom();

    $object->load($html);

    var_dump($object);

  • Method 3

    Then, as a last resort, I tried fetching the page with curl and parsing the result. I printed the curl output to make sure curl was working, and it was. But when I loaded that output into Simple HTML DOM and printed the parsed object, I got the 500 error again.

[Sat Sep 08 21:26:19.456961 2018] [:error] [pid 703804] ModSecurity: Output filter: Response body too large (over limit of 404800001, total not specified).

I increased the limit almost 100 times, but I still get the same error.

Manmohan
Saad Bashir

1 Answer


The error message shows ModSecurity complaining that the response body is too large. This does not mean anything is wrong with loading the HTML through the Simple HTML DOM library; the problem is the size of the response your code generates (the print_r or var_dump parts). My guess is that the HTML you are loading needs a large number of nested objects to represent its DOM tree, so when you dump the full structure with print_r or var_dump the response becomes too large.
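To illustrate the size difference, here is a minimal, self-contained sketch. DemoNode is a hypothetical stand-in written for this example, not the real simple_html_dom class; it only mimics the idea of nodes that hold references to their parent and children. The script compares the number of bytes in the serialized markup with the number of bytes a print_r dump of the same tree would produce, without sending the dump to the browser.

```php
<?php
// Hypothetical stand-in for a DOM node (NOT the real simple_html_dom class).
class DemoNode
{
    public $tag;
    public $parent = null;
    public $children = [];

    public function __construct($tag)
    {
        $this->tag = $tag;
    }

    // Attach a child and return it so calls can be chained.
    public function add(DemoNode $child)
    {
        $child->parent = $this;
        $this->children[] = $child;
        return $child;
    }

    // Serialize the subtree back to markup; print() uses this string conversion.
    public function __toString()
    {
        return "<{$this->tag}>" . implode('', $this->children) . "</{$this->tag}>";
    }
}

// Build a small tree: one <div> holding 50 <p><span></span></p> pairs.
$root = new DemoNode('div');
for ($i = 0; $i < 50; $i++) {
    $root->add(new DemoNode('p'))->add(new DemoNode('span'));
}

// Size of the plain markup versus size of the structural dump.
$markupBytes = strlen((string) $root);
$dumpBytes = strlen(print_r($root, true)); // second argument returns the dump instead of printing it

printf("markup: %d bytes, print_r dump: %d bytes\n", $markupBytes, $dumpBytes);
```

Passing true as the second argument to print_r lets you measure a dump before deciding whether to output it. Real simple_html_dom nodes likely carry more properties and cross-references than this toy class, so the gap on an actual page can be expected to be wider.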

You can verify that the HTML is loaded and parsed by simply printing the plain HTML of the page (use print instead of print_r to print simple_html_dom object):

$html = file_get_html("http://www.google.com");

print($html);

You will see that the HTML is retrieved correctly, and you can then work with the $html object to manipulate the DOM just as you would with any simple_html_dom object.

If you want to raise ModSecurity's output limit so that you can generate larger responses, have a look at this question: Mod Security response/request body size?
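As a rough sketch of what that change might look like, assuming ModSecurity 2.x on Apache and that you are able to edit the ModSecurity configuration (the values are illustrative, not a recommendation):

```apache
# Illustrative values only; adjust for your own server.

# Buffer response bodies so ModSecurity can inspect them.
SecResponseBodyAccess On

# Maximum response body size, in bytes, that ModSecurity will buffer.
# 1073741824 bytes = 1 GB.
SecResponseBodyLimit 1073741824

# Optional alternative to rejecting oversized responses:
# inspect only the part that fits within the limit and let the rest through.
SecResponseBodyLimitAction ProcessPartial
```

Whether these directives can be overridden per site depends on how your host has set up ModSecurity, so on shared hosting you may need to ask the provider to change them.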

Nima
  • Thank you @Nima, using print brings up the output. I will now work on the second part of your answer about ModSecurity and see how that goes! :D – Saad Bashir Sep 11 '18 at 05:07
  • I tried increasing the body size by almost 100 times, and still no luck. – Saad Bashir Sep 11 '18 at 09:33
  • As the answer on the other question says, the default response size is 512K, so 100 times that might still not be enough. It also says there is a hard limit of 1GB. – Nima Sep 11 '18 at 19:40
  • My previous limit was 3.86M, which I increased to 386M, but the same error persisted. – Saad Bashir Sep 12 '18 at 10:18
  • To check if the cause of the problem is response limit, I suggest you go for the maximum possible limit, which is 1GB. If it worked, you can adjust it. But I'm curious, why do you need to `print_r` this object at all? – Nima Sep 12 '18 at 10:28