
Hello, I want to get an HTML node from another website and show it on my own website, but I can't get it to work.

This is my code.

$html = htmlentities(file_get_contents("http://stackoverflow.com/"));
$doc = new DOMDocument();
$doc->loadHTML($html);
$h1 = $doc->getElementsByTagName("title");
var_dump($h1);

And this is the result.

object(DOMNodeList)#2 (1) {
  ["length"]=>
  int(0)
}

Please help. Thanks in advance.

Mohamed Kamel
  • You definitely shouldn't be calling `htmlentities`. `DOMDocument` expects you to load the original HTML, not the HTML converted to entities. – Barmar Mar 08 '17 at 22:33
  • Because this code `$doc = new DOMDocument(); $doc->loadHTML("http://stackoverflow.com/"); $h1 = $doc->getElementsByTagName("title")->item(0)->textContent; print_r($h1);` gives me Null. – Mohamed Kamel Mar 08 '17 at 22:38

1 Answer

There's no need to apply htmlentities to an HTML string before parsing it. If you do, all angle brackets are replaced with entities (&lt; and &gt;) and the parser can no longer find any tags.
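To illustrate the htmlentities point without depending on a network request, here is a minimal sketch that parses the same markup twice, once escaped and once raw. The inline $raw string is a hypothetical stand-in for a fetched page, not the real stackoverflow.com markup.

```php
<?php
// Hypothetical inline markup standing in for a downloaded page.
$raw = '<html><head><title>Example</title></head><body><p>Hi</p></body></html>';

// htmlentities turns "<" and ">" into "&lt;" and "&gt;", so no tags remain.
$escaped = htmlentities($raw);

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

// Parsing the escaped string: the parser only sees plain text.
$escapedDoc = new DOMDocument();
$escapedDoc->loadHTML($escaped);
$escapedCount = $escapedDoc->getElementsByTagName('title')->length;

// Parsing the raw string: the <title> element is found.
$rawDoc = new DOMDocument();
$rawDoc->loadHTML($raw);
$rawTitles = $rawDoc->getElementsByTagName('title');
$rawCount = $rawTitles->length;
$titleText = $rawTitles->item(0)->textContent;

libxml_clear_errors();

echo 'Escaped input, title nodes: ', $escapedCount, PHP_EOL; // 0
echo 'Raw input, title nodes: ', $rawCount, PHP_EOL;         // 1
echo 'Raw input, title text: ', $titleText, PHP_EOL;         // Example
```

The escaped run reproduces the empty DOMNodeList from the question: the length is 0 because the document contains only text.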

There's also no need to use file_get_contents to load the file, since DOMDocument has a method for that.

In your comment, you didn't use the right method: DOMDocument::loadHTML expects an HTML string, so when you pass it a URL it parses the URL text itself as markup instead of fetching the page.

The method that loads HTML from a file path or URL is DOMDocument::loadHTMLFile, not DOMDocument::loadHTML:

$doc = new DOMDocument();
$doc->loadHTMLFile("http://stackoverflow.com/");
$h1 = $doc->getElementsByTagName("title")->item(0)->textContent;
echo $h1, PHP_EOL;

Note that you can prevent the parser warnings (real-world pages are rarely valid HTML) from being displayed by calling libxml_use_internal_errors(true); before loadHTMLFile.
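As a rough sketch of that pattern, the snippet below enables internal error handling, parses some deliberately sloppy markup, and then reads back whatever the parser recorded. The inline $html string is a made-up example so the snippet runs offline. For a real page, the loadHTML call would be replaced by loadHTMLFile with the URL. How many messages are collected depends on the libxml version, so no specific count is assumed.

```php
<?php
// Made-up, slightly malformed markup used only for illustration.
$html = '<html><head><title>Demo</title></head>'
      . '<body><section><p>Unclosed <b>bold</section></body></html>';

// Keep parser warnings in an internal buffer; remember the previous setting.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html); // for a remote page: $doc->loadHTMLFile($url);

// Read the buffered messages, then empty the buffer and restore the setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Guard against a missing <title> instead of dereferencing item(0) blindly.
$titles = $doc->getElementsByTagName('title');
$title = $titles->length > 0 ? $titles->item(0)->textContent : null;

echo 'Title: ', $title, PHP_EOL;
echo 'Parser messages collected: ', count($errors), PHP_EOL;
```

Checking the length before calling item(0) avoids reading textContent on null when the page has no title element, which is what happened in the comment above.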

Casimir et Hippolyte
  • Thank you very much. Can you recommend any documentation or website where I can learn more about this topic? And what is the best approach in PHP for getting a web page's content? – Mohamed Kamel Mar 09 '17 at 06:29