can't get xhtml – olu mide Jan 16 '20 at 13:05

  • If `//script` works then the issue is with your original XPath query and not your XML parser. – Botje Jan 16 '20 at 13:08
  • The XPath works, but I couldn't get the content out of libxml++; I found no class method that gives me the content, which should be the text – olu mide Jan 16 '20 at 13:10
  • 1 Answer

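    I tried reproducing your issue locally and could not get `root->find(xpath)` to produce any nodes. According to this issue, you need to tell XPath which namespace your nodes are under, even if it is the default namespace. In XPath 1.0 an unprefixed name such as `html` only matches elements that are in no namespace, so the query has to use a prefix bound to the document's namespace.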

    I changed the XPath string and find invocation as follows:

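    // Bind the prefix "x" to the root's namespace so the XPath steps can match.
    std::string xpath("/x:html/x:body/x:div/x:div/x:div[2]/x:script");
    xmlpp::Node::PrefixNsMap nsMap = {{"x", root->get_namespace_uri()}};
    xmlpp::Node::NodeSet elemns = root->find(xpath, nsMap);
    
    // Guard against an empty result before indexing into the node set.
    if (!elemns.empty()) {
        const auto nodeText = dynamic_cast<const xmlpp::Element*>(elemns[0]);
        // get_first_child_text() returns nullptr if the element has no text child.
        const auto text = nodeText ? nodeText->get_first_child_text() : nullptr;
        if (text) {
            std::cout << text->get_content() << std::endl;
        }
    }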
    
    Botje
    • Comparing your answer with the implementation I had settled on, which I took from one of the unit tests, this is the right answer. Thanks so much; yours is much lighter and more direct than the copy-paste I was trying to use. Thanks once again – olu mide Jan 16 '20 at 13:53
    • The namespace does not change dynamically, so it would be better to just use `"http://www.w3.org/1999/xhtml"` – Alejandro Jan 16 '20 at 22:56
    • In my usage, I didn't use the namespace. Possibly html-tidy added a namespace somewhere while cleaning the HTML and converting it to XML. – olu mide Jan 17 '20 at 09:12