1

I'm having some difficulty outputting a string of HTML code through XML. I have presented an example below. On the server side I have some code written in PHP.

$htmlCode = "<div>...........................</div>";

header("Content-type: text/xml");

echo "<?xml version='1.0' encoding='ISO-8859-1'?>";
echo "<info>";
echo "<htmlCode>";
echo $htmlCode;
echo "</htmlCode>";
echo "</info>";

The problem is that the HTML string ($htmlCode above) contains tags, so it gets parsed as XML markup. I want the output to be treated as a plain string instead.

On the client side I have an AJAX call to retrieve the string of HTML code.

document.getElementById('someID').innerHTML = xmlhttp.responseXML.getElementsByTagName("htmlCode")[0].childNodes[0].nodeValue; // I get nothing because the string is treated as XML code.

How do I solve this problem? I hope I have been specific enough for you to understand my problem.

einstein

3 Answers

5

You are looking for CDATA.

The term CDATA refers to text data that the XML parser should not parse as markup.

Everything inside a CDATA section is treated as literal character data, so tags in it are not interpreted.

A CDATA section starts with <![CDATA[ and ends with ]]>:

// Escape any "]]>" inside the data: end the CDATA section after "]]"
// and open a new one before ">"
$htmlCode = str_replace("]]>", "]]]]><![CDATA[>", $htmlCode);

echo "<?xml version='1.0' encoding='ISO-8859-1'?>";
echo "<info>";
echo "<htmlCode>";
echo "<![CDATA[".$htmlCode."]]>";
echo "</htmlCode>";
echo "</info>";

Added escaping fix from here
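
On the client, note that when the payload contains `]]>` and is therefore split across several CDATA sections, it arrives as several adjacent CDATA nodes, so `childNodes[0].nodeValue` from the question would return only the first piece. Below is a minimal JavaScript sketch that concatenates every text and CDATA child instead. `collectText` is a hypothetical helper name, and the plain objects in the usage lines are stand-ins for real DOM nodes:

```javascript
// Node type constants as defined by the DOM specification.
const TEXT_NODE = 3;
const CDATA_SECTION_NODE = 4;

// Concatenate the values of all text and CDATA children of an element.
function collectText(element) {
  let result = "";
  for (let i = 0; i < element.childNodes.length; i++) {
    const child = element.childNodes[i];
    if (child.nodeType === TEXT_NODE || child.nodeType === CDATA_SECTION_NODE) {
      result += child.nodeValue;
    }
  }
  return result;
}

// Stand-in for <htmlCode><![CDATA[<p>a]]]]><![CDATA[>b</p>]]></htmlCode>
const htmlCodeElement = {
  childNodes: [
    { nodeType: CDATA_SECTION_NODE, nodeValue: "<p>a]]" },
    { nodeType: CDATA_SECTION_NODE, nodeValue: ">b</p>" },
  ],
};
console.log(collectText(htmlCodeElement)); // → <p>a]]>b</p>
```

In a browser, the element would come from xmlhttp.responseXML.getElementsByTagName("htmlCode")[0]; current browsers also expose the same concatenation through the textContent property.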

Pekka
  • This is not watertight. If the HTML data contains the sequence `]]>` (which is quite valid to have in HTML), it will end the CDATA section prematurely. – bobince Oct 05 '10 at 22:09
  • Hi Pekka! I used the first answer you gave me, and after some time it worked without ]]>. I don't know how I did it, but at least it worked. I was about to post a question about how to remove "]]>". – einstein Oct 05 '10 at 22:58
3

Don't build XML from strings. Just don't. There are ready-to-use libraries that do the right thing. PHP's DOM implementation is one of them.

$myHtmlString = "<div>Some HTML</div>";

$xml  = new DOMDocument('1.0', 'utf-8');

$info = $xml->createElement('info');
$xml->appendChild($info);

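// createElement() does not escape "&" in its value argument, so add the
// string as a text node, which is escaped properly when serialized
$htmlCode = $xml->createElement('htmlCode');
$htmlCode->appendChild($xml->createTextNode($myHtmlString));
$info->appendChild($htmlCode);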

echo $xml->saveXML();

This seems like the more complicated approach, but in fact this makes sure your XML is correct. In contrast, just throwing a few strings together will go wrong at some point.

Tomalak
  • Using the DOM classes is the best way to handle XML. However, I think it would be even better if the example loaded the string $myHtmlString as an XML fragment. – Davis Peixoto Oct 05 '10 at 23:50
1

Use htmlspecialchars(). Although this function is named after HTML, it is also quite acceptable for XML. (htmlentities() wouldn't be, but you almost never want to use that one anyway.)

<?php
    $htmlCode = "<div>...........................</div>";

    header("Content-type: text/xml");
    echo '<?xml version="1.0" encoding="ISO-8859-1"?>'; // really? sure?
?>
<info>
    <htmlCode><?php echo htmlspecialchars($htmlCode); ?></htmlCode>
</info>

Using a CDATASection is also OK, but given that you need to escape the sequence ]]> in that case, there's really not much advantage over XML-encoding.
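
To make the XML-encoding concrete: htmlspecialchars() replaces the characters that are special in markup with entity references. Below is a rough JavaScript equivalent of that idea, meant only as an illustration. `escapeXml` is a hypothetical name, and the exact set of characters PHP escapes depends on the flags passed and on the PHP version:

```javascript
// Replace the five characters that are special in XML with entity references.
// The ampersand is handled first so the entities produced by the later
// replacements are not escaped a second time.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(escapeXml('<div class="x">Tom & Jerry</div>'));
// → &lt;div class=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/div&gt;
```

The XML parser on the client reverses these replacements, so the element contains a single text node whose nodeValue is the original HTML string.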

bobince