I have to parse NSData containing an XML string. Does anybody know a simple category that does this? I have one for JSON, but I'm forced to use XML. I tried XMLReader; its interface looks clean, but I found some issues:

  1. Mysterious newline characters and spaces everywhere:

    "comment_count" = {text = "\n              \n              21";};
    
  2. My Cyrillic characters look like this:

    "description_text" = {text = "\n              \U041f\U0438\U043a\U0430\U0431\U0443\U0448};
    

Example:

    <?xml version="1.0" encoding="UTF-8" ?>
    <news>
        <xml_count>43</xml_count>
        <hot_count>449</hot_count>
        <item type="text">
            <id>1469845</id>
            <rating>147</rating>
            <pluses>171</pluses>
            <minuses>24</minuses>
            <title>
                <![CDATA[Обновление огромного архива Пикабу!]]>
            </title>
            <comment_count>26</comment_count>
            <comment_link>http://pikabu.ru/story/obnovlenie_ogromnogo_arkhiva_pikabu_1469845</comment_link>
            <author>icq677555</author>
            <description_text>
                <![CDATA[Пикабушники, я обновил свой огромный архив текстовых постов из горячего!]]>
            </description_text>
        </item>
    </news>

Timur Bernikovich

1 Answer

I just realized what's going on. Your data samples are obviously NSDictionary instances printed in the debugger. So the issues you found are explained as follows:

  1. XML was originally designed as an annotated text format, so its whitespace handling (spaces, newlines) is not a perfect fit for data-only usage. You can either trim all resulting strings ([stringVar stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]), adapt XMLReader to do it, or use the XML parser at http://ios.biomsoft.com/2011/09/11/simple-xml-to-nsdictionary-converter/ (which does it by default).

  2. The funny output you get for Cyrillic characters is the proper escaping for non-ASCII characters in the debugger output (which uses the old-style property list format). It's an artifact of the debugger output. Your variables contain the proper characters.
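
To make both points concrete, here is a minimal sketch. The helper names TrimmedText and LogEscapingDemo are made up for illustration and are not part of XMLReader, and the sample strings are only modeled on the output in the question.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of XMLReader): strips leading and trailing
// whitespace and newlines from a parsed text value.
static NSString *TrimmedText(NSString *raw) {
    return [raw stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

// Hypothetical demo: logs a dictionary and then the string stored inside it.
static void LogEscapingDemo(void) {
    NSDictionary *node = @{ @"text" : TrimmedText(@"\n              Пикабу") };

    // Logging the whole dictionary goes through its description, which uses
    // the old-style property list format and escapes non-ASCII characters.
    NSLog(@"%@", node);

    // Logging the string itself shows the Cyrillic text unchanged.
    NSLog(@"%@", node[@"text"]);
}
```

Calling LogEscapingDemo() should show escaped text in the first log line and readable Cyrillic in the second, which is why the stored value is fine even though the dictionary dump looks wrong.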

BTW: While JSON contains implicit type information (strings are always quoted, numbers are never quoted etc.), XML without a schema file does not. So all the parsed simple values will be strings even if they originally were numbers.
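
If you need real numbers, you have to convert the strings yourself. A minimal sketch, assuming the value is a plain decimal integer such as comment_count; the function name IntegerFromXMLText is made up for illustration:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: converts a parsed XML text value to an integer.
// Note that -integerValue returns 0 when the string does not start with a
// valid decimal number, so a result of 0 can also mean "not a number".
static NSInteger IntegerFromXMLText(NSString *text) {
    NSString *trimmed = [text stringByTrimmingCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    return [trimmed integerValue];
}
```

For example, IntegerFromXMLText(@"\n              21") yields 21. If you must tell a real 0 apart from invalid input, NSScanner or NSNumberFormatter would be a better fit.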

Update:

The XML parser you're using still contains the old whitespace handling code described in "Pesky new lines and whitespace in XML reader class" (though the code comment says otherwise). Apply the fix mentioned at the bottom of that answer, namely change the line:

    [dictInProgress setObject:textInProgress forKey:kXMLReaderTextNodeKey];

to:

    [dictInProgress setObject:[textInProgress stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] forKey:kXMLReaderTextNodeKey];

Codo
  • Hah, I used this class. So 1: as you can see, the whitespace is still there; 2: it's not only in the debugger. – Timur Bernikovich Aug 12 '13 at 18:32
  • Where do you get invalid Cyrillic characters (except in the debugger)? And please show the code how you call the XML reader. – Codo Aug 12 '13 at 18:39
  • I'm sorry, I was setting the text of a UITextView from the NSDictionary itself (via its - (NSString *)description method). When I get the NSString out of the NSDictionary before setting it on the UITextView, it looks good. But there is still trouble with whitespace. – Timur Bernikovich Aug 12 '13 at 18:46