-1

I'm trying to parse XML into JSON using Java. JSON.parse is throwing this error on this character: &#xD;

JSON.parse: bad control character in string literal

I try to replace these characters before sending the string to JSON.parse, but this line of code is not working. Is there a better way to replace or remove these characters completely?

String trim = desc.replaceAll("&#xD;", "\\n");

XML to be parsed

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod 
    tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim 
    veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea 
    commodo consequat. Duis aute irure dolor in reprehenderit in voluptate 
    velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint 
    occaecat cupidatat non proident, sunt in culpa qui officia deserunt 
    mollit anim id est laborum.
ControlAltDel
PT_C
  • Java or JavaScript? – Phix Sep 27 '17 at 20:01
  • Clearly Java. There's no `String trim = ` in JavaScript – ControlAltDel Sep 27 '17 at 20:02
  • @Phix XML > Java > JavaScript – PT_C Sep 27 '17 at 20:03
  • &#xD; is html code to anticipate hex character, but the x in xD is not a valid hex digit! – ControlAltDel Sep 27 '17 at 20:05
  • The "XML to be parsed" is not XML; it is text that contains some XML entities. What do you really want to do? &#xD; is an entity that represents a carriage return. – DwB Sep 27 '17 at 20:26
  • Where did you actually get this text from? If you got it from an XML document, you should have used an XML API, like DocumentBuilder or SAXParser, to read the text content; then you wouldn’t need to do the replacement yourself. – VGR Sep 27 '17 at 21:14
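
To illustrate the last comment, here is a minimal sketch of reading the text through `DocumentBuilder` instead of handling the entity by hand. The class name `DescriptionReader`, the method `readText`, and the `<description>` wrapper element are assumptions made for this example; the question does not show the real document structure. The point it shows is that the parser decodes `&#xD;` into an actual carriage-return character, so a search for the literal text `&#xD;` would find nothing afterwards.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: returns the text content of the root element of an XML string.
final class DescriptionReader {

    static String readText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Character references such as &#xD; are already decoded at this point.
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed input shape; the element name is made up for the example.
        String xml = "<description>sed do eiusmod &#xD;\n    tempor incididunt</description>";
        String text = readText(xml);
        System.out.println(text.contains("&#xD;"));  // is the entity text still present?
        System.out.println(text.indexOf('\r') >= 0); // is there a real carriage return?
    }
}
```

The decoded string still contains a raw control character, so it has to be escaped before it is embedded in a JSON string literal.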

2 Answers

0

If the example you have shown is the complete input you have, then you are not parsing XML.

Assuming this is a fragment: your solution escapes only one thing, but to get valid JSON you should escape every character that is not allowed in a JSON string or that would lead to unwanted behaviour. So it would be a good idea to look for something that can properly escape JSON for you, like:

Java escape JSON String?
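
If pulling in a library is not an option, the idea can be sketched by hand. This is only an illustration, not the approach from the linked question: the class `JsonEscaper` and its `escape` method are hypothetical names, and the rules follow the JSON string grammar (escape the double quote, the backslash, and every control character below U+0020). In real code a JSON library would normally do this for you.

```java
// Hypothetical helper: escapes a raw string for use inside a JSON string literal.
final class JsonEscaper {

    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\b': out.append("\\b");  break;
                case '\f': out.append("\\f");  break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break; // the carriage return from &#xD;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters have no short form in JSON.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample text with a carriage return, a line feed and quotes.
        String desc = "sed do eiusmod \r\n    tempor \"incididunt\"";
        System.out.println("{\"description\":\"" + escape(desc) + "\"}");
    }
}
```

Unlike a single `replaceAll` call, this covers every character that would make JSON.parse report a bad control character, not just the carriage return.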

0

Figured it out:

  public static String cleanDescription(String desc){

        String trim = desc.replaceAll("<.*?>", ""); //removes html elements
        //there's a phantom question mark that sometimes gets added to the front and end of the string
        //(check for an empty string first so charAt cannot throw)
        if(!trim.isEmpty() && !Character.isLetter(trim.charAt(0))) trim = trim.substring(1);

        //count the non-alphanumeric characters among the last three (fewer if the string is shorter)
        int charCount = 0;
        for(int j = 1; j <= Math.min(3, trim.length()); j++){
            if(!Character.isLetterOrDigit(trim.charAt(trim.length() - j))) charCount++;
        }
        if(charCount >= 2) trim = trim.substring(0, trim.length() - (charCount - 1));

        //replace everything except letters, digits, parentheses, periods and commas with a space;
        //one replaceAll call does the same job as looping over the matches
        trim = trim.replaceAll("[^a-zA-Z0-9()\\.\\,]", " ");

        return trim.trim();
    }
PT_C