I want to generate a new XML file (new.xml) from an XML template (template.xml) using xml.etree.ElementTree. The idea is to change only the value of the <name> tag from 'all' to 'New', leaving the rest of new.xml exactly the same as template.xml. I can change the value of <name>, but new.xml does not look exactly like template.xml.

Here is the template.xml:

<?xml version="1.0"?>
<example>
  <version>15.0</version>
  <lastchange/>
  <theme>black</theme>
  <group>
    <name>all</name>
    <description><![CDATA[All Users]]></description>
    <scope>system</scope>
    <gid>1998</gid>
  </group>
</example>

and here is the new.xml:

<example>
  <version>15.0</version>
  <lastchange />
  <theme>black</theme>
  <group>
    <name>New</name>
    <description>All Users</description>
    <scope>system</scope>
    <gid>1998</gid>
  </group>
</example>

As you can see, in new.xml the first line (the XML declaration) is missing, and the value of the <description> tag is no longer wrapped in <![CDATA[...]]>. This is the script I wrote and am using:

import xml.etree.ElementTree as ET

def load_xml(name):
    ''' Takes an xml file as input. Outputs ElementTree and element'''
    tree = ET.parse(name)
    root = tree.getroot()
    return tree, root

if __name__ == "__main__":
     # Change and write the new xml
     tree, root = load_xml('template.xml')
     group = root.find('group')
     group.find('name').text = 'New'
     tree.write('new.xml')

Any help? Thank you

diegus
  • `ElementTree` in the Python standard library doesn't seem to support CDATA sections: [1](http://stackoverflow.com/questions/174890/how-to-output-cdata-using-elementtree), [2](http://stackoverflow.com/questions/9027081/lxml-etree-fromsting-and-tostring-are-not-returning-the-same-data). How about switching to `lxml`? – har07 May 09 '16 at 08:39
  • Is it a problem to get rid of the XML declaration and CDATA sections? In the end, it's the same information set. – Vincent Biragnet May 09 '16 at 08:44
  • Are you saying that the presence or absence of xml declaration and CDATA section does not change the result? i.e. template.xml and new.xml are interpreted as the same file? – diegus May 09 '16 at 08:47
  • @diegus yes, they should be considered the same.. – har07 May 09 '16 at 08:49
  • @har07 How would I write the same python code I wrote using lxml? thanks – diegus May 09 '16 at 08:49
  • I just open the XMLs as pure text and search for the string and substitute sometimes. – curious_weather May 09 '16 at 08:55
  • @krork I prefer to use the appropriate library because this is just a part of the much longer xml file I have. – diegus May 09 '16 at 09:05
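
To illustrate the "same information set" point made in the comments, here is a minimal sketch using only the standard library. The sample strings are made up for illustration (they are not the full template), and the variable names are hypothetical. It checks that a CDATA section and plain character data parse to the same text, and that text containing markup characters survives a round trip because the serializer escapes it.

```python
import xml.etree.ElementTree as ET

# The same element, once with a CDATA section and once with plain text.
with_cdata = ET.fromstring('<description><![CDATA[All Users]]></description>')
plain = ET.fromstring('<description>All Users</description>')
same_text = with_cdata.text == plain.text

# <lastchange/> and <lastchange /> are the same empty element.
compact = ET.fromstring('<lastchange/>')
spaced = ET.fromstring('<lastchange />')
same_empty = (compact.tag, compact.text) == (spaced.tag, spaced.text)

# CDATA mostly matters for text containing '<' or '&'. ElementTree drops
# the CDATA wrapper on output but escapes those characters instead, so
# re-parsing the output gives back the original text.
tricky = ET.fromstring('<description><![CDATA[a < b & c]]></description>')
serialized = ET.tostring(tricky)
round_trip_text = ET.fromstring(serialized).text

print(same_text, same_empty, round_trip_text)
```

So a conforming XML parser should read template.xml and new.xml as the same data; the difference only matters if something downstream compares the files as raw text.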

1 Answer

lxml provides an ElementTree-compatible API, so you only need to create the parser with strip_cdata=False and can keep the rest of the code exactly the same:

from lxml import etree as ET

def load_xml(name):
    ''' Takes an xml file as input. Outputs ElementTree and element'''
    # specify parser setting
    parser = ET.XMLParser(strip_cdata=False)
    # pass parser to do the actual parsing
    tree = ET.parse(name, parser)

    root = tree.getroot()
    return tree, root

if __name__ == "__main__":
     # Change and write the new xml
     tree, root = load_xml('template.xml')
     group = root.find('group')
     group.find('name').text = 'New'
     tree.write('new.xml')
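
The strip_cdata=False setting only addresses the CDATA part of the question. For the missing first line, a hedged sketch: write() takes encoding and xml_declaration keyword arguments in both xml.etree.ElementTree and lxml. The example below uses the standard library and an in-memory, shortened copy of the template so it runs without extra dependencies; the TEMPLATE constant and the render helper are hypothetical names introduced for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Shortened in-memory stand-in for template.xml (assumption for the demo).
TEMPLATE = b"""<?xml version="1.0"?>
<example>
  <group>
    <name>all</name>
    <description><![CDATA[All Users]]></description>
  </group>
</example>
"""

def render(new_name):
    '''Return the template as bytes with <name> replaced and a declaration.'''
    root = ET.fromstring(TEMPLATE)
    root.find('group').find('name').text = new_name
    buf = io.BytesIO()
    # xml_declaration=True forces the <?xml ...?> line to be written.
    ET.ElementTree(root).write(buf, encoding='utf-8', xml_declaration=True)
    return buf.getvalue()

output = render('New')
print(output.decode('utf-8'))
```

With a file on disk the equivalent call would be tree.write('new.xml', encoding='utf-8', xml_declaration=True). Note that the generated declaration includes an encoding attribute and uses single quotes, so it will not be byte-for-byte identical to the template's first line, and with the standard library the CDATA wrapper is still dropped.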
har07