I have a large number of xml files with a structure similar to the following, although they are far larger:

<?xml version="1.0" encoding="UTF-8"?>
<a a1="3.0" a2="ABC">
  <b b1="P1" b2="123">first
  </b>
  <b b1="P2" b2="456" b3="xyz">second
  </b>
</a>

I want to get the following output:

1|1|b1
1|2|b2
2|1|b1
2|2|b2
2|3|b3

where:

  1. Field 1 is the sequence number for nodes /a/b
  2. Field 2 is the sequence number of the attribute as it appears in the xml file
  3. Field 3 is the attribute name (not value)

I don't quite know how to calculate field 2 correctly.

I've prepared the following xslt file:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="@*|node()">
  <xsl:copy>
   <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/">
  <xsl:for-each select="a/b/@*">
   <xsl:value-of select="count(../preceding-sibling::*)+1"/>
   <xsl:text>|</xsl:text>
   <!-- TODO: This is not correct -->
   <xsl:value-of select="count(preceding-sibling::*)+1"/>
   <xsl:text>|</xsl:text>
   <xsl:value-of select="name()"/>
   <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
 </xsl:template>

</xsl:stylesheet>

but when I run the following command:

xsltproc a.xslt a.xml > a.csv

I get an incorrect output, as field 2 does not represent the attribute sequence number:

1|1|b1
1|1|b2
2|1|b1
2|1|b2
2|1|b3

Do you have any suggestions on how to get the correct output please?

Please note that the answers to "XSLT to order attributes" do not solve this problem.

The order of attributes is irrelevant in XML. For instance, <a a1="3.0" a2="ABC"> and <a a2="ABC" a1="3.0"> are equivalent.

However, this specific question is part of a larger application where it is essential to establish the order in which attributes appear in the given xml files (and not in xml files that are equivalent to them).

Yalmar
  • Attribute order is insignificant in XML. Possible duplicate of [XSLT to order attributes](https://stackoverflow.com/questions/19718597/xslt-to-order-attributes) – kjhughes Apr 15 '19 at 13:06
  • It is well known that attribute order is insignificant in XML, but that is not the point of this post. Attribute order is significant for the particular application that I'm examining. – Yalmar Apr 15 '19 at 14:53
  • @Yalmar: And the point of my comment was to discourage you from perpetuating that problematic practice. – kjhughes Apr 16 '19 at 13:31

2 Answers

As kjhughes says in the comments, attribute order is insignificant in XML. You can still select the attributes, however, and use the position() function to get the numbers you are after. (You just can't be sure that the order in which the processor returns them is the order in which they appear in the XML file, although in practice this will generally be the case.)

Try this XSLT. Note the nested xsl:for-each: the outer loop selects only the b elements, to get their position, and the inner loop then selects each element's attributes, which have their own separate position.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text" />

 <xsl:template match="/">
   <xsl:for-each select="a/b">
     <xsl:variable name="bPosition" select="position()"/>
     <xsl:for-each select="@*">
       <xsl:value-of select="$bPosition"/>
       <xsl:text>|</xsl:text>
       <xsl:value-of select="position()"/>
       <xsl:text>|</xsl:text>
       <xsl:value-of select="name()"/>
       <xsl:text>&#10;</xsl:text>
     </xsl:for-each>
   </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
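
Because an XSLT processor is not required to return attributes in source order, one option outside XSLT is to read the order straight from the parser. The following is only a minimal sketch in Python, assuming the standard library's xml.parsers.expat module with its ordered_attributes flag, which reports attributes as a flat list in the order found in the document text. The function name attribute_rows and the embedded SAMPLE are made up for illustration and are not part of the stylesheet above.

```python
import xml.parsers.expat

# Sample input copied from the question (hypothetical constant name).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<a a1="3.0" a2="ABC">
  <b b1="P1" b2="123">first
  </b>
  <b b1="P2" b2="456" b3="xyz">second
  </b>
</a>
"""


def attribute_rows(xml_bytes):
    """Return 'bIndex|attrIndex|attrName' rows for every /a/b element."""
    rows = []
    stack = []    # names of the currently open elements
    b_index = 0   # running sequence number of /a/b elements

    def start(name, attrs):
        nonlocal b_index
        stack.append(name)
        if stack == ["a", "b"]:
            b_index += 1
            # With ordered_attributes set, attrs is [name1, value1, name2, ...]
            for i, attr_name in enumerate(attrs[0::2], start=1):
                rows.append(f"{b_index}|{i}|{attr_name}")

    def end(name):
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.ordered_attributes = True  # attributes as a list, in source order
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_bytes, True)
    return rows


if __name__ == "__main__":
    print("\n".join(attribute_rows(SAMPLE)))
```

Since the order comes from the parser itself, this does not rely on how any particular XSLT processor happens to order attribute nodes.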
Tim C

You could use the position() of each attribute in the sequence that you are iterating over and subtract the number of attributes that belong to the preceding b elements. This stays within XPath 1.0, so it works with xsltproc.

<xsl:template match="/">
    <xsl:for-each select="a/b/@*">
        <xsl:value-of select="count(../preceding-sibling::*)+1"/>
        <xsl:text>|</xsl:text>
        <!-- position() counts across all selected attributes, so subtract those on earlier b elements -->
        <xsl:value-of select="position() - count(../preceding-sibling::b/@*)"/>
        <xsl:text>|</xsl:text>
        <xsl:value-of select="name()"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

Which produces the following output:

1|1|b1
1|2|b2
2|1|b1
2|2|b2
2|3|b3
Mads Hansen