The TCX files (an XML format) produced by the routing site I use (plotaroute.com) have a very long root tag line, and I can't get Python to read them. My temporary workaround is to manually delete all the extra text after the root tag name before running my program.

If there's any text after "<TrainingCenterDatabase" in line 2 of the TCX file, the program below runs but doesn't print anything.

from dotenv import load_dotenv
import os

from xml.etree import ElementTree

from pathlib import Path

load_dotenv()
data_folder = Path(os.getenv('DATA_FOLDER'))
TCX_full = data_folder / "Poudre_Up024.tcx"
dom = ElementTree.parse(TCX_full)
trackpoints = dom.findall('Courses/Course/Track/Trackpoint')

i=0
for t in trackpoints:
    lat = float(t.find('Position/LatitudeDegrees').text)
    long = float(t.find('Position/LongitudeDegrees').text)
    alt = float(t.find('AltitudeMeters').text)
    dist_meters = float(t.find('DistanceMeters').text)
    print('%5d %10.6f %10.6f %6.1f %8.2f' % (i, lat, long, alt, dist_meters))
    i += 1

Here's a snippet of my original TCX file.

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2 http://www.garmin.com/xmlschemas/TrainingCenterDatabasev2.xsd">
<Folders>
<Courses>
<CourseFolder Name="Courses">
<CourseNameRef>
<Id>Poudre Up</Id>
</CourseNameRef>
</CourseFolder>
</Courses>
</Folders>
<Courses>
<Course>
<Name>Poudre Up</Name>
<Lap>
<TotalTimeSeconds>9714.643764117841</TotalTimeSeconds>
<DistanceMeters>50200.7974473868</DistanceMeters>
<BeginPosition>
<LatitudeDegrees>40.663732</LatitudeDegrees>
<LongitudeDegrees>-105.189232</LongitudeDegrees>
</BeginPosition>
<EndPosition>
<LatitudeDegrees>40.699534</LatitudeDegrees>
<LongitudeDegrees>-105.580839</LongitudeDegrees>
</EndPosition>
<Intensity>Active</Intensity>
<Cadence>0</Cadence>
</Lap>
<Track>
<Trackpoint>
<Time>2022-10-06T00:00:00Z</Time>
<Position>
<LatitudeDegrees>40.663732</LatitudeDegrees>
<LongitudeDegrees>-105.189232</LongitudeDegrees>
</Position>
<AltitudeMeters>1599</AltitudeMeters>
<DistanceMeters>0</DistanceMeters>
<SensorState>Absent</SensorState>
</Trackpoint>

MANY <Trackpoint></Trackpoint>

</Track>

SEVERAL <CoursePoint></CoursePoint>

</Course>
</Courses>
</TrainingCenterDatabase>
  • You need to learn how to handle **namespaces** in the XML – miriamka Oct 25 '22 at 21:09
  • @miriamka Using the {*} wildcard technique in https://stackoverflow.com/a/62117710/407651, I was able to read the original file successfully. So far, I haven't been able to succeed with register_namespace. – virtualdynamo Oct 26 '22 at 01:55
  • What do you mean by "succeed with register_namespace"? `register_namespace` only affects serialization. See https://stackoverflow.com/a/58627058/407651 – mzjn Oct 26 '22 at 08:03
  • @mzjn I was hoping to use **register_namespace** "to handle things in a more procedural fashion" per https://medium.datadriveninvestor.com/getting-started-using-pythons-elementtree-to-navigate-xml-files-dc9bc720eaa6 – virtualdynamo Oct 26 '22 at 12:38
  • That article is misleading IMHO, because it implies that `register_namespace` affects searching/querying of an XML document, which is false. – mzjn Oct 26 '22 at 14:51
  • @mzjn Well that makes me feel better because I was getting nowhere. – virtualdynamo Oct 27 '22 at 02:09
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Nov 02 '22 at 13:43

1 Answer

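Here's a snippet of the quick-and-dirty solution: prefix every tag in the findall/find paths with the {*} namespace wildcard (supported in Python 3.8 and later), so each tag matches regardless of the namespace declared on the root element: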

trackpoints = dom.findall('{*}Courses/{*}Course/{*}Track/{*}Trackpoint')

# enumerate() supplies the running index, so no manual counter is needed
for i, t in enumerate(trackpoints):
    lat = float(t.find('{*}Position/{*}LatitudeDegrees').text)
    long = float(t.find('{*}Position/{*}LongitudeDegrees').text)
    alt = float(t.find('{*}AltitudeMeters').text)
    dist_meters = float(t.find('{*}DistanceMeters').text)
    print('%5d %10.6f %10.6f %6.1f %8.2f' % (i, lat, long, alt, dist_meters))
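
For anyone who would rather name the namespace than wildcard it, here is a minimal sketch of the other common approach: passing a prefix-to-URI mapping as the second argument to find/findall. The URI is copied from the xmlns attribute in the question's root tag. The "tcx" prefix, the read_trackpoints helper, and the trimmed SAMPLE_TCX string are my own illustrative choices, not anything the file or the library requires. It also shows why the original paths matched nothing: the default namespace becomes part of every tag name.

```python
from xml.etree import ElementTree

# URI taken from the root tag's xmlns attribute; "tcx" is an arbitrary label.
NS = {"tcx": "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"}

# Trimmed-down stand-in for the real file (illustrative only).
SAMPLE_TCX = """\
<TrainingCenterDatabase xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2">
<Courses>
<Course>
<Name>Poudre Up</Name>
<Track>
<Trackpoint>
<Time>2022-10-06T00:00:00Z</Time>
<Position>
<LatitudeDegrees>40.663732</LatitudeDegrees>
<LongitudeDegrees>-105.189232</LongitudeDegrees>
</Position>
<AltitudeMeters>1599</AltitudeMeters>
<DistanceMeters>0</DistanceMeters>
</Trackpoint>
</Track>
</Course>
</Courses>
</TrainingCenterDatabase>
"""


def read_trackpoints(root):
    """Return (lat, lon, alt, dist_meters) tuples for every Trackpoint."""
    rows = []
    for t in root.findall("tcx:Courses/tcx:Course/tcx:Track/tcx:Trackpoint", NS):
        lat = float(t.find("tcx:Position/tcx:LatitudeDegrees", NS).text)
        lon = float(t.find("tcx:Position/tcx:LongitudeDegrees", NS).text)
        alt = float(t.find("tcx:AltitudeMeters", NS).text)
        dist_meters = float(t.find("tcx:DistanceMeters", NS).text)
        rows.append((lat, lon, alt, dist_meters))
    return rows


root = ElementTree.fromstring(SAMPLE_TCX)

# The parsed tag name carries the namespace URI in braces, which is why
# the unqualified path from the question finds no elements.
print(root.tag)
print(len(root.findall("Courses/Course/Track/Trackpoint")))

rows = read_trackpoints(root)
for i, row in enumerate(rows):
    print('%5d %10.6f %10.6f %6.1f %8.2f' % (i, *row))
```

With the real file, the same helper should work on ElementTree.parse(TCX_full).getroot(). The wildcard version is shorter; the mapping version only matches elements in the Garmin namespace, which may matter if a file mixes in extension elements from other namespaces.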