
I am trying to build a script that fetches XMLs with different structures and creates one unified XML containing all the data. A key point is that I use an XSLT to rename/transform the nodes/elements, because, as I said, the source XMLs have different structures. The newly produced XML then has a generic structure that I can use properly.

My question is: how can I modify this to process multiple XMLs and produce a single result?

The XMLs will be referenced by either relative or absolute paths.

<?php

// create an XSLT processor and load the stylesheet as a DOM
$xproc = new XsltProcessor();
$xslt = new DomDocument;
$xslt->load('stylesheet.xslt');    // this contains the code from above
$xproc->importStylesheet($xslt);


// DOM for the source XML
$xml = '';                         // the source XML as a string
$dom = new DomDocument;
$dom->loadXML($xml);


// run the transformation
$result = $xproc->transformToXml($dom);

?>
EnexoOnoma
  • Possible [duplicate](http://stackoverflow.com/questions/6338485/xslt-document-function-folder-hierarchy/6342127#6342127) for the XSLT part. – Emiliano Poggi Jun 22 '11 at 06:14

1 Answer


Hmm. Well, there are simply too many ways to skin this cat, but the two obvious ones are:

  1. Import the XML files through the XSLT layer. This requires XSLT skills.
  2. Import the XML files through the PHP layer. This requires PHP skills.

The first solution means you build a list of all the files you want to process, turn it into an XML file list like the one below, and pass that to the XSLT processor:

<list>
   <file src="..." />
   <file src="..." />
   <file src="..." />
</list>

Now you loop through (or match) these files and use something like:

<xsl:template match="file">
   <xsl:apply-templates select="document(@src)/*" />
</xsl:template>
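On the PHP side, the file list itself is easy to generate before handing it to the processor. A minimal sketch (file names are placeholders for your real feed paths):

```php
<?php
// Build a <list><file src="..."/>...</list> document from an array of paths.
function build_file_list(array $files): string {
    $list = new DOMDocument('1.0', 'UTF-8');
    $root = $list->createElement('list');
    $list->appendChild($root);
    foreach ($files as $src) {
        $file = $list->createElement('file');
        $file->setAttribute('src', $src);
        $root->appendChild($file);
    }
    return $list->saveXML();
}

// Usage: feed the generated list to the processor instead of a single source:
// $dom = new DomDocument;
// $dom->loadXML(build_file_list(['feed1.xml', 'feed2.xml']));
// echo $xproc->transformToXml($dom);
```

The stylesheet then pulls in each real document via `document(@src)` as shown above.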

The other solution is to find all your files and bind them together into a larger XML in PHP:

$xml = "<all_files>";
foreach ($files as $file)
    $xml .= file_get_contents($file);
$xml .= "</all_files>";

Then have your XSLT apply templates as normal. This obviously requires some cleanup (removing XML declarations and processing instructions if there are any, and possibly DTDs and some entity handling), but it's not too hard.
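The cleanup step could be sketched like this; `strip_prolog` is a hypothetical helper, and the regexes only handle a simple declaration and a DOCTYPE without an internal subset:

```php
<?php
// Strip the XML declaration (and a simple leading DOCTYPE, if any)
// from a fetched document so it can be safely concatenated.
function strip_prolog(string $xml): string {
    // remove <?xml ... ?> at the start
    $xml = preg_replace('/^\s*<\?xml[^?]*\?>\s*/', '', $xml);
    // remove a simple <!DOCTYPE ...> (no internal subset)
    $xml = preg_replace('/^\s*<!DOCTYPE[^>]*>\s*/', '', $xml);
    return $xml;
}

// Usage, following the loop above:
// $xml = "<all_files>";
// foreach ($files as $file)
//     $xml .= strip_prolog(file_get_contents($file));
// $xml .= "</all_files>";
```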

Both solutions require tweaks. What are your skills? What do you prefer to do? Do you need it to be memory-efficient? Fast? Concise? Flexible?

AlexanderJohannesen
  • Hi there thank you for your answer! I will check the code now and reply you. Combining the XMLs manually is not an option, it has to load the XMLs from other websites so that I will have the updated document every time I run the code. I can handle through hard working both. In my case the script will have to handle many XMLs more than 1000. I guess I need memory saving and fast, right? How does this change my approach? – EnexoOnoma Jun 22 '11 at 06:11
  • You're right that loading 1000s of documents using the document() function is likely to lead to memory problems. It's not clear to me whether each file is processed independently of the others. If it is, I would run the same transformation 1000 times on 1000 different source files, rather than trying to run one transformation that processes all the files. – Michael Kay Jun 22 '11 at 09:21
  • Hard to disagree with you, Michael, although you could invoke SAX and hope for the best. :) Yes, if you've got insane amounts of files, split them, batch them or otherwise mangle the process. – AlexanderJohannesen Jun 22 '11 at 10:55
  • @Sampas "more than 1000. I guess I need memory saving and fast, right? How does this change my approach?" Well, it depends on the size of your files. Read the files in, check how much memory they each use on average, then process one document, find out how much memory that used, time it (still on average) across over 1000 files, and that's your answer. You may want to bump the memory usage a PHP script can use if it gets too crazy, but with small feed files you should be ok. – AlexanderJohannesen Jun 22 '11 at 22:42