-1

I'm looking for a library to create XLSX files that can contain upwards of a million rows and several dozen columns. So far, all the Python libraries I have found consume too much memory, and I haven't found a suitable C library to wrap. I'd prefer open source so I can modify the code if need be.

EDIT: I have found a solution. openpyxl has an "Optimized Writer": http://packages.python.org/openpyxl/optimized.html

Mike Pennington
rpkelly

2 Answers

1

Have you tried ElementTree? If that uses too much memory, use SAX and just process a row at a time. See: XML parsing - ElementTree vs SAX and DOM

Community
jcomeau_ictx
  • No, but I don't see how any of those packages will help me in writing the file. The whole document can't be created in memory: it is too large. – rpkelly Jun 21 '11 at 20:51
  • I misread your question. If you are simply creating the file from scratch, and not getting input from other files, you could just write out your data in the XLSX format a row at a time. The libraries you tried don't provide for that? – jcomeau_ictx Jun 21 '11 at 20:57
  • @rpkelly: If the whole document can't be created in memory, what are you going to do with the file once you've made it? Is Excel smart enough to just load the bit you're looking at? – Thomas K Jun 21 '11 at 21:01
  • @Thomas K: Yes, Excel is smart enough. @John Machin: 1) Yes, I have, and Excel can handle it. 2) If you would like to go back and answer the questions so I can accept them, I'd be happy to. – rpkelly Jun 21 '11 at 23:08
0

The XLSX format consists of a number of XML files that have been zipped. If the format of the output will not be changing, it would be trivial to use an existing file as a template and simply add rows to it as necessary. Unfortunately, ZipFile.writestr doesn't allow you to write a file in pieces, so you'll have to write the entire XML to a temporary file and then add that file to the zip with ZipFile.write.
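A minimal sketch of that approach, under some assumptions: the function names (`cell_xml`, `write_sheet_xml`, `build_xlsx`) are hypothetical, the template is assumed to keep its only worksheet at `xl/worksheets/sheet1.xml`, and the sheet part is regenerated from scratch rather than appended to. Strings are written as inline strings so the template's shared-strings table is not touched, and cell references and styles are omitted for brevity.

```python
import os
import tempfile
import zipfile
from xml.sax.saxutils import escape

SHEET_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def cell_xml(value):
    """Serialize one value as a SpreadsheetML cell (number or inline string)."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "<c><v>%r</v></c>" % value
    return '<c t="inlineStr"><is><t>%s</t></is></c>' % escape(str(value))


def write_sheet_xml(rows, fileobj):
    """Write the worksheet XML one row at a time to an open text file."""
    fileobj.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    fileobj.write('<worksheet xmlns="%s"><sheetData>\n' % SHEET_NS)
    for number, row in enumerate(rows, start=1):
        cells = "".join(cell_xml(value) for value in row)
        fileobj.write('<row r="%d">%s</row>\n' % (number, cells))
    fileobj.write("</sheetData></worksheet>\n")


def build_xlsx(template_path, output_path, rows,
               sheet_part="xl/worksheets/sheet1.xml"):
    """Copy the template, replacing sheet_part with freshly generated rows."""
    # Step 1: stream the sheet XML to a temporary file on disk.
    tmp = tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", suffix=".xml", delete=False)
    try:
        with tmp:
            write_sheet_xml(rows, tmp)
        # Step 2: copy every other part of the template unchanged, then
        # add the temporary file under the sheet's name with ZipFile.write.
        with zipfile.ZipFile(template_path) as src, \
                zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                if info.filename != sheet_part:
                    dst.writestr(info, src.read(info.filename))
            dst.write(tmp.name, arcname=sheet_part)
    finally:
        os.remove(tmp.name)
```

Passing a generator as `rows` (for example, one that yields tuples from a database cursor) keeps only one row in memory at a time. On Python 3.6 and later, `ZipFile.open(name, "w")` returns a writable file object, so the temporary file can probably be skipped by streaming the rows straight into the archive.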

Mark Ransom