In my application, I receive an Excel file in chunks as input. For example, if the Excel file is 20 MB, I get 4 chunks, where each chunk (byte[]) is 5 MB. I write each chunk (byte[]) to a temp file (with no extension). I then use this temp file to regenerate the actual Excel file that was sent to me in chunks. My requirement is that the generated Excel file must be identical to the Excel file that I received in chunks.

Sample Code:

  1. Reading the Excel file, splitting it into chunks, and writing those chunks to a temp file.

    public static void readExcelFileBytes(String srcFile) throws IOException {

    File file = new File(srcFile);
    
    FileInputStream fis = new FileInputStream(file);
    
    byte[] buf = new byte[1024 * 5]; // 5KB
    int totalNoOfBytes = 0;
    try {
        for (int readNum; (readNum = fis.read(buf)) != -1;) {               
            appendByteArrayToTempFile(buf);
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    System.out.println("Read: Total Size Bytes" + totalNoOfBytes);
    
    }
    
    public static boolean appendByteArrayToTempFile(byte[] byteArray) {
    
    boolean result = false;
    BufferedWriter writeout = null;
    File bodfile = null;
    FileOutputStream out = null;
    try {
        bodfile = new File("C://tempFile");
        out = new FileOutputStream(bodfile, true);
        out.write(byteArray);           
    
        result = true;
    
    } catch (IOException ex) {
        ex.printStackTrace();
    } finally {
        try {
            //writeout.close();
            out.close();
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
    return result;
    }
    
  2. Using the temp file's byte[] to regenerate the Excel file.

public static void tempToFile(String srcFilePath) throws IOException{

   File file = new File("C://tempFile");

   FileInputStream fis = new FileInputStream(file);

   ByteArrayOutputStream bos = new ByteArrayOutputStream();
   byte[] buf = new byte[1024];
   try {
       for (int readNum; (readNum = fis.read(buf)) != -1;) {
           bos.write(buf, 0, readNum); //no doubt here is 0                

       }
   } catch (IOException ex) {
       ex.printStackTrace();
   }
   byte[] bytes = bos.toByteArray();

   // Create source file from Temp's byte array
   FileOutputStream fileOuputStream =  new FileOutputStream(srcFilePath); 
   fileOuputStream.write(bytes);
   fileOuputStream.flush();
   fileOuputStream.close();
}

Issue:

public static void main(String s[]) throws IOException {

    // XLS POC
    readExcelFileBytes("C://Input.xlsx");
    tempToFile("C://Output.xlsx");

}

But when the Excel file is generated from the temp file's byte[], it is corrupted. Can somebody tell me whether this is the proper way to regenerate an Excel file from the temp file's byte[]?

Alexis Pigeon
Narendra Verma
  • Why do you _reopen_ the temp file each time? Also, you don't `.flush()` after having written into the temp file. – fge Jan 04 '13 at 11:39
  • Do a bytewise comparison of the input and output files; this will tell you where the first error is. That should hopefully be a big clue. – Oliver Charlesworth Jan 04 '13 at 11:39
  • @fge You don't need to flush before closing. – Marko Topolnik Jan 04 '13 at 11:45
  • @MarkoTopolnik output streams are buffered – fge Jan 04 '13 at 11:49
  • @fge So? Do you have an example of a broken `OutputStream` implementation whose `close` method doesn't flush automatically? `FileOutputStream` is not such an example in any case. – Marko Topolnik Jan 04 '13 at 11:51
  • @MarkoTopolnik oh, plenty: sockets, for instance. In any case, it never hurts, so you'd better `.flush()`... If it is a noop, good. If it isn't, you have saved your day. – fge Jan 04 '13 at 12:00
  • @fge You mixed it up... if you close the output stream retrieved from `Socket.getOutputStream()`, you'll never lose a byte. If, on the other hand, you close the socket without previously closing the output stream, only then you may experience loss (probably not even then, though). – Marko Topolnik Jan 04 '13 at 12:03
  • @MarkoTopolnik "if you close the output stream retrieved from Socket.getOutputStream(), you'll never lose a byte" <-- wrong. Examples are plenty on SO where people `.write()` to it without flushing afterwards and wonder why the client hasn't received anything. OK, never mind, do as you please and I'll keep flushing around in any case ;) – fge Jan 04 '13 at 12:10
  • @fge For a detailed account of the flushing behavior of socket output streams, see [this answer](http://stackoverflow.com/a/3428934/1103872). Main point: `close` guarantees that everything is queued into the OS-level send buffer, but the actual sending happens asynchronously (as always). This asynchrony is the only potential pitfall. – Marko Topolnik Jan 04 '13 at 12:17
  • @Oli: Can you please suggest a way to do a bytewise comparison? – Narendra Verma Jan 04 '13 at 12:37
  • @fge Flushing before closing fits squarely within the category of [Cargo Cult Programming](http://en.wikipedia.org/wiki/Cargo_cult_programming). – Marko Topolnik Jan 04 '13 at 20:15

2 Answers

Wouldn't it be better to use an existing library that copies files for you?

One such library is Apache Commons IO (formerly part of Jakarta Commons).

It is widely used and well tested, and it can handle the file copy task for you.

EDIT

If you are getting corrupted files, the best way to check whether your file copy mechanism (either your own or one from a library) works correctly is to compare the checksums of the input and output files. The input file's checksum should be the same as the output file's checksum.

Assuming your input java.io.File is named input and your output File is named output, the code that checks the checksums could look as follows:

long inputChecksum = FileUtils.checksumCRC32(input);

// if there is an issue with file copy IOException is thrown
FileUtils.copyFile(input, output);

// inputChecksum should be the same as outputChecksum
long outputChecksum = FileUtils.checksumCRC32(output);
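
A checksum only says whether the files differ, not where. The bytewise comparison suggested in the comments can be sketched with the JDK alone. This is a minimal illustration, not part of Commons IO; the class name `FileCompare` and the method `firstMismatch` are hypothetical names chosen for this example.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class FileCompare {

    // Returns the offset of the first byte at which the two files differ,
    // or -1 if both files have identical content and length.
    static long firstMismatch(Path a, Path b) throws IOException {
        try (InputStream inA = new BufferedInputStream(Files.newInputStream(a));
             InputStream inB = new BufferedInputStream(Files.newInputStream(b))) {
            long offset = 0;
            while (true) {
                int byteA = inA.read();
                int byteB = inB.read();
                if (byteA != byteB) {
                    // Also reached when one file ends before the other.
                    return offset;
                }
                if (byteA == -1) {
                    // Both streams ended at the same position.
                    return -1;
                }
                offset++;
            }
        }
    }
}
```

Calling `FileCompare.firstMismatch(input.toPath(), output.toPath())` on the two files would report the position of the first difference, which can hint at whether the output has extra bytes at the end of a chunk.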
Tom
  • Thanks Tom for your reply. I even tried this code `FileUtils.writeByteArrayToFile(new File(srcFilePath), bytes)` in place of `FileOutputStream fileOuputStream = new FileOutputStream(srcFilePath); fileOuputStream.write(bytes); fileOuputStream.flush(); fileOuputStream.close();` in **tempToFile(String srcFilePath)** method. But no luck :( – Narendra Verma Jan 04 '13 at 12:32
  • Where FileUtils is from Apache Commons. – Narendra Verma Jan 04 '13 at 12:43
  • `FileUtils` class is in `org.apache.commons.io` package. You can download the latest version of the library (2.4) from here: http://commons.apache.org/io/download_io.cgi – Tom Jan 04 '13 at 13:38
  • Yes, I tried with org.apache.commons.io.FileUtils as well. But, did not succeed. – Narendra Verma Jan 04 '13 at 14:07
  • @Narendra, when you say "no succeed" do you mean that your output file is still corrupted? – Tom Jan 04 '13 at 14:25
This piece of code is definitely wrong:

for (int readNum; (readNum = fis.read(buf)) != -1;) {               
    appendByteArrayToTempFile(buf);
}

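You ignore the number of bytes that were actually read into buf and unconditionally write out the whole buf every time. When read fills only part of the buffer (typically on the last chunk of the file), the leftover bytes from the previous read are appended as well, which corrupts the output. You need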

for (int readNum; (readNum = fis.read(buf)) != -1;) {               
    appendByteArrayToTempFile(buf, readNum);
}

and implement appendByteArrayToTempFile accordingly.
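
One possible shape for that change is sketched below. It is an assumption about how the method could look, not the asker's final code: the class name `ChunkCopy`, the method `readFileInChunks`, and the extra `tempFile` parameter (replacing the hard-coded path) are hypothetical and added only so the example is self-contained.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class for illustration only.
final class ChunkCopy {

    // Appends only the first `length` bytes of `chunk` to the temp file.
    // The `tempFile` parameter is an assumption; the question hard-codes the path.
    static void appendByteArrayToTempFile(Path tempFile, byte[] chunk, int length)
            throws IOException {
        // Second constructor argument `true` opens the file in append mode;
        // try-with-resources closes (and thereby flushes) the stream.
        try (OutputStream out = new FileOutputStream(tempFile.toFile(), true)) {
            out.write(chunk, 0, length);
        }
    }

    // Reads the source in 5 KB chunks and appends each chunk to the temp file.
    static void readFileInChunks(Path source, Path tempFile) throws IOException {
        byte[] buf = new byte[1024 * 5]; // 5 KB, as in the question
        try (InputStream in = Files.newInputStream(source)) {
            for (int readNum; (readNum = in.read(buf)) != -1; ) {
                // Pass readNum so stale bytes at the end of buf are never written.
                appendByteArrayToTempFile(tempFile, buf, readNum);
            }
        }
    }
}
```

With this version the temp file ends up exactly as long as the source, even when the file size is not a multiple of the buffer size. Reopening the file for every chunk mirrors the question's structure; keeping one stream open for the whole loop would also work.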

Marko Topolnik
  • Hi Marko, thanks for your reply. I used same code for images that is working fine. Below code is used to re-generate image from temp byte[] in **tempToFile(String srcFilePath)** method: `BufferedImage InputStream in = new ByteArrayInputStream(bytes); BufferedImage bImageFromConvert = ImageIO.read(in); ImageIO.write(bImageFromConvert, "png", new File(srcImagePath));` instead of this code `// Create source file from Temp's byte array FileOutputStream fileOuputStream = new FileOutputStream(srcFilePath); fileOuputStream.write(bytes); fileOuputStream.flush(); fileOuputStream.close(); ` – Narendra Verma Jan 04 '13 at 12:28