
I have an array of 1.5 billion long entries. Now I want to write it to disk and read it back again. Can anybody tell me whether there is a Java library (or a custom procedure) to do this efficiently?

Usually, I do it using FileChannel and MappedByteBuffer. But for 1.5 billion long entries it simply exceeds the limit.

Edit:

FileChannel ch = new RandomAccessFile(path, "r").getChannel();
MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
mb.order(ByteOrder.nativeOrder());
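
For context on the limit: 1.5 billion longs at 8 bytes each is about 12 GB, and the size argument of `FileChannel.map` may not exceed `Integer.MAX_VALUE` (just under 2 GiB), so a single mapping cannot cover the file. A minimal sketch of one workaround is to map the file in several smaller regions. The class name `ChunkedLongFile`, its methods, and the chunk size parameter are hypothetical names chosen for illustration, and this has not been measured at the 12 GB scale.

```java
import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.LongBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper (not from the original post): reads and writes a long[]
// through several memory-mapped regions, each small enough for one map() call.
final class ChunkedLongFile {

    // Writes data to path, mapping at most longsPerChunk longs at a time.
    static void write(Path path, long[] data, int longsPerChunk) throws IOException {
        if (longsPerChunk <= 0) throw new IllegalArgumentException("longsPerChunk must be positive");
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            int off = 0;
            while (off < data.length) {
                int n = Math.min(longsPerChunk, data.length - off);
                // Byte positions are computed as long so they can go past 2 GiB.
                LongBuffer lb = ch.map(MapMode.READ_WRITE,
                                (long) off * Long.BYTES, (long) n * Long.BYTES)
                        .order(ByteOrder.nativeOrder())
                        .asLongBuffer();
                lb.put(data, off, n);
                off += n;
            }
        }
    }

    // Reads the whole file back into a long[], again one mapped region at a time.
    static long[] read(Path path, int longsPerChunk) throws IOException {
        if (longsPerChunk <= 0) throw new IllegalArgumentException("longsPerChunk must be positive");
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            // toIntExact fails loudly if the file holds more longs than an array can.
            long[] data = new long[Math.toIntExact(ch.size() / Long.BYTES)];
            int off = 0;
            while (off < data.length) {
                int n = Math.min(longsPerChunk, data.length - off);
                ch.map(MapMode.READ_ONLY, (long) off * Long.BYTES, (long) n * Long.BYTES)
                        .order(ByteOrder.nativeOrder())
                        .asLongBuffer()
                        .get(data, off, n);
                off += n;
            }
            return data;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("longs", ".bin");
        long[] data = {1L, -2L, 3L, Long.MAX_VALUE, Long.MIN_VALUE};
        write(tmp, data, 2);            // tiny chunk size just to exercise several mappings
        long[] back = read(tmp, 2);
        System.out.println("round trip equal: " + Arrays.equals(data, back));
    }
}
```

For the real data set, a chunk size such as `1 << 27` longs (1 GiB per mapping) would stay under the per-mapping limit; the `long[]` itself still needs roughly 12 GB of heap.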
alessandro
  • Do you have a 64-bit JVM installed? Why do you need to read it all into memory? How do you plan on processing it? There may be other solutions to this problem. – James Black Jul 30 '12 at 02:23
  • What do you mean by "simply exceeds the limit"? Are you trying to write it out in a single chunk? I read/write gigabyte files all the time without issue. I must be missing something. Do you have simple code? – MadProgrammer Jul 30 '12 at 02:24
  • @JamesBlack, I have a 64-bit JVM. Actually, I have a very large dataset containing long numbers (1.5 billion) that I have to load into memory. I have to perform a sorting kind of job and then write the results back. – alessandro Jul 30 '12 at 02:26
  • You can use a merge sort that involves temporary files instead of having all the data in RAM – Luiggi Mendoza Jul 30 '12 at 02:29
  • @MadProgrammer, I have given the code that I used to read the longs. As you can see, for a 1.5 billion entry input, passing ch.size() gives an error. – alessandro Jul 30 '12 at 02:30
  • @LuiggiMendoza, Thanks for your suggestion. But, I really need the whole data. – alessandro Jul 30 '12 at 02:31
  • .size() returns an int, that's why it gives an error; an int is 32 bits in size – Mitch Connor Jul 30 '12 at 02:34
  • @TylerHeiks, I don't think so. Even a file size of (int max number - 1) gives an error. I have checked. – alessandro Jul 30 '12 at 02:37
  • @MadProgrammer, how do you read/write gigabyte files? Can you tell me? – alessandro Jul 30 '12 at 02:45
  • @alessandro from what I've read of your question, I'm not sure it will help, but basically I do standard file read/write but I use a small buffer and perform multiple loops over the data. – MadProgrammer Jul 30 '12 at 02:47
  • I would insert those lines into a database (MySQL, Oracle, whatever) and let the database do the sorting. That is what I would do. See here for how to read the file line by line: http://stackoverflow.com/questions/5868369/how-to-read-a-large-text-file-line-by-line-in-java – Rosdi Kasim Jul 30 '12 at 02:59
  • I know nothing about your environment, but this sounds like a non-trivial problem. You're dealing with ~6GB of data. You'll need at least that much RAM and a very large JVM heap. You'll be dependent on your physical disk and its file system. Do you have access to a database? Perhaps you could use this for storage and sorting instead of Java+file-system? [You beat me to it, Rosdi!] – Muel Jul 30 '12 at 03:00
  • @Muel, I have no access to a DB. And I have to perform many operations. That's why I need it in memory. And my server has 64GB, so no worries. – alessandro Jul 30 '12 at 03:03
  • 64GB won't be enough taking into account you plan to sort those in memory. Just install JavaDB and you should be good to go. http://www.oracle.com/technetwork/java/javadb/overview/index.html – Rosdi Kasim Jul 30 '12 at 05:38

1 Answer

I have never tried this with objects of this size, but I think you can try wrapping your array in a class that implements the java.io.Serializable interface:

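    class MyWrapper implements java.io.Serializable {
        private static final long serialVersionUID = 1L;
        // The question stores primitive long values, so a long[] avoids boxing each entry
        long[] myArray;
    }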

Then, when you need to store your array on disk, you can simply write the wrapper with an ObjectOutputStream:

    // Requires: import java.io.FileOutputStream; import java.io.ObjectOutputStream;
    // try-with-resources flushes and closes the stream even if writing fails
    try (ObjectOutputStream outobj =
             new ObjectOutputStream(new FileOutputStream("pathtofile"))) {
        // Write the wrapper object out to disk
        outobj.writeObject(myWrapperInstance);
    }

To retrieve it:

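    // Requires: import java.io.FileInputStream; import java.io.ObjectInputStream;
    // readObject() can throw IOException and ClassNotFoundException
    MyWrapper myWrapperInstance;
    try (ObjectInputStream inobj =
             new ObjectInputStream(new FileInputStream("pathtofile"))) {
        // Read the object back and cast it to the wrapper type
        myWrapperInstance = (MyWrapper) inobj.readObject();
    }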
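
Putting the pieces together, a minimal self-contained round trip might look like the sketch below. It assumes the array holds primitive `long` values; the class name `LongArrayWrapper` and the `save`/`load` helpers are hypothetical names for illustration, and the buffered streams are an added assumption to reduce the number of small disk writes. As noted, this has not been tried with 1.5 billion entries.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical wrapper (illustrative name) around a primitive long[].
final class LongArrayWrapper implements Serializable {
    private static final long serialVersionUID = 1L;

    final long[] values;

    LongArrayWrapper(long[] values) {
        this.values = values;
    }

    // Serializes the wrapper to the given file path.
    static void save(String path, LongArrayWrapper wrapper) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(path)))) {
            out.writeObject(wrapper);
        }
    }

    // Deserializes a wrapper previously written by save().
    static LongArrayWrapper load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(path)))) {
            return (LongArrayWrapper) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("wrapper", ".ser");
        long[] data = {1L, -2L, Long.MAX_VALUE, Long.MIN_VALUE};
        save(tmp.getPath(), new LongArrayWrapper(data));
        LongArrayWrapper back = load(tmp.getPath());
        System.out.println("round trip equal: " + Arrays.equals(data, back.values));
    }
}
```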
Jua