[Update at 06-15-2017]
It seems that csv2sqlite.py may be the way to go with SQLite. Chunk-by-chunk loading is definitely too slow for big files (> 1 GB). When I tested the 6.5 GB nyc311calls.csv with csv2sqlite.py, it took only ~24 minutes to create a SQLite database file with data-type guessing. That is comparable to the time MySQL takes with "LOAD DATA INFILE", which is not bad, even though you may need to change the data type for some of the columns afterwards. In my opinion, csv2sqlite.py is currently the most time-efficient way to create a SQLite database file from a CSV file.
1) Download csv2sqlite.py from here and put it in the directory containing your CSV file.
2) Using the Windows Command Prompt, go to the directory containing csv2sqlite.py and the CSV file (e.g. nyc311calls.csv) you want to import.
3) Run python csv2sqlite.py nyc311calls.csv database_name.db and wait. Note: the Python executable must be on your PATH in the Windows environment variables.
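Because the type guessing is not always right, it is worth inspecting the declared column types after the import. A minimal sketch using only the standard library (the table and column names below are made-up examples, not from the actual NYC 311 data):

```python
import sqlite3

def column_types(db_path, table):
    """Return {column_name: declared_type} for a table in a SQLite database."""
    conn = sqlite3.connect(db_path)
    try:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
        return {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}
    finally:
        conn.close()
```

If a column was guessed wrong, you can create a new table with the corrected schema and copy the data over with INSERT ... SELECT.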
This is a little old question, but it seems nobody has given a clear answer yet.
I hope my answer helps you. For SQLite, I recommend this site, which explains the idea and what you should do: a chunk-by-chunk load. I have tested several approaches, and so far this is the most reliable one in my opinion.
Basic procedure is like that:
1) Import a small portion of the big table to pandas.
2) Process it and append it to the SQLite database.
3) Repeat until the whole file has been loaded.
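The procedure above can be sketched like this (a minimal sketch, not the exact code from my notebook; the file, database, and table names are placeholders, and any per-chunk cleaning would go inside the loop):

```python
import sqlite3
import pandas as pd

def csv_to_sqlite(csv_path, db_path, table, chunksize=100_000):
    """Load a large CSV into SQLite one chunk at a time, keeping memory use small."""
    conn = sqlite3.connect(db_path)
    try:
        for chunk in pd.read_csv(csv_path, chunksize=chunksize):
            # Per-chunk processing (e.g. parsing datetime columns) goes here.
            chunk.to_sql(table, conn, if_exists="append", index=False)
    finally:
        conn.close()
```

The chunksize trades memory for speed: larger chunks mean fewer round trips to pandas but a bigger memory footprint per iteration.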
I uploaded a more detailed write-up of what I did here (Jupyter notebook), if you are interested. You can find the NYC 311 call data here.
A few comments from me.
1) The Odo package does not work properly if your data contains empty strings. I hope they improve this. In other words, if your data is very clean and well organized, the Odo package might be an option.
2) The above approach is really time-consuming: a ~6 GB table takes more than 24 hours, because pandas is slow.
3) If you are not committed to SQLite, I would say MySQL with "LOAD DATA INFILE" is a good option for you. You can find instructions for it by searching online. In my tests it is a much more reliable and efficient approach, and you can still convert to SQLite later if you really need it. Especially if the data has many empty strings and datetime columns that need to be converted to a datetime type, I would definitely go with MySQL.