I have a large CSV data file (~1,444,000 rows) that I am reading in and converting to numpy arrays. I only need three of its 22 columns. This is what I am currently doing:
import numpy as np
import csv

with open('data.csv', 'r') as fid:
    csvfile = csv.reader(fid, dialect='excel', delimiter=',')
    next(csvfile)  # skip the header row
    t = []  # time
    u = []  # velocity x
    w = []  # velocity z
    for line in csvfile:
        t.append(line[1])
        u.append(line[-4])
        w.append(line[-2])

t = np.array(t, dtype=float)
u = np.array(u, dtype=float)
w = np.array(w, dtype=float)
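For comparison, a single-call alternative I could try is numpy.loadtxt with usecols. This is only a sketch, and it assumes the three columns of interest sit at fixed positions, i.e. indices 1, 18 and 20 (which is what line[-4] and line[-2] work out to with 22 columns):

# Sketch only: read columns 1, 18 and 20 straight into float arrays,
# skipping the one-line header; indices assume the 22-column layout above
t2, u2, w2 = np.loadtxt('data.csv', delimiter=',', skiprows=1,
                        usecols=(1, 18, 20), unpack=True)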
So my question is: is this loop-and-lists approach efficient? I was originally going to append each new row to an existing numpy array inside the loop, until I read that the whole array has to be moved in memory on every append.
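To make that concrete, the pattern I decided against looks roughly like the sketch below (illustrative only, shown for one column):

t = np.empty(0)
for line in csvfile:
    # np.append allocates a fresh array and copies the old contents,
    # so this does O(n) work per row instead of the amortized O(1)
    # of list.append
    t = np.append(t, float(line[1]))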