I am running multiple nested loops for machine-learning experiments on a large dataset. In these loops I test different algorithms and dataset sizes. The problem is that the process crashes partway through with the error below (or some other error):

MemoryError: Unable to allocate 194. MiB for an array with shape (1662, 15269) and data type float64

I want the code to keep going no matter what error occurs. Here is my code:
for Size in [10, 100, 1000, 10000, 100000, 1000000]:
    df_reps_all = pd.read_sql("EXEC GetReps_Idea14 " + str(Size), conn)
    df_results_all = pd.read_sql("EXEC GetResults14 " + str(Size), conn)
    for Model in ["SVC", "LOG", "RF", "GNB", "KNN"]:
        multilabel_binarizer = MultiLabelBinarizer()
        multilabel_binarizer.fit(df['Code'])
        y = multilabel_binarizer.transform(df['Code'])
        if Model == "LOG":
            mdl = LogisticRegression()
        elif Model == "RF":
            mdl = RandomForestClassifier()
        elif Model == "GNB":
            mdl = GaussianNB()
        elif Model == "KNN":
            mdl = KNeighborsClassifier()
        elif Model == "SVC":
            mdl = SVC()
        print(120)
        clf = OneVsRestClassifier(mdl)
        y_pred = cross_val_predict(clf, dfMethod, y, cv=3, n_jobs=-1)
P.S. My question is not how to fix that particular error; it is how to make the code proceed no matter what error is raised.
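The pattern I have in mind is wrapping the body of the inner loop in try/except so that a failing (size, model) combination is logged and the loop moves on to the next one. Below is a minimal, self-contained sketch of that idea; run_experiment is a hypothetical stand-in for the real fit/predict step (it is not part of my code), and the simulated MemoryError just demonstrates that the loop survives:

    import traceback

    results = {}
    failures = {}

    def run_experiment(size, model):
        # Hypothetical placeholder for the real training/prediction step.
        # Simulate one combination blowing up with a MemoryError.
        if size == 1000 and model == "RF":
            raise MemoryError("Unable to allocate array")
        return "ok-" + model + "-" + str(size)

    for size in [10, 100, 1000]:
        for model in ["SVC", "LOG", "RF"]:
            try:
                results[(size, model)] = run_experiment(size, model)
            except Exception as exc:
                # Record which combination failed and why, then continue.
                # (MemoryError subclasses Exception, so it is caught here.)
                failures[(size, model)] = repr(exc)
                traceback.print_exc()

    print(len(results), len(failures))  # 8 succeeded, 1 failed

Keeping the failures dict means the skipped combinations can be rerun later instead of being silently lost. Note that a bare `except Exception` will not catch KeyboardInterrupt or SystemExit, which is usually what you want.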