0

I am doing multiple nested loops for machine learning testing on large dataset

in these loops I test different algorithms and data set sizes

the problem is that I get crashes in the middle of the process with this error or any other error

MemoryError: Unable to allocate 194. MiB for an array with shape (1662, 15269) and data type float64

I want a way that no matter the error the code continues

here is my code

  for Size in [10, 100, 1000, 10000,100000, 1000000]:
        

        df_reps_all = pd.read_sql("EXEC GetReps_Idea14 " + str(Size) , conn)
        df_results_all = pd.read_sql("EXEC GetResults14 " + str(Size)  , conn)

       for Model in ["SVC","LOG", "RF", "GNB", "KNN"]:
            
            multilabel_binarizer = MultiLabelBinarizer()
            multilabel_binarizer.fit(df['Code'])
            y = multilabel_binarizer.transform(df['Code'])
             
            
            if Model == "LOG":
                mdl = LogisticRegression()
            elif Model == "RF":
                mdl = RandomForestClassifier()
            elif Model == "GNB":
                mdl = GaussianNB()
            elif Model == "KNN":
                mdl = KNeighborsClassifier()
            elif Model == "SVC":
                mdl = SVC()
            print (120)
                
            
            
            clf = OneVsRestClassifier(mdl)
         
            y_pred = cross_val_predict(clf, dfMethod, y, cv=3, n_jobs=-1)

P.S. my question is not how to fix that error. my question is how to proceed with code no matter what error I get

asmgx
  • 7,328
  • 15
  • 82
  • 143
  • 1
    Does this answer your question? [How to solve the memory error in Python](https://stackoverflow.com/questions/37347397/how-to-solve-the-memory-error-in-python) – enzo Jul 18 '21 at 00:48
  • @enzo Thanks, but my question is not how to solve the error. my question is how to proceed with the code no matter what error I get – asmgx Jul 18 '21 at 00:52
  • 3
    Wrap the offending code in `try ... except ... finally`? Beyond that, there isn't much to say. Please clarify. – John Coleman Jul 18 '21 at 00:59

3 Answers3

3

You can try to use try/except. But it might not always work for memory exceptions (read here). A workaround might be to separate it to 2 scripts, a master script that submits the jobs, and a worker that runs it.

Something like this:

master.py
----------
for Size in [10, 100, 1000, 10000,100000, 1000000]:
    for Model in ["SVC","LOG", "RF", "GNB", "KNN"]:
        os.sysmem(f"python worker.py {Size} {Model}")


worker.py
---------
Size = int(sys.agv[1])
Model = sys.argv[2]

df_reps_all = pd.read_sql("EXEC GetReps_Idea14 " + str(Size) , conn)
df_results_all = pd.read_sql("EXEC GetResults14 " + str(Size)  , conn)

multilabel_binarizer = MultiLabelBinarizer()
multilabel_binarizer.fit(df['Code'])
y = multilabel_binarizer.transform(df['Code'])
    

if Model == "LOG":
    mdl = LogisticRegression()
elif Model == "RF":
    mdl = RandomForestClassifier()
elif Model == "GNB":
    mdl = GaussianNB()
elif Model == "KNN":
    mdl = KNeighborsClassifier()
elif Model == "SVC":
    mdl = SVC()
print (120)
    


clf = OneVsRestClassifier(mdl)

y_pred = cross_val_predict(clf, dfMethod, y, cv=3, n_jobs=-1)



So even if the worker fails, you will still be able to continue with the next parameters set

Oren
  • 4,711
  • 4
  • 37
  • 63
1

To continue in case of such "crash" you need to catch and handle the MemoryError exception.

Find the smallest part of your code where the exception occurs (there should be a line number in the full stack trace) and wrap it in try .. except block.

try:
    ... some code that can run out of memory ...
except MemoryError:
    pass

You may want to do something else, like some cleanup instead of pass which will do nothing, and just continue to the next line of code, but there isn't enough information in your question for a better recommendation.

If you want to ignore all other errors, replace MemoryError with Exception which is the base call, and will catch all exceptions that can be caught.

Lev M.
  • 6,088
  • 1
  • 10
  • 23
1

You may want to use try:... except:... to catch the error and ignore it, for example:

try:
    # your code that may throw an error
except:
    pass

However, ignore an error and do nothing about it is usually not recommended.

PRO
  • 169
  • 5
  • 2
    beware! because this doesn't specify an Exception type, it'll catch unexpected exceptions inheriting from `BaseException` like `SystemExit` and `KeyboardInterrupt` https://docs.python.org/3/library/exceptions.html#exception-hierarchy – ti7 Jul 18 '21 at 01:11