I am converting some pandas Series and DataFrames to Koalas for scalability. In places where I was using np.where(), I tried to pass a Koalas DataFrame the same way I previously passed a pandas DataFrame, but I got a PandasNotImplementedError. How can I overcome this error? I tried ks.where(), but it didn't work.
Here is a model of the code I am working on, using pandas:
import pandas as pd
import numpy as np
# condition is a boolean mask; action1 / action2 are the values used where it is True / False
pdf = np.where(condition, action1, action2)
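To make the model concrete, here is a minimal sketch with made-up data (the column names `score` and `bonus` and the threshold are hypothetical, purely for illustration). It shows the pandas behaviour I want to reproduce, both with `np.where` and with the method-based `Series.where`. Koalas documents a `Series.where(cond, other)` method as well, so the second form may carry over, but that is an assumption: this snippet only runs against pandas and has not been verified on Koalas.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names are made up for illustration.
pdf = pd.DataFrame({"score": [35, 80, 55], "bonus": [1, 2, 3]})

# Boolean mask playing the role of `condition` in the model.
condition = pdf["score"] >= 50

# NumPy version: returns a plain ndarray, not a pandas object.
with_numpy = np.where(condition, pdf["score"] + pdf["bonus"], 0)

# Method-based version: keep values where the condition holds, else use 0.
# The result stays a Series, so no conversion to NumPy is needed.
with_method = (pdf["score"] + pdf["bonus"]).where(condition, 0)
```

Both forms produce the same values here; the difference is that the method-based form never leaves the DataFrame library, which is what matters for a distributed backend.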
The code works if I convert the Koalas objects back to pandas using toPandas() or from_pandas(), but for performance and scalability reasons I can't use pandas. If possible, please suggest an alternative approach in Koalas, or an alternative to NumPy that does this and works well with Koalas.