Looking at the source, it doesn't look like this is implemented in 0.14. Alternatively, you can down-sample the negative class to get a more even balance:
import numpy as np

# Fake class labels skewed toward the negative class:
real_p = 0.01  # true probability of class 1 (unknown in real cases)
Y = (np.random.rand(10000) < real_p).astype(int)

# Use the fraction of positive examples as an estimate of P(Y=1)
p = (Y == 1).mean()
print("Label balance: %.3f pos / %.3f neg" % (p, 1 - p))

# Resample the training set: keep every positive example,
# and keep each negative example with probability p
inds = np.zeros(Y.shape[0], dtype=bool)
inds[np.where(Y == 1)] = True
inds[np.where(Y == 0)] = np.random.rand((Y == 0).sum()) < p

resample_p = (Y[inds] == 1).mean()
print("After resampling:")
print("Label balance: %.3f pos / %.3f neg" % (resample_p, 1 - resample_p))
Output:
Label balance: 0.013 pos / 0.987 neg
After resampling:
Label balance: 0.531 pos / 0.469 neg
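You would then train on the selected rows only. A minimal sketch, assuming a feature matrix X aligned with Y (X and the choice of LogisticRegression are just stand-ins for illustration):

from sklearn.linear_model import LogisticRegression

X = np.random.rand(Y.shape[0], 5)   # hypothetical features, for illustration only
clf = LogisticRegression()
clf.fit(X[inds], Y[inds])           # fit on the balanced subset
# ...then evaluate on held-out data drawn from the original, skewed distribution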
Note that this is a very simplistic way of down-sampling the negative class. A better approach might be to integrate the down-sampling or weighting into the learning scheme itself, for example via per-sample weights, a boosting scheme, or a cascade of classifiers (one possible weighting sketch is below).
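As a sketch of the weighting idea, assuming an estimator whose fit method accepts a sample_weight argument (GradientBoostingClassifier does in recent scikit-learn versions; check your version), you can up-weight the positives so that both classes contribute roughly equal total weight:

from sklearn.ensemble import GradientBoostingClassifier

# Up-weight each positive by n_neg/n_pos = (1 - p)/p so class totals match
w = np.where(Y == 1, (1 - p) / p, 1.0)

gbc = GradientBoostingClassifier()
gbc.fit(X, Y, sample_weight=w)  # X as in the hypothetical sketch above

Unlike down-sampling, this keeps every negative example in the training set, at the cost of the classifier seeing many near-duplicate easy negatives.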