
I am working on a multiclass classification problem (10 classes) using sklearn.linear_model.SGDClassifier. I see that this model uses a one-versus-all approach. SGDClassifier has a parameter class_weight, which the documentation describes as follows:

"Weights associated with classes. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))."

How is class_weight used during training? For example, suppose label A has 5 samples, label B has 15 samples, and label C has 100 samples, and the model for A vs. (B and C) is being trained. Are the class weights included in the calculation of the loss function? What about scoring? SGDClassifier has "accuracy" as its default scoring option. Is it weighted?
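To make the "balanced" formula concrete, here is a minimal sketch that applies n_samples / (n_classes * count) to the A/B/C example from the question. It is written in plain Python, with collections.Counter standing in for np.bincount; the helper name balanced_class_weights is hypothetical and is not part of scikit-learn.

```python
from collections import Counter

# Toy labels with the class counts from the question: A=5, B=15, C=100.
y = ["A"] * 5 + ["B"] * 15 + ["C"] * 100


def balanced_class_weights(labels):
    """Hypothetical helper: n_samples / (n_classes * count) per class."""
    counts = Counter(labels)  # plays the role of np.bincount(y)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


weights = balanced_class_weights(y)
for cls in sorted(weights):
    # Rare classes receive larger weights than frequent ones.
    print(cls, round(weights[cls], 4))
```

With these counts the rare class A gets a weight of 8.0, B about 2.67, and the frequent class C 0.4, so each class contributes the same total weight (40) to the sum. In scikit-learn itself, sklearn.utils.class_weight.compute_class_weight with class_weight="balanced" should produce the same numbers.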

  • The loss function is weighted; see https://stackoverflow.com/questions/30972029/how-does-the-class-weight-parameter-in-scikit-learn-work. The scorer is not. – Sergey Bushmanov Jan 15 '19 at 06:55
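To illustrate the difference between an unweighted and a weighted score, here is a small sketch in plain Python. The accuracy helper and the always-predict-C classifier are hypothetical and only serve the illustration; the class counts are taken from the question, and the weights are the "balanced" values for those counts.

```python
def accuracy(y_true, y_pred, sample_weight=None):
    """Hypothetical helper: fraction of (optionally weighted) correct predictions."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


# Class counts from the question: A=5, B=15, C=100.
y_true = ["A"] * 5 + ["B"] * 15 + ["C"] * 100
# Assumed degenerate classifier that always predicts the majority class.
y_pred = ["C"] * len(y_true)

# "Balanced" class weights for these counts: n_samples / (n_classes * count).
class_weight = {"A": 120 / (3 * 5), "B": 120 / (3 * 15), "C": 120 / (3 * 100)}
per_sample_weight = [class_weight[t] for t in y_true]

plain = accuracy(y_true, y_pred)                         # every sample counts equally
weighted = accuracy(y_true, y_pred, per_sample_weight)   # every class counts equally

print(f"unweighted accuracy: {plain:.4f}")
print(f"weighted accuracy:   {weighted:.4f}")
```

The unweighted score rewards the majority-class predictor (100 of 120 correct), while the weighted score drops to one third because each class carries the same total weight. If a weighted score is wanted in scikit-learn, sklearn.metrics.accuracy_score accepts a sample_weight argument, so the weighting has to be passed explicitly at scoring time.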

1 Answer


This is my understanding. The classifier optimizes the following objective function: [equation image not preserved]

Here L is the loss function and w is the class weight. The loss function is a user-defined parameter (the loss argument of SGDClassifier).
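A plausible general form of a class-weighted objective is sketched below. This is a reconstruction under assumptions, not the formula from the original answer: w_{y_i} denotes the class weight of sample i's label, L is the chosen loss, and the symbols θ and b (model coefficients and intercept), α (regularization strength), and R (penalty term) are introduced here for illustration. How weights are assigned to the "rest" samples inside each one-versus-all subproblem is an implementation detail that this sketch does not cover.

```latex
E(\theta, b) = \frac{1}{n} \sum_{i=1}^{n} w_{y_i} \, L\bigl(y_i, f(x_i)\bigr) + \alpha R(\theta),
\qquad f(x) = \theta^{\top} x + b
```

Under this form, a misclassified sample from a class with a large weight contributes proportionally more to the objective, and hence to each gradient step, than one from a class with a small weight.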

zdarktknight