Differences in probA, probB values between sklearn and libsvm

137 Views Asked by amiola At 04 December 2021 at 23:28

I'm digging into SVMs and Platt scaling, and I can't make sense of the differences I get in the values of probA and probB across three sources: sklearn's SVC(probability=True), the _sigmoid_calibration() function that CalibratedClassifierCV uses to apply Platt scaling, and libsvm used directly.

Premises:

As far as I know, SVC(probability=True) relies on libsvm's internal Platt scaling routine; still, I suspect some differences with respect to libsvm may arise, e.g. from different random number generation rules (see here for reference).
_sigmoid_calibration() should be consistent with libsvm: with this PR, a libsvm-like calibration procedure was added (or rather readjusted, I guess) to CalibratedClassifierCV(ensemble=False). Moreover, judging from the tests on the calibration module (this one in particular), _sigmoid_calibration() appears to be fully consistent with libsvm in the computation of probA and probB.
I am not familiar with libsvm, so I might be overlooking something in the problem definition below.
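For context, here is a minimal pure-Python sketch of the quantity that Platt scaling fits. It assumes the convention P(y=1 | f) = 1 / (1 + exp(A*f + B)) and Platt's smoothed targets; the function names (platt_probability, platt_targets, platt_nll) are hypothetical helpers written for illustration, not part of sklearn or libsvm, and no optimizer is included.

```python
import math


def platt_probability(f, A, B):
    """Sigmoid 1 / (1 + exp(A*f + B)), evaluated without overflow."""
    z = A * f + B
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def platt_targets(labels):
    """Smoothed targets: (N+ + 1)/(N+ + 2) for positives, 1/(N- + 2) for negatives."""
    n_pos = sum(1 for y in labels if y > 0)
    n_neg = len(labels) - n_pos
    hi = (n_pos + 1.0) / (n_pos + 2.0)
    lo = 1.0 / (n_neg + 2.0)
    return [hi if y > 0 else lo for y in labels]


def platt_nll(A, B, decision_values, labels):
    """Cross-entropy between the smoothed targets and the sigmoid outputs."""
    eps = 1e-12  # keeps log() finite if a probability saturates
    total = 0.0
    for f, t in zip(decision_values, platt_targets(labels)):
        p = min(max(platt_probability(f, A, B), eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total


# Toy data (made up): a negative slope A fits better than a flat sigmoid.
toy_f = [-1.0, 1.0]
toy_y = [0, 1]
print(platt_nll(-1.0, 0.0, toy_f, toy_y), platt_nll(0.0, 0.0, toy_f, toy_y))
```

Whatever decision values are passed in determine the fitted A and B, so two routines that minimise this same objective can still disagree if they are fed different decision values.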

That said, here is the snippet showing the differences I can't explain. What am I missing?

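import numpy as np
from sklearn.datasets import load_digits
from sklearn.preprocessing import StandardScaler  # was missing in the original snippet
from sklearn.svm import SVC
from sklearn.calibration import _sigmoid_calibration  # private helper, may change between versions

from libsvm.svmutil import svm_problem, svm_parameter, svm_train

X, y = load_digits(return_X_y=True)

# Keep only digits 0 and 1 to get a binary problem
mask = (y == 0) | (y == 1)
X = X[mask, :]
y = y[mask]
X = StandardScaler().fit_transform(X)

# RBF kernel with C=1 and gamma = 1 / n_features
model = SVC(probability=True, random_state=42, gamma='auto')
model.fit(X, y)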

probA, probB returned by SVC(probability=True)

model.probA_, model.probB_   # (array([-4.50114382]), array([-0.29587261]))

probA, probB obtained via the calibration module

dec_fun = model.decision_function(X)  # decision values on the training data itself
probaA, probaB = _sigmoid_calibration(dec_fun, y)
probaA, probaB   # (-4.426391129899554, -0.2173222213684269)

libsvm output

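np.random.seed(42)  # note: libsvm uses its own C-level RNG, so this seed may have no effect on it
prob = svm_problem(y, X)
# -s 0: C-SVC, -t 2: RBF kernel, -c 1: C=1, -b 1: fit probability estimates
param = svm_parameter('-s 0 -t 2 -c 1 -b 1')
libsvm_model = svm_train(prob, param)
libsvm_model.probA.contents.value, libsvm_model.probB.contents.value # (-4.6378404214097, -0.4000020264023851)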
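To get a feel for how far apart the three (probA, probB) pairs are in terms of probabilities rather than raw parameters, the following sketch plugs the values quoted above into the sigmoid 1 / (1 + exp(A*f + B)). The decision values are arbitrary illustration points, and the helper platt_probability is hypothetical; each implementation may also define the sign of the decision value and the reference class differently, so this is only a rough comparison.

```python
import math


def platt_probability(f, A, B):
    """Sigmoid 1 / (1 + exp(A*f + B)), evaluated without overflow."""
    z = A * f + B
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


# (probA, probB) pairs copied from the outputs quoted in the question
pairs = {
    "SVC(probability=True)": (-4.50114382, -0.29587261),
    "_sigmoid_calibration": (-4.426391129899554, -0.2173222213684269),
    "libsvm": (-4.6378404214097, -0.4000020264023851),
}

# Arbitrary decision values chosen for illustration only
decision_values = [-1.0, -0.5, 0.0, 0.5, 1.0]

table = {
    name: [platt_probability(f, A, B) for f in decision_values]
    for name, (A, B) in pairs.items()
}

for name, probs in table.items():
    print(name, [round(p, 4) for p in probs])

# Largest disagreement between the three sigmoids over the chosen points
max_gap = max(max(col) - min(col) for col in zip(*table.values()))
print("max gap:", round(max_gap, 4))
```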
