Why does cross_val_score in sklearn flip the value of the metric?
I am fitting this model from sklearn:
LogisticRegressionCV(
solver="sag", scoring="neg_log_loss", verbose=0, n_jobs=-1, cv=10
)
The fitting results in a model.score (on the training set) of 0.67 and change. Since there is no way (or I don't know how) to access the results of the cross-validation performed as part of the model fitting, I run a separate cross-validation on the same model with
cross_val_score(model, X, y, cv=10, scoring="neg_log_loss")
This returns an array of negative numbers
[-0.69517214 -0.69211235 -0.64173978 -0.66429986 -0.77126878 -0.65127196
-0.66302393 -0.65916281 -0.66893633 -0.67605681]
which, if signs were flipped, would seem in a range compatible with the training score.
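Aside: rereading the docs, the internal CV results may be reachable after all via the fitted scores_ attribute. This is a sketch of my understanding of that attribute, not something the rest of this question relies on:
# Assuming `model` is the fitted LogisticRegressionCV from above. Per the
# docs, scores_ is a dict keyed by class; each value is an array of shape
# (n_folds, n_Cs) with the scoring values from the internal cross-validation.
key = list(model.scores_)[0]    # binary problem: a single entry
fold_grid = model.scores_[key]
print(fold_grid.shape)          # (10, n_Cs) for cv=10
print(fold_grid.mean(axis=0))   # mean neg_log_loss per candidate C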
I've read the discussion in an issue about cross_val_score flipping the sign of the given scoring function, and the resolution seemed to be that the neg_* metrics were introduced precisely to make such flipping unnecessary; and I am using neg_log_loss. The issue talks about mse, but the arguments seem to apply to log_loss as well. Is there a way to have cross_val_score return the same metric specified in its arguments? Is this a bug I should file? Or is this a misunderstanding on my part, and a sign change is still to be expected from cross_val_score?
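To make my reading of the neg_* convention concrete: as I understand it, the neg_log_loss scorer is just the plain log_loss metric negated. That can be sanity-checked in isolation (a sketch on a built-in dataset, separate from the repro below):
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer, log_loss

X_i, y_i = load_iris(return_X_y=True)
clf = LogisticRegression().fit(X_i, y_i)

# Scorers follow the higher-is-better convention by negating loss metrics.
neg = get_scorer("neg_log_loss")(clf, X_i, y_i)
raw = log_loss(y_i, clf.predict_proba(X_i))
assert abs(neg + raw) < 1e-12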
I hope this is a specific enough question for SO. Sklearn devs redirect users to SO for questions that are not clear-cut bug reports or feature requests.
Adding minimal repro code per request in the comments (sklearn v0.19.1, Python 2.7):
from numpy.random import randn, seed
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import cross_val_score

seed(0)
X = randn(100, 2)   # random features: no real signal
y = randn(100) > 0  # random boolean labels

model = LogisticRegressionCV(
    solver="sag", scoring="neg_log_loss", verbose=0, n_jobs=-1, cv=10
)
model.fit(X=X, y=y)
model.score(X, y)
cross_val_score(model, X, y, cv=10, scoring="neg_log_loss")
With this code, it no longer looks like a simple sign flip of the metric. The outputs are 0.59 for the score and array([-0.70578452, -0.68773683, -0.68627652, -0.69731349, -0.69198876, -0.70089103, -0.69476663, -0.68279466, -0.70066003, -0.68532253]) for the cross-validation score.
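A quick diagnostic for the repro above, to check what model.score is actually computing (a sketch reusing model, X, y; nothing here beyond the plain metrics):
from sklearn.metrics import accuracy_score, log_loss

# If the first value matches model.score(X, y), score is reporting accuracy
# rather than the neg_log_loss requested via the scoring parameter.
print(accuracy_score(y, model.predict(X)))
print(-log_loss(y, model.predict_proba(X)))  # what neg_log_loss would give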
scikit-learn cross-validation loss-function
Can you show the complete code and possibly some data which reproduces a positive score from model.score()? I am not able to duplicate it on scikit-learn inbuilt datasets.
– Vivek Kumar
Nov 20 '18 at 6:49
The complete code is at github.com/piccolbo/rightload, branch basilica; the ML code is in ml.py. Sharing the data is more complex, and running the code requires access to a web service. I need to think of something more self-contained for a more practical repro.
– piccolbo
Nov 20 '18 at 16:17
The code that generates the positive score is pretty trivial, in ml.py:127 and following lines. model.fit(X,y) followed by model.score(X,y), pretty much. I hope I got your question -- I still owe you some data for a complete repro, of course.
– piccolbo
Nov 20 '18 at 16:25
Got the repro but it requires sharing two pickles with data. Is there a SO preferred way of doing that?
– piccolbo
Nov 20 '18 at 19:11
Replaced repro with one that is self-contained and quick. Doesn't look like a simple sign flip anymore, though.
– piccolbo
Nov 20 '18 at 20:10
1 Answer
Note: edited after the fruitful comment thread with Vivek Kumar and piccolbo.
About the strange results from LogisticRegressionCV's score method
You found a bug, which was fixed in version 0.20.0.
From the changelog:
Fix: Fixed a bug in linear_model.LogisticRegressionCV where the score method always computes accuracy, not the metric given by the scoring parameter. #10998 by Thomas Fan.
Also, sklearn's 0.19 LogisticRegressionCV documentation says:
score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
While from version 0.20.0, the docs are updated with the bugfix:
score(X, y, sample_weight=None)
Returns the score using the scoring option on the given test data and labels.
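So a quick way to confirm which behavior a given install exhibits is to compare score against both candidate metrics directly (a sketch, reusing model, X, y from the question's repro):
import sklearn
from sklearn.metrics import accuracy_score, log_loss

print(sklearn.__version__)
print(model.score(X, y))                     # 0.19: accuracy; >= 0.20: the scoring metric
print(accuracy_score(y, model.predict(X)))   # should match score() on 0.19
print(-log_loss(y, model.predict_proba(X)))  # should match score() from 0.20 on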
About the negative values returned by cross_val_score
cross_val_score reports whatever value the scorer produces. For error and loss metrics the scorer is the negated metric (hence the neg_ prefix), so the reported values come out negative, while genuine score metrics keep their natural sign. From the documentation:
All scorer objects follow the convention that higher return values are better than lower return values. Thus metrics which measure the distance between the model and the data, like metrics.mean_squared_error, are available as neg_mean_squared_error which return the negated value of the metric.
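A minimal illustration of that convention with cross_val_score itself (a sketch on synthetic data):
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# cross_val_score reports the scorer's value per fold, so with neg_log_loss
# every entry is the negated (hence negative) per-fold log loss.
X_s, y_s = make_classification(n_samples=200, random_state=0)
scores = cross_val_score(LogisticRegression(), X_s, y_s, cv=5,
                         scoring="neg_log_loss")
print(scores)          # all entries negative
print(-scores.mean())  # average log loss, back on its natural scale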
I don't understand why I got a negative vote. I reduced the assertiveness of my answer, in case that was the problem. I think it adds useful information on the topic, at least.
– Julian Peller
Nov 21 '18 at 4:59
Yes. You are correct. LogisticRegressionCV returns mean accuracy in version 0.19. From version 0.20 upwards, it returns the score for the defined scoring param.
– Vivek Kumar
Nov 21 '18 at 6:37
The problem is solved by upgrading sklearn. This was suggested in a comment which seems to have disappeared, then in Julian's answer, which contains many other things that IMHO are weakly related. If he could simplify it to the point of accuracy vs the requested metric, as changed in the latest sklearn version, I'd be glad to mark it as accepted. Thanks!
– piccolbo
Nov 22 '18 at 2:29
@piccolbo glad to hear it's solved! It was a really tricky scenario. I made some edits to the answer, removing some argumentative detours, giving relevant credit, keeping the information on the accuracy problem (with detailed explicit citations on the matter) and also the information about the sign flip for cross_val_score (which is not trivial and seems relevant too, at least for the first part of your question, before the repro code). Does it look good? Any suggestions?
– Julian Peller
Nov 22 '18 at 2:58
Actually, I found the bugfix in the changelog!! It's "LogisticRegressionCV.score doesn't respect scoring, inconsistent with GridSearchCV". Adding this to the answer.
– Julian Peller
Nov 22 '18 at 3:06