"coords" function of the "pROC" package returns different sensitivity and specificity values than caret's "confusionMatrix"
Hi everybody, and thank you very much in advance for your help.
I have trained a random forest classification model. Now I want to determine the best threshold to optimize specificity and sensitivity.
I am confused because, as stated in the title, the "coords" function of the "pROC" package returns different values than the "confusionMatrix" function of the "caret" package.
Below is the code:
# package import
library(caret)
library(pROC)

# data import
data <- read.csv2("denonciation.csv", check.names = FALSE)

# data partition
validation_index <- createDataPartition(data$Denonc, p = 0.80, list = FALSE)
validation   <- data[-validation_index, ]
entrainement <- data[validation_index, ]

# handling class imbalance
set.seed(7)
up_entrainement <- upSample(x = entrainement[, -ncol(entrainement)],
                            y = entrainement$Denonc)

# cross-validation settings
control <- trainControl(method = "cv", number = 10, classProbs = TRUE)

# model training
fit.rf_up <- train(Denonc ~ EMOTION + Agreabilite_classe + Conscienciosite_classe,
                   data = up_entrainement, method = "rf", trControl = control)

# best-threshold determination
roc <- roc(up_entrainement$Denonc,
           predict(fit.rf_up, up_entrainement, type = "prob")[, 2])
coords(roc, x = "best", input = "threshold", best.method = "closest.topleft")
### The best threshold seems to be .36 with a specificity of .79 and a sensitivity of .73 ###

# confusion matrix with the best threshold returned by "coords"
probsTest <- predict(fit.rf_up, validation, type = "prob")
threshold <- 0.36
predictions <- factor(ifelse(probsTest[, "denoncant"] > threshold,
                             "denoncant", "non_denoncant"))
confusionMatrix(predictions, validation$Denonc)
Here the values are different:
Confusion Matrix and Statistics
Reference
Prediction denoncant non_denoncant
denoncant 433 1380
non_denoncant 386 1671
Accuracy : 0.5437
95% CI : (0.5278, 0.5595)
No Information Rate : 0.7884
P-Value [Acc > NIR] : 1
Kappa : 0.0529
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.5287
Specificity : 0.5477
Pos Pred Value : 0.2388
Neg Pred Value : 0.8123
Prevalence : 0.2116
Detection Rate : 0.1119
Detection Prevalence : 0.4685
Balanced Accuracy : 0.5382
'Positive' Class : denoncant
Please, could you tell me why the "coords" function of the "pROC" package returns such different values?
Many thanks,
Baboune
r machine-learning r-caret proc confusion-matrix
If I am not mistaken you chose the cutoff value based on train data (data the model saw). What you should have done is to choose the cutoff value based on hold out predictions in re-sampling.
– missuse
Nov 19 '18 at 9:40
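Building on this comment: caret can save the hold-out predictions from each CV fold (`savePredictions` in `trainControl`), and the cutoff can then be chosen on predictions the model did not see during each fold's fit. A minimal sketch on synthetic stand-in data (the data frame, class names, and model here are made up for illustration, not taken from denonciation.csv):

```r
library(caret)
library(pROC)

set.seed(7)
# Synthetic two-class data standing in for the question's dataset
n <- 400
x <- data.frame(EMOTION = rnorm(n), Agreabilite_classe = rnorm(n))
y <- factor(ifelse(x$EMOTION + rnorm(n) > 0, "denoncant", "non_denoncant"),
            levels = c("denoncant", "non_denoncant"))

# savePredictions = "final" keeps the out-of-fold predictions
# for the final tuning parameters in fit$pred
control <- trainControl(method = "cv", number = 5, classProbs = TRUE,
                        savePredictions = "final")
fit <- train(x = x, y = y, method = "rf", trControl = control)

# ROC on hold-out predictions: one row per held-out sample,
# observed class in $obs, one probability column per class level
roc_cv <- roc(fit$pred$obs, fit$pred$denoncant,
              levels = c("non_denoncant", "denoncant"))

# In recent pROC versions coords() returns a data.frame
best <- coords(roc_cv, x = "best", input = "threshold",
               best.method = "closest.topleft")
best$threshold  # cutoff chosen on data unseen within each fold
```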
asked Nov 19 '18 at 8:19 by Baboune
edited Nov 19 '18 at 11:06 by desertnaut
1 Answer
There are two possible issues here that I can see:
While training the model, the two classes are balanced by up-sampling the minority class, so the "best" threshold returned by coords is calibrated on that same up-sampled training set. As far as I can see, the validation set is not up-sampled, so it has a different class distribution.
The two results report metrics on different sets (training and validation). These are usually close for a random forest, given all the averaging that happens under the hood, but they will not be exactly the same. A random forest rarely over-fits badly, yet it can when the data is a mixture of sub-populations with different feature distributions and/or different feature-response relationships: those sub-populations need not be evenly split between the training and validation sets, even under random sampling (the distributions match on average, but not necessarily for a particular split).
I think the first issue is the culprit, but unfortunately I can't test your code, since it depends on the file denonciation.csv.
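The first point can be illustrated on synthetic data: pick the "best" threshold on an up-sampled training set, then apply that same cutoff to an untouched, imbalanced validation set and watch the metrics drop. A sketch with made-up data and class names (not the question's denonciation.csv):

```r
library(caret)
library(pROC)

set.seed(7)
# Imbalanced synthetic data: roughly 20% positives, similar to the
# prevalence in the question's confusion matrix
n <- 2000
x <- data.frame(f1 = rnorm(n), f2 = rnorm(n))
y <- factor(ifelse(x$f1 + rnorm(n, sd = 2) > 1.7, "pos", "neg"),
            levels = c("neg", "pos"))

idx      <- createDataPartition(y, p = 0.8, list = FALSE)
train_up <- upSample(x = x[idx, ], y = y[idx])   # balanced training set
valid_x  <- x[-idx, ]                            # untouched validation set
valid_y  <- y[-idx]

fit <- train(Class ~ ., data = train_up, method = "rf",
             trControl = trainControl(method = "cv", number = 5,
                                      classProbs = TRUE))

# Threshold tuned on the balanced training data the model has already seen...
roc_tr <- roc(train_up$Class, predict(fit, train_up, type = "prob")[, "pos"])
thr <- coords(roc_tr, x = "best", input = "threshold",
              best.method = "closest.topleft")$threshold

# ...typically looks worse on the imbalanced, unseen validation data
pred <- factor(ifelse(predict(fit, valid_x, type = "prob")[, "pos"] > thr,
                      "pos", "neg"), levels = levels(valid_y))
confusionMatrix(pred, valid_y, positive = "pos")
```

Comparing the sensitivity/specificity reported by coords on roc_tr with those from the validation confusionMatrix reproduces the kind of gap described in the question.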
answered Nov 19 '18 at 9:37 by ritwik33, edited Nov 19 '18 at 10:52