Neural network keeps reproducing the baseline classifier
I'm trying to train a network for a binary classification problem. (It's a convnet built with keras in R, for image recognition; specifically, the Human Protein Image Classification challenge on Kaggle. But I don't think the details are hugely important here: the same thing has happened before on a completely different problem, a multi-class classification task with text data and different software (Spark), so I'll keep this question very general.)
My training examples are labeled '0' and '1'. There are more 0's than 1's in the training data. The networks I train (while using binary crossentropy as my loss function) keep reproducing the baseline classifier; that is, the classifier that predicts '0' all the time, regardless of the test input.
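To be concrete, this is the kind of check that exposes the behavior, as a minimal sketch in the R keras interface; `model` and `x_test` are placeholder names for the fitted model and the test tensor:

```r
library(keras)

# Inspect the raw sigmoid outputs; if the classifier has collapsed to the
# baseline, they barely vary and all fall on the same side of 0.5.
probs <- model %>% predict(x_test)   # predicted P(label = 1) per image
summary(as.vector(probs))            # near-constant values => baseline classifier
table(as.integer(probs > 0.5))       # every prediction comes out '0'
```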
It's not at all mystifying to me why this should sometimes happen. First, there are lots of network configurations that reproduce this classifier; for instance, it wouldn't be hard at all to engineer a network that outputs '0' all the time regardless of the input. Second, any such configuration is presumably a local minimum of the loss function on the loss landscape, and finding local minima of the loss function is exactly what we ask these networks to do. So we can hardly blame them for sometimes coming up with this "somewhat good" configuration after training. But this problem has been particularly persistent for me.
MY QUESTION: is this "regression to the baseline" a common problem in deep learning, and what are some "best practice" ways to either avoid it or combat it?
Just to motivate discussion, I'll mention a few possible courses of action that have already occurred to me, some of which I've actually tried (with no success):
1) Increasing the network complexity (adding more layers, more neurons per layer, more filters in the case of convnets, etc.). This is the obvious first move; maybe the network just isn't "smart" enough, even given the best training, to differentiate between '0' and '1', and the baseline really is the best you can hope for this network architecture to accomplish.
This I've tried. I've even tried a pre-trained convnet with two densely connected layers and 41 million trainable parameters. Same result.
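(For reference, "adding capacity" here means stacking more and wider layers, roughly along the lines of the sketch below; the filter/unit counts and input shape are illustrative, not my actual configuration:)

```r
library(keras)

# A deeper/wider convnet variant; all sizes below are illustrative only.
model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(128, 128, 3)) %>%   # hypothetical image shape
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 128, kernel_size = c(3, 3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")    # single sigmoid unit for 0/1 labels
```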
2) Changing the loss function. I tried this, and it didn't help. Notably, when I train with loss = binary_crossentropy (with accuracy as the metric), it produces the baseline classifier for that metric (predicting all '0's); and when I train with loss = F1_score, it produces the baseline classifier for that metric (predicting all '1's). So again, the network is obviously doing what it's supposed to, finding a good local minimum; it's just a horrible solution.
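(A related option I'm aware of, shown here as a sketch rather than something I've verified fixes it: keep binary crossentropy but pass class weights to fit(), so that "predict all '0'" stops being a comfortable minimum. The 1.0/1.5 weights below are made up, not tuned values:)

```r
# Sketch: keep binary crossentropy but re-weight the classes in fit().
model %>% compile(
  optimizer = "adam",
  loss = "binary_crossentropy",
  metrics = c("accuracy")
)

history <- model %>% fit(
  x_train, y_train,
  epochs = 20, batch_size = 32,
  validation_split = 0.2,
  class_weight = list("0" = 1.0, "1" = 1.5)  # make missed '1's cost more
)
```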
3) Just train the whole thing over again (with a different random initial configuration). I tried this, and it didn't help; it keeps reproducing the baseline. So the baseline isn't just popping up through bad luck; it seems to be ubiquitous.
4) Adjust the learning rate. Tried this, no luck. And really, there's no reason to expect this to help; if it found the baseline before, slowing the learning rate probably won't help to "unfind" it.
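(For completeness, this is roughly what the learning-rate experiments looked like; the rate and the plateau-callback settings are illustrative:)

```r
# Sketch: a smaller fixed learning rate plus a callback that shrinks it
# further when validation loss stalls. All numbers are illustrative.
model %>% compile(
  optimizer = optimizer_adam(lr = 1e-4),  # down from the default 1e-3
  loss = "binary_crossentropy",
  metrics = c("accuracy")
)

history <- model %>% fit(
  x_train, y_train,
  epochs = 20, batch_size = 32,
  validation_split = 0.2,
  callbacks = list(
    callback_reduce_lr_on_plateau(monitor = "val_loss", factor = 0.5, patience = 3)
  )
)
```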
Anyone else run into this problem? And how did you deal with it?
keras deep-learning classification

asked Jan 3 at 3:59 by Mike Crumley
How imbalanced are your classes? – jonnor, Jan 3 at 10:28

Not very, about a 40/60 split. – Mike Crumley, Jan 3 at 14:14
1 Answer
I'm not sure what the best way is, but I can share some of my experience. First, it's better to use the same number of '0'-labeled and '1'-labeled samples. Second, if training converges to the baseline every time, your data may be too noisy; try breaking the problem down to make it less random. – CodingLab, answered Jan 3 at 6:35
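(A minimal sketch of what the answer's first suggestion, equalizing the class counts by oversampling, might look like in plain R; `x_train` and `y_train` are placeholder names for a 4-D image tensor and its 0/1 label vector. Class weights, as in the earlier sketch, achieve a similar effect without duplicating data:)

```r
# Oversample the minority '1' class until both classes have equal counts.
idx0 <- which(y_train == 0)
idx1 <- which(y_train == 1)
idx1_up <- sample(idx1, length(idx0), replace = TRUE)  # resample '1's up to the '0' count

keep  <- sample(c(idx0, idx1_up))                      # shuffle the combined indices
x_bal <- x_train[keep, , , , drop = FALSE]             # keep the 4-D array shape
y_bal <- y_train[keep]
```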
I'd rather not artificially balance the classes, since the test (real-world) data is unbalanced. And I know the problem isn't the data itself: other people are coming up with perfectly respectable classifiers for this problem. I've also tried augmenting the images with random transformations, with no luck there either. – Mike Crumley, Jan 3 at 14:17