Initialization state in DQN
I initialize the state of my environment with some value s'. I also reinitialize the state every time a new episode starts. I have noticed that when I create the environment and initialize the state to, say, [10, 3], the policy obtained after training is not close to optimal at all. With other starting states such as [20, 3], [20, 7], etc., however, I get results quite close to the optimal policy. So the question is: is it possible that starting from the state [10, 3] causes the network to get stuck in a local minimum?
deep-learning reinforcement-learning
asked Nov 22 '18 at 11:13
Siddhant Tandon
305
1 Answer
Strictly answering the question: yes, it can result in sub-optimal policies. A basic case is when the agent does not explore enough and the final state is hard to reach from the state you chose for initialization. The agent then ends up in a local minimum because it never leaves that 'local' region of the state space.
One question you might want to ask yourself is: why don't you initialize the state randomly? There are certainly cases where it makes more sense to have one main initial state, but if your algorithm learns better from other starting points, it may be worth initializing each episode with a different state and letting the agent generalize over the state space. Another suggestion is to check your exploration strategy and verify that it is having enough impact.
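The two suggestions above (random per-episode starts and a decaying exploration schedule) can be sketched on a toy problem. The environment below is a hypothetical 1-D chain, not the asker's actual environment, and tabular Q-learning stands in for the DQN so the example stays self-contained; the point is only the `reset(start=None)` pattern, where passing no fixed state gives exploring starts.

```python
import random

class ChainEnv:
    """Hypothetical 1-D chain: states 0..n-1, reward 1 at the right end."""
    def __init__(self, n_states=21):
        self.n = n_states
        self.state = 0

    def reset(self, start=None):
        # Exploring starts: pick a random non-terminal state when no
        # fixed start is supplied; otherwise use the given one.
        self.state = random.randrange(self.n - 1) if start is None else start
        return self.state

    def step(self, action):  # action: 1 = right, 0 = left
        self.state = max(0, min(self.n - 1, self.state + (1 if action else -1)))
        done = self.state == self.n - 1
        return self.state, (1.0 if done else 0.0), done

def train(env, episodes=500, eps=1.0, eps_decay=0.995, eps_min=0.05,
          alpha=0.5, gamma=0.99, fixed_start=None):
    Q = [[0.0, 0.0] for _ in range(env.n)]
    for _ in range(episodes):
        s = env.reset(fixed_start)  # None -> a different start each episode
        done, steps = False, 0
        while not done and steps < 200:
            # Epsilon-greedy exploration with a per-episode decay schedule.
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda act: Q[s][act])
            s2, r, done = env.step(a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s, steps = s2, steps + 1
        eps = max(eps_min, eps * eps_decay)
    return Q

random.seed(0)
env = ChainEnv()
Q = train(env)  # random starts: every region of the chain gets visited
greedy = [max((0, 1), key=lambda act: Q[s][act]) for s in range(env.n - 1)]
```

Comparing a run with `fixed_start=0` against the random-start run above is a quick way to see the effect the answer describes: with a single distant start and too little exploration, the value estimates near the goal stay poor for much longer.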
answered Nov 22 '18 at 14:06
Filip O.
665