mutate_at function cancels the previous mutate_at
I have created a dataframe S by merging two dataframes innov2015 and innov2017 by a unique identifying column.
Some cases in innov2015 are not included in innov2017 and vice versa, so there are NA entries for half of the variables in S for some of the cases.
I want to calculate p = (p_2015 + p_2017)/2. However, when there is an NA entry for p_2015 I want p = p_2017, and vice versa.
I have tried to do this with:
S <- merge(x = innov_2015_2, y = innov_2017_2, by = "cell_no", all = TRUE) %>%
  mutate(p = 0) %>%
  mutate_at(vars(p), funs(ifelse(is.na(smalln_2015), p_2017, (p_2015 + p_2017) / 2))) %>%
  mutate_at(vars(p), funs(ifelse(is.na(smalln_2017), p_2015, (p_2015 + p_2017) / 2)))
If I run only the first mutate_at() statement
S <- merge(x = innov_2015_2, y = innov_2017_2, by = "cell_no", all = TRUE) %>%
  mutate(p = 0) %>%
  mutate_at(vars(p), funs(ifelse(is.na(smalln_2015), p_2017, (p_2015 + p_2017) / 2)))
p takes the desired value.
When I run both mutate_at() statements
S <- merge(x = innov_2015_2, y = innov_2017_2, by = "cell_no", all = TRUE) %>%
  mutate(p = 0) %>%
  mutate_at(vars(p), funs(ifelse(is.na(smalln_2015), p_2017, (p_2015 + p_2017) / 2))) %>%
  mutate_at(vars(p), funs(ifelse(is.na(smalln_2017), p_2015, (p_2015 + p_2017) / 2)))
the second mutate_at() statement produces the desired values, but it undoes the first one: where p had taken the correct value, there is now NA.
What do I need to do to make both mutate_at() statements work without the second cancelling the first?
r condition mutate
Why don't you just take the mean of both variables? If one is missing, it takes the mean of one value, which is exactly what you want when only one value is available. Do not forget to set the na.rm argument of the mean() function to TRUE: S %>% mutate(p = mean(c(p_2015, p_2017), na.rm = T))
– Lennyy
Jan 2 at 12:09
Thanks for the workaround. Do you have an explanation for why my code doesn't work? EDIT: the code you provided sets p to the mean of the entire columns p_2015 and p_2017 rather than the mean of the values in the corresponding row.
– Laura
Jan 2 at 12:15
Sorry, I meant the rowMeans() function above, but I can't edit that comment anymore. Anyway, I think your attempt is more of a workaround than my suggestion. Something like: S$p <- rowMeans(S[, c("p_2015", "p_2017")], na.rm = T) with just base R would work. The answer below beat me to it with regard to explaining why your attempt did not have the desired result.
– Lennyy
Jan 2 at 12:18
Thanks, that works.
– Laura
Jan 2 at 12:23
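The difference between mean() and rowMeans() discussed in these comments can be seen on a small example. This is only a sketch: the data frame below is made up to stand in for the merged S, and only the column names p_2015 and p_2017 come from the question.

```r
# Toy data standing in for the merged data frame S (values are hypothetical).
S <- data.frame(
  cell_no = 1:3,
  p_2015  = c(0.2, NA, 0.6),
  p_2017  = c(0.4, 0.8, NA)
)

# mean() over the concatenated columns collapses everything to a single number,
# which would then be recycled into every row of p.
overall <- mean(c(S$p_2015, S$p_2017), na.rm = TRUE)

# rowMeans() averages within each row and skips NA entries,
# so a row with one missing value keeps the value that is present.
S$p <- rowMeans(S[, c("p_2015", "p_2017")], na.rm = TRUE)
```

One caveat: if both p_2015 and p_2017 are NA in the same row, rowMeans() with na.rm = TRUE returns NaN for that row rather than NA.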
asked Jan 2 at 12:06


Laura
1 Answer
These two mutate_at() calls conflict. Each one fully redefines "p", because the value of "p" from the first call is never reused in the second. @Lennyy's comment will get the job done, but if you want to keep this operation within the tidyverse, you might have better luck using case_when(). Your example is not fully reproducible, so the following is a guess as to how it should work:
S <- merge(x = innov_2015_2, y = innov_2017_2, by = "cell_no", all = TRUE) %>%
  mutate(p = case_when(
    is.na(p_2015) ~ p_2017,                # only the 2017 value is available
    is.na(p_2017) ~ p_2015,                # only the 2015 value is available
    TRUE ~ (p_2015 + p_2017) / 2           # both available: take the average
  ))
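To see why the second call appears to undo the first, it may help to evaluate the two ifelse() expressions on their own. The sketch below uses made-up vectors and assumes that the NA pattern of smalln_2015 and smalln_2017 matches that of p_2015 and p_2017 (which is what a merge with all = TRUE would typically produce), so it tests the p columns directly.

```r
# Toy columns standing in for the merged data (values are hypothetical).
p_2015 <- c(0.2, NA, 0.6)
p_2017 <- c(0.4, 0.8, NA)

# First expression: correct where p_2015 is NA, but NA where p_2017 is NA,
# because (p_2015 + NA) / 2 is NA.
step1 <- ifelse(is.na(p_2015), p_2017, (p_2015 + p_2017) / 2)

# Second expression: it never looks at step1, so the row that step1 handled
# correctly is recomputed from (NA + p_2017) / 2 and becomes NA again.
step2 <- ifelse(is.na(p_2017), p_2015, (p_2015 + p_2017) / 2)

# Handling both cases in a single expression avoids the overwrite.
p <- ifelse(is.na(p_2015), p_2017,
            ifelse(is.na(p_2017), p_2015, (p_2015 + p_2017) / 2))
```

Here step1 is NA in the third position and step2 is NA in the second, while p has a value in every position. The nested ifelse() is the base R counterpart of handling all the cases in one case_when().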
edited Jan 2 at 12:26
answered Jan 2 at 12:18


jdobres