mutate_at function cancels the previous mutate_at

I have created a dataframe S by merging two dataframes innov2015 and innov2017 by a unique identifying column.
Some cases in innov2015 are not included in innov2017 and vice versa, so there are NA entries for half of the variables in S for some of the cases.

I want to calculate p = (p_2015 + p_2017)/2. However, when there is an NA entry for p_2015 I want p = p_2017, and vice versa.

I have tried to do this with:

    S <- merge(x = innov_2015_2, y = innov_2017_2, by = "cell_no", all = TRUE) %>%
      mutate(p = 0) %>%
      mutate_at(vars(p), funs(ifelse(is.na(smalln_2015), p_2017, (p_2015 + p_2017) / 2))) %>%
      mutate_at(vars(p), funs(ifelse(is.na(smalln_2017), p_2015, (p_2015 + p_2017) / 2)))

If I run

    S <- merge(x = innov_2015_2, y = innov_2017_2, by = "cell_no", all = TRUE) %>%
      mutate(p = 0) %>%
      mutate_at(vars(p), funs(ifelse(is.na(smalln_2015), p_2017, (p_2015 + p_2017) / 2)))

p takes the desired value.

When I run both mutate_at() statements

    S <- merge(x = innov_2015_2, y = innov_2017_2, by = "cell_no", all = TRUE) %>%
      mutate(p = 0) %>%
      mutate_at(vars(p), funs(ifelse(is.na(smalln_2015), p_2017, (p_2015 + p_2017) / 2))) %>%
      mutate_at(vars(p), funs(ifelse(is.na(smalln_2017), p_2015, (p_2015 + p_2017) / 2)))

the second mutate_at() statement produces the desired values, but it undoes the first mutate_at() statement: where p had taken the correct value, there is now NA.

What do I need to do to make both mutate_at() statements work without cancelling the previous one?

asked Jan 2 at 12:06

Laura

Why don't you just take the mean of both variables? If one is missing, it takes the mean of 1 value, which is exactly what you want in case not both values are available? Do not forget to specify the na.rm argument in the mean() function to TRUE. S %>% mutate(p = mean(c(p_2015,p_2017), na.rm = T))

– Lennyy
Jan 2 at 12:09

Thanks for the workaround. Do you have an explanation for why my code doesn't work? EDIT: the code you have provided sets p to the mean of the entire columns p_2015 and p_2017 rather than the mean of the values in the corresponding row.

– Laura
Jan 2 at 12:15

Sorry I meant the rowMeans() function above, but can't edit that comment anymore. Anyway, I think your attempt is more of a workaround than my suggestion. Something like: S$p <- rowMeans(S[,c("p_2015", "p_2017")], na.rm = T) with just base R would work. The answer below beat me to it with regard to explaining why your attempt did not have the desired result.

– Lennyy
Jan 2 at 12:18

Thanks that works

– Laura
Jan 2 at 12:23
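
The difference between the two suggestions in the comments above can be seen on a small example. This is only a sketch: the data frame below is made-up toy data standing in for the merged S, and the name `pooled` is introduced purely for illustration.

```r
# Toy data (hypothetical values) standing in for the merged data frame S
S <- data.frame(
  cell_no = 1:3,
  p_2015  = c(0.2, NA, 0.6),
  p_2017  = c(0.4, 0.8, NA)
)

# mean() over c(col1, col2) pools every value of both columns into a
# single number; inside mutate() that number is recycled to every row
pooled <- mean(c(S$p_2015, S$p_2017), na.rm = TRUE)

# rowMeans() averages within each row and skips NAs when na.rm = TRUE
S$p <- rowMeans(S[, c("p_2015", "p_2017")], na.rm = TRUE)

print(pooled)
print(S)
```

One thing to keep in mind: for a row where both values are NA, rowMeans() with na.rm = TRUE returns NaN rather than NA.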

r condition mutate

1 Answer
These two mutates conflict. You are fully re-defining "p" in each of them, since the value of "p" from the first call is never re-used in the second. @Lennyy's comment will get the job done, but if you want to keep this operation within the tidyverse, you might have better luck using case_when. Your example is not fully reproducible, so the following is a guess as to how it should work:

    S <- merge(x = innov_2015_2, y = innov_2017_2, by = "cell_no", all = TRUE) %>%
      mutate(p = case_when(
        is.na(smalln_2015) ~ smalln_2017,
        is.na(smalln_2017) ~ smalln_2015,
        TRUE ~ (smalln_2015 + smalln_2017) / 2
      ))
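
To make the overwrite visible, here is a small reproducible sketch. It assumes toy data with made-up values, tests for NA directly on p_2015 and p_2017 (the columns the question describes, rather than smalln_*), and uses plain mutate() with ifelse() to mirror the logic of the two mutate_at() calls. The names `overwritten` and `fixed` are hypothetical.

```r
library(dplyr)

# Toy data (hypothetical values); the real S comes from the merge in the question
S <- data.frame(
  cell_no = 1:4,
  p_2015  = c(0.2, NA, 0.6, NA),
  p_2017  = c(0.4, 0.8, NA, NA)
)

# Two successive assignments to p: the second never reads the result of
# the first, so the row repaired by the first step (row 2) becomes NA again
overwritten <- S %>%
  mutate(p = ifelse(is.na(p_2015), p_2017, (p_2015 + p_2017) / 2)) %>%
  mutate(p = ifelse(is.na(p_2017), p_2015, (p_2015 + p_2017) / 2))

# A single case_when() covers all cases in one pass, so nothing is undone
fixed <- S %>%
  mutate(p = case_when(
    is.na(p_2015) ~ p_2017,
    is.na(p_2017) ~ p_2015,
    TRUE          ~ (p_2015 + p_2017) / 2
  ))

print(overwritten)
print(fixed)
```

In `overwritten`, row 2 ends up NA because the second step recomputes (p_2015 + p_2017) / 2 from scratch for every row where p_2017 is present. In `fixed`, p stays NA only in row 4, where both inputs are missing.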

edited Jan 2 at 12:26

answered Jan 2 at 12:18

jdobres
