Replace all NA values for variable with one row equal to 0












16















Slightly difficult to phrase, as far as I saw none of the similar questions answered my problem.



I have a data.frame such as:



df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

df1

id val
1 a NA
2 a NA
3 a NA
4 a NA
5 b 1
6 b 2
7 b 2
8 b 3
9 c NA
10 c 2
11 c NA
12 c 3


and I want to get rid of all the NA values (easy enough using e.g. filter() ) but make sure that if this removes all of one id value (in this case it removes every instance of "a") that one extra row is inserted of (e.g.) a = 0



so that:



  id val
1 a 0
2 b 1
3 b 2
4 b 2
5 b 3
6 c 2
7 c 3


obviously easy enough to do this in a roundabout way but I was wondering if there's a tidy/elegant way to do this. I thought tidyr::complete() might help but not entirely sure how to apply it to a case like this



I don't care about the order of the rows



Cheers!



edit: updated with clearer desired output. might make desired answers submitted before that a bit less clear










share|improve this question

























  • So you want to add rows with 0 only if all the values for particular id is 0?

    – Ronak Shah
    Jan 3 at 12:47











  • only if they're all NA for a particular id

    – Robert Hickman
    Jan 3 at 12:49






  • 1





    @RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.

    – markus
    Jan 3 at 13:31
















16















Slightly difficult to phrase, as far as I saw none of the similar questions answered my problem.



I have a data.frame such as:



df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

df1

id val
1 a NA
2 a NA
3 a NA
4 a NA
5 b 1
6 b 2
7 b 2
8 b 3
9 c NA
10 c 2
11 c NA
12 c 3


and I want to get rid of all the NA values (easy enough using e.g. filter() ) but make sure that if this removes all of one id value (in this case it removes every instance of "a") that one extra row is inserted of (e.g.) a = 0



so that:



  id val
1 a 0
2 b 1
3 b 2
4 b 2
5 b 3
6 c 2
7 c 3


obviously easy enough to do this in a roundabout way but I was wondering if there's a tidy/elegant way to do this. I thought tidyr::complete() might help but not entirely sure how to apply it to a case like this



I don't care about the order of the rows



Cheers!



edit: updated with clearer desired output. might make desired answers submitted before that a bit less clear










share|improve this question

























  • So you want to add rows with 0 only if all the values for particular id is 0?

    – Ronak Shah
    Jan 3 at 12:47











  • only if they're all NA for a particular id

    – Robert Hickman
    Jan 3 at 12:49






  • 1





    @RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.

    – markus
    Jan 3 at 13:31














16












16








16


0






Slightly difficult to phrase, as far as I saw none of the similar questions answered my problem.



I have a data.frame such as:



df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

df1

id val
1 a NA
2 a NA
3 a NA
4 a NA
5 b 1
6 b 2
7 b 2
8 b 3
9 c NA
10 c 2
11 c NA
12 c 3


and I want to get rid of all the NA values (easy enough using e.g. filter() ) but make sure that if this removes all of one id value (in this case it removes every instance of "a") that one extra row is inserted of (e.g.) a = 0



so that:



  id val
1 a 0
2 b 1
3 b 2
4 b 2
5 b 3
6 c 2
7 c 3


obviously easy enough to do this in a roundabout way but I was wondering if there's a tidy/elegant way to do this. I thought tidyr::complete() might help but not entirely sure how to apply it to a case like this



I don't care about the order of the rows



Cheers!



edit: updated with clearer desired output. might make desired answers submitted before that a bit less clear










share|improve this question
















Slightly difficult to phrase, as far as I saw none of the similar questions answered my problem.



I have a data.frame such as:



df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))

df1

id val
1 a NA
2 a NA
3 a NA
4 a NA
5 b 1
6 b 2
7 b 2
8 b 3
9 c NA
10 c 2
11 c NA
12 c 3


and I want to get rid of all the NA values (easy enough using e.g. filter() ) but make sure that if this removes all of one id value (in this case it removes every instance of "a") that one extra row is inserted of (e.g.) a = 0



so that:



  id val
1 a 0
2 b 1
3 b 2
4 b 2
5 b 3
6 c 2
7 c 3


obviously easy enough to do this in a roundabout way but I was wondering if there's a tidy/elegant way to do this. I thought tidyr::complete() might help but not entirely sure how to apply it to a case like this



I don't care about the order of the rows



Cheers!



edit: updated with clearer desired output. might make desired answers submitted before that a bit less clear







r na complete






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jan 6 at 14:56







Robert Hickman

















asked Jan 3 at 12:43









Robert HickmanRobert Hickman

187110




187110













  • So you want to add rows with 0 only if all the values for particular id is 0?

    – Ronak Shah
    Jan 3 at 12:47











  • only if they're all NA for a particular id

    – Robert Hickman
    Jan 3 at 12:49






  • 1





    @RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.

    – markus
    Jan 3 at 13:31



















  • So you want to add rows with 0 only if all the values for particular id is 0?

    – Ronak Shah
    Jan 3 at 12:47











  • only if they're all NA for a particular id

    – Robert Hickman
    Jan 3 at 12:49






  • 1





    @RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.

    – markus
    Jan 3 at 13:31

















So you want to add rows with 0 only if all the values for particular id is 0?

– Ronak Shah
Jan 3 at 12:47





So you want to add rows with 0 only if all the values for particular id is 0?

– Ronak Shah
Jan 3 at 12:47













only if they're all NA for a particular id

– Robert Hickman
Jan 3 at 12:49





only if they're all NA for a particular id

– Robert Hickman
Jan 3 at 12:49




1




1





@RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.

– markus
Jan 3 at 13:31





@RobertHickman There seems to be some confusion about your desired output. Could you update your question with the expected output based on this df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) ? Thanks to @VivekKalyanarangan for the data.

– markus
Jan 3 at 13:31












9 Answers
9






active

oldest

votes


















9














Another idea using dplyr,



library(dplyr)

df1 %>%
group_by(id) %>%
mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>%
na.omit()


which gives,




# A tibble: 5 x 2
# Groups: id [2]
id val
<fct> <dbl>
1 a 0
2 b 1
3 b 2
4 b 2
5 b 3






share|improve this answer



















  • 2





    (+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).

    – Mikko Marttila
    Jan 3 at 14:30











  • @MikkoMarttila Good suggestion. I usually try and avoid ifelse in general

    – Sotos
    Jan 3 at 14:35





















3














We may do



df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))
# A tibble: 5 x 2
# Groups: id [2]
# id val
# <fct> <dbl>
# 1 a 0
# 2 b 1
# 3 b 2
# 4 b 2
# 5 b 3


After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.



In a more readable format that would be



df1 %>% group_by(id) %>% 
do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))


(Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)






share|improve this answer





















  • 1





    @markus, right, I had assumed that that's the goal. Thanks!

    – Julius Vainora
    Jan 3 at 13:19













  • It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky

    – Vivek Kalyanarangan
    Jan 3 at 13:27






  • 1





    @VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.

    – Julius Vainora
    Jan 3 at 13:29



















2














df1[is.na(df1)] <- 0
df1[!(duplicated(df1$id) & df1$val == 0), ]

id val
1 a 0
5 b 1
6 b 2
7 b 2
8 b 3





share|improve this answer



















  • 5





    Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

    – markus
    Jan 3 at 13:05













  • I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]

    – Robert Hickman
    Jan 3 at 13:26





















1














Base R option is to find groups with all NAs and transform them by changing their val to 0 and select only unique rows so that there is only one row per group. We rbind this dataframe with the groups which are !all_NA.



all_NA <- with(df1, ave(is.na(val), id, FUN = all))
rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

# id val
#1 a 0
#5 b 1
#6 b 2
#7 b 2
#8 b 3




dplyr option looks ugly but one way is to make two groups of dataframes one with groups of all NA values and other with groups of all non-NA values. For groups with all NA values we add row with it's id and val as 0 and bind this to the other group.



library(dplyr)

bind_rows(df1 %>%
group_by(id) %>%
filter(all(!is.na(val))),
df1 %>%
group_by(id) %>%
filter(all(is.na(val))) %>%
ungroup() %>%
summarise(id = unique(id),
val = 0)) %>%
arrange(id)


# id val
# <fct> <dbl>
#1 a 0
#2 b 1
#3 b 2
#4 b 2
#5 b 3





share|improve this answer

































    1














    Changed the df to make example more exhaustive -



    df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
    val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
    library(dplyr)
    df1 %>%
    group_by(id) %>%
    mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%
    mutate(val=ifelse(is.na(val)&case,0,val)) %>%
    filter( !(case&row_num!=1) ) %>%
    select(id, val)


    Output



      id      val
    <fct> <dbl>
    1 a 0
    2 b 1
    3 b 2
    4 b 2
    5 b 3
    6 c NA
    7 c 2
    8 c NA
    9 c 3





    share|improve this answer































      1














      Another base approach, one that doesn't maintain the order of the rows and takes advantage of factors remembering lost values:



      df1 <- na.omit(df1)

      df1 <- rbind(
      df1,
      data.frame(
      id = levels(df1$id)[!levels(df1$id) %in% df1$id],
      val = 0)
      )


      I do personally prefer the dplyr approach given by Sotos, as I don't like rbind-ing data.frames back together so it's a matter of taste, but this isn't unbearably complicated by my eye. It's easy enough to adapt to a character id column with a unique(df1$id) variable.






      share|improve this answer































        1














        Here is an option too:



        df1 %>% 
        mutate_if(is.factor,as.character) %>%
        mutate_all(funs(replace(.,is.na(.),0))) %>%
        slice(4:nrow(.))


        This gives:



         id val
        1 a 0
        2 b 1
        3 b 2
        4 b 2
        5 b 3


        Alternative:



        df1 %>% 
        mutate_if(is.factor,as.character) %>%
        mutate_all(funs(replace(.,is.na(.),0))) %>%
        unique()


        UPDATE based on other requirements:
        Some users suggested to test on this dataframe. Of course this answer assumes you'll look at everything by hand. Might be less useful if you have to look at everything by "hand" but here goes:



        df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))


        df1 %>%
        mutate_if(is.factor,as.character) %>%
        mutate(val=ifelse(id=="a",0,val)) %>%
        slice(4:nrow(.))


        This yields:



         id val
        1 a 0
        2 b 1
        3 b 2
        4 b 2
        5 b 3
        6 c NA
        7 c 2
        8 c NA
        9 c 3





        share|improve this answer





















        • 3





          where did 4 come from?

          – Sotos
          Jan 3 at 13:22











        • The solution produces four 0s. We're only interested in having 1?

          – NelsonGon
          Jan 3 at 13:23











        • What if one group has 4 and another 3?

          – Sotos
          Jan 3 at 13:26











        • Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!

          – NelsonGon
          Jan 3 at 13:27











        • Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest

          – Vivek Kalyanarangan
          Jan 3 at 13:28



















        0














        Here is a base R solution.



        res <- lapply(split(df1, df1$id), function(DF){
        if(anyNA(DF$val)) {
        i <- is.na(DF$val)
        DF$val[i] <- 0
        DF <- rbind(DF[i & !duplicated(DF[i, ]), ], DF[!i, ])
        }
        DF
        })
        res <- do.call(rbind, res)
        row.names(res) <- NULL
        res
        # id val
        #1 a 0
        #2 b 1
        #3 b 2
        #4 b 2
        #5 b 3


        Edit.



        A dplyr solution could be the following.
        It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.



        library(dplyr)

        na2zero <- function(DF){
        DF %>%
        group_by(id) %>%
        mutate(val = ifelse(is.na(val), 0, val),
        crit = val == 0 & duplicated(val)) %>%
        filter(!crit) %>%
        select(-crit)
        }

        na2zero(df1)
        na2zero(df2)
        na2zero(df3)





        share|improve this answer


























        • Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.

          – markus
          Jan 3 at 13:21











        • @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.

          – Rui Barradas
          Jan 3 at 14:05











        • Fair enough. People are reading the question differently.

          – markus
          Jan 3 at 14:51



















        0














        One may try this :



        df1 = data.frame(id = rep(c("a", "b","c"), each = 4),
        val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
        df1
        # id val
        #1 a NA
        #2 a NA
        #3 a NA
        #4 a NA
        #5 b 1
        #6 b 2
        #7 b 2
        #8 b 3
        #9 c NA
        #10 c 2
        #11 c NA
        #12 c 3


        Task is to remove all rows corresponding to any id IFF val for the corresponding id is all NAs and add new row with this id and val = 0.

        In this example, id = a.



        Note : val for c also has NAs but all the val corresponding to c are not NA therefore we need to remove the corresponding row for c where val = NA.



        So lets create another column say, val2 which indicates 0 means its all NAs and 1 otherwise.



        library(dplyr)

        df1 = df1 %>%
        group_by(id) %>%
        mutate(val2 = if_else(condition = all(is.na(val)),true = 0, false = 1))
        df1

        # A tibble: 12 x 3
        # Groups: id [3]
        # id val val2
        # <fct> <dbl> <dbl>
        #1 a NA 0
        #2 a NA 0
        #3 a NA 0
        #4 a NA 0
        #5 b 1 1
        #6 b 2 1
        #7 b 2 1
        #8 b 3 1
        #9 c NA 1
        #10 c 2 1
        #11 c NA 1
        #12 c 3 1


        Get the list of ids with corresponding val = NA for all.



        all_na = unique(df1$id[df1$val2 == 0])


        Then remove theids from the dataframe df1 with val = NA.



        df1 = na.omit(df1)
        df1
        # A tibble: 6 x 3
        # Groups: id [2]
        # id val val2
        # <fct> <dbl> <dbl>
        # 1 b 1 1
        # 2 b 2 1
        # 3 b 2 1
        # 4 b 3 1
        # 5 c 2 1
        # 6 c 3 1


        And create a new dataframe with ids in all_na and val = 0



        all_na_df = data.frame(id = all_na, val = 0) 
        all_na_df
        # id val
        # 1 a 0


        then combine these two dataframes.



        df1 = bind_rows(all_na_df, df1[,c('id', 'val')])
        df1

        # id val
        # 1 a 0
        # 2 b 1
        # 3 b 2
        # 4 b 2
        # 5 b 3
        # 6 c 2
        # 7 c 3


        Hope this helps and Edits are most welcomed :-)






        share|improve this answer

























          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54022536%2freplace-all-na-values-for-variable-with-one-row-equal-to-0%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          9 Answers
          9






          active

          oldest

          votes








          9 Answers
          9






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          9














          Another idea using dplyr,



          library(dplyr)

          df1 %>%
          group_by(id) %>%
          mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>%
          na.omit()


          which gives,




          # A tibble: 5 x 2
          # Groups: id [2]
          id val
          <fct> <dbl>
          1 a 0
          2 b 1
          3 b 2
          4 b 2
          5 b 3






          share|improve this answer



















          • 2





            (+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).

            – Mikko Marttila
            Jan 3 at 14:30











          • @MikkoMarttila Good suggestion. I usually try and avoid ifelse in general

            – Sotos
            Jan 3 at 14:35


















          9














          Another idea using dplyr,



          library(dplyr)

          df1 %>%
          group_by(id) %>%
          mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>%
          na.omit()


          which gives,




          # A tibble: 5 x 2
          # Groups: id [2]
          id val
          <fct> <dbl>
          1 a 0
          2 b 1
          3 b 2
          4 b 2
          5 b 3






          share|improve this answer



















          • 2





            (+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).

            – Mikko Marttila
            Jan 3 at 14:30











          • @MikkoMarttila Good suggestion. I usually try and avoid ifelse in general

            – Sotos
            Jan 3 at 14:35
















          9












          9








          9







          Another idea using dplyr,



          library(dplyr)

          df1 %>%
          group_by(id) %>%
          mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>%
          na.omit()


          which gives,




          # A tibble: 5 x 2
          # Groups: id [2]
          id val
          <fct> <dbl>
          1 a 0
          2 b 1
          3 b 2
          4 b 2
          5 b 3






          share|improve this answer













          Another idea using dplyr,



          library(dplyr)

          df1 %>%
          group_by(id) %>%
          mutate(val = ifelse(row_number() == 1 & all(is.na(val)), 0, val)) %>%
          na.omit()


          which gives,




          # A tibble: 5 x 2
          # Groups: id [2]
          id val
          <fct> <dbl>
          1 a 0
          2 b 1
          3 b 2
          4 b 2
          5 b 3







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Jan 3 at 13:34









          SotosSotos

          28.9k51640




          28.9k51640








          • 2





            (+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).

            – Mikko Marttila
            Jan 3 at 14:30











          • @MikkoMarttila Good suggestion. I usually try and avoid ifelse in general

            – Sotos
            Jan 3 at 14:35
















          • 2





            (+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).

            – Mikko Marttila
            Jan 3 at 14:30











          • @MikkoMarttila Good suggestion. I usually try and avoid ifelse in general

            – Sotos
            Jan 3 at 14:35










          2




          2





          (+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).

          – Mikko Marttila
          Jan 3 at 14:30





          (+1) Seems like the most robust answer here. Would be marginally more concise using replace(val, all(is.na(val)) * 1, 0) instead of the ifelse(...).

          – Mikko Marttila
          Jan 3 at 14:30













          @MikkoMarttila Good suggestion. I usually try and avoid ifelse in general

          – Sotos
          Jan 3 at 14:35







          @MikkoMarttila Good suggestion. I usually try and avoid ifelse in general

          – Sotos
          Jan 3 at 14:35















          3














          We may do



          df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))
          # A tibble: 5 x 2
          # Groups: id [2]
          # id val
          # <fct> <dbl>
          # 1 a 0
          # 2 b 1
          # 3 b 2
          # 4 b 2
          # 5 b 3


          After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.



          In a more readable format that would be



          df1 %>% group_by(id) %>% 
          do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))


          (Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)






          share|improve this answer





















          • 1





            @markus, right, I had assumed that that's the goal. Thanks!

            – Julius Vainora
            Jan 3 at 13:19













          • It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky

            – Vivek Kalyanarangan
            Jan 3 at 13:27






          • 1





            @VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.

            – Julius Vainora
            Jan 3 at 13:29
















          3














          We may do



          df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))
          # A tibble: 5 x 2
          # Groups: id [2]
          # id val
          # <fct> <dbl>
          # 1 a 0
          # 2 b 1
          # 3 b 2
          # 4 b 2
          # 5 b 3


          After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.



          In a more readable format that would be



          df1 %>% group_by(id) %>% 
          do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))


          (Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)






          share|improve this answer





















          • 1





            @markus, right, I had assumed that that's the goal. Thanks!

            – Julius Vainora
            Jan 3 at 13:19













          • It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky

            – Vivek Kalyanarangan
            Jan 3 at 13:27






          • 1





            @VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.

            – Julius Vainora
            Jan 3 at 13:29














          3












          3








          3







          We may do



          df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))
          # A tibble: 5 x 2
          # Groups: id [2]
          # id val
          # <fct> <dbl>
          # 1 a 0
          # 2 b 1
          # 3 b 2
          # 4 b 2
          # 5 b 3


          After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.



          In a more readable format that would be



          df1 %>% group_by(id) %>% 
          do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))


          (Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)






          share|improve this answer















          We may do



          df1 %>% group_by(id) %>% do(if(all(is.na(.$val))) replace(.[1, ], 2, 0) else na.omit(.))
          # A tibble: 5 x 2
          # Groups: id [2]
          # id val
          # <fct> <dbl>
          # 1 a 0
          # 2 b 1
          # 3 b 2
          # 4 b 2
          # 5 b 3


          After grouping by id, if everything in val is NA, then we leave only the first row with the second element replaced by 0, otherwise the same data is returned after applying na.omit.



          In a more readable format that would be



          df1 %>% group_by(id) %>% 
          do(if(all(is.na(.$val))) data.frame(id = .$id[1], val = 0) else na.omit(.))


          (Here I presume that you indeed want to get rid of all NA values; otherwise there is no need for na.omit.)







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Jan 3 at 13:31

























          answered Jan 3 at 13:11









          Julius VainoraJulius Vainora

          34.6k76079




          34.6k76079








          • 1





            @markus, right, I had assumed that that's the goal. Thanks!

            – Julius Vainora
            Jan 3 at 13:19













          • It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky

            – Vivek Kalyanarangan
            Jan 3 at 13:27






          • 1





            @VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.

            – Julius Vainora
            Jan 3 at 13:29














          • 1





            @markus, right, I had assumed that that's the goal. Thanks!

            – Julius Vainora
            Jan 3 at 13:19













          • It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky

            – Vivek Kalyanarangan
            Jan 3 at 13:27






          • 1





            @VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.

            – Julius Vainora
            Jan 3 at 13:29








          1




          1





          @markus, right, I had assumed that that's the goal. Thanks!

          – Julius Vainora
          Jan 3 at 13:19







          @markus, right, I had assumed that that's the goal. Thanks!

          – Julius Vainora
          Jan 3 at 13:19















          It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky

          – Vivek Kalyanarangan
          Jan 3 at 13:27





          It looks like op wants to retain the first row and replace the val column of that row with 0 where all val is NA for a group. Check my ans pls. Agree with @markus, it does seem tricky

          – Vivek Kalyanarangan
          Jan 3 at 13:27




          1




          1





          @VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.

          – Julius Vainora
          Jan 3 at 13:29





          @VivekKalyanarangan, that's what I initially thought, but "and I want to get rid of all the NA values" suggests otherwise.

          – Julius Vainora
          Jan 3 at 13:29











          2














          df1[is.na(df1)] <- 0
          df1[!(duplicated(df1$id) & df1$val == 0), ]

          id val
          1 a 0
          5 b 1
          6 b 2
          7 b 2
          8 b 3





          share|improve this answer



















          • 5





            Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

            – markus
            Jan 3 at 13:05













          • I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]

            – Robert Hickman
            Jan 3 at 13:26


















          2














          df1[is.na(df1)] <- 0
          df1[!(duplicated(df1$id) & df1$val == 0), ]

          id val
          1 a 0
          5 b 1
          6 b 2
          7 b 2
          8 b 3





          share|improve this answer



















          • 5





            Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

            – markus
            Jan 3 at 13:05













          • I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]

            – Robert Hickman
            Jan 3 at 13:26
















          2












          2








          2







          df1[is.na(df1)] <- 0
          df1[!(duplicated(df1$id) & df1$val == 0), ]

          id val
          1 a 0
          5 b 1
          6 b 2
          7 b 2
          8 b 3





          share|improve this answer













          df1[is.na(df1)] <- 0
          df1[!(duplicated(df1$id) & df1$val == 0), ]

          id val
          1 a 0
          5 b 1
          6 b 2
          7 b 2
          8 b 3






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Jan 3 at 13:02









          AdammAdamm

          842517




          842517








          • 5





            Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

            – markus
            Jan 3 at 13:05













          • I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]

            – Robert Hickman
            Jan 3 at 13:26
















          • 5





            Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

            – markus
            Jan 3 at 13:05













          • I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]

            – Robert Hickman
            Jan 3 at 13:26










          5




          5





          Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

          – markus
          Jan 3 at 13:05







          Would this work for ids that contain NAs and non-NAs? Try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3))

          – markus
          Jan 3 at 13:05















          I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]

          – Robert Hickman
          Jan 3 at 13:26







          I think this is the best so far (I'll leave it open for another hour or so to see) would maybe change to df %>% replace(is.na(.), 0) %>% .[!(duplicated(.$id) & .$val == 0), ]

          – Robert Hickman
          Jan 3 at 13:26













          1














          Base R option is to find groups with all NAs and transform them by changing their val to 0 and select only unique rows so that there is only one row per group. We rbind this dataframe with the groups which are !all_NA.



          all_NA <- with(df1, ave(is.na(val), id, FUN = all))
          rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

          # id val
          #1 a 0
          #5 b 1
          #6 b 2
          #7 b 2
          #8 b 3




          dplyr option looks ugly but one way is to make two groups of dataframes one with groups of all NA values and other with groups of all non-NA values. For groups with all NA values we add row with it's id and val as 0 and bind this to the other group.



          library(dplyr)

          bind_rows(df1 %>%
          group_by(id) %>%
          filter(all(!is.na(val))),
          df1 %>%
          group_by(id) %>%
          filter(all(is.na(val))) %>%
          ungroup() %>%
          summarise(id = unique(id),
          val = 0)) %>%
          arrange(id)


          # id val
          # <fct> <dbl>
          #1 a 0
          #2 b 1
          #3 b 2
          #4 b 2
          #5 b 3





          share|improve this answer






























            1














            Base R option is to find groups with all NAs and transform them by changing their val to 0 and select only unique rows so that there is only one row per group. We rbind this dataframe with the groups which are !all_NA.



            all_NA <- with(df1, ave(is.na(val), id, FUN = all))
            rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

            # id val
            #1 a 0
            #5 b 1
            #6 b 2
            #7 b 2
            #8 b 3




            dplyr option looks ugly but one way is to make two groups of dataframes one with groups of all NA values and other with groups of all non-NA values. For groups with all NA values we add row with it's id and val as 0 and bind this to the other group.



            library(dplyr)

            bind_rows(df1 %>%
            group_by(id) %>%
            filter(all(!is.na(val))),
            df1 %>%
            group_by(id) %>%
            filter(all(is.na(val))) %>%
            ungroup() %>%
            summarise(id = unique(id),
            val = 0)) %>%
            arrange(id)


            # id val
            # <fct> <dbl>
            #1 a 0
            #2 b 1
            #3 b 2
            #4 b 2
            #5 b 3





            share|improve this answer




























              1












              1








              1







              Base R option is to find groups with all NAs and transform them by changing their val to 0 and select only unique rows so that there is only one row per group. We rbind this dataframe with the groups which are !all_NA.



              all_NA <- with(df1, ave(is.na(val), id, FUN = all))
              rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

              # id val
              #1 a 0
              #5 b 1
              #6 b 2
              #7 b 2
              #8 b 3




              dplyr option looks ugly but one way is to make two groups of dataframes one with groups of all NA values and other with groups of all non-NA values. For groups with all NA values we add row with it's id and val as 0 and bind this to the other group.



              library(dplyr)

              bind_rows(df1 %>%
              group_by(id) %>%
              filter(all(!is.na(val))),
              df1 %>%
              group_by(id) %>%
              filter(all(is.na(val))) %>%
              ungroup() %>%
              summarise(id = unique(id),
              val = 0)) %>%
              arrange(id)


              # id val
              # <fct> <dbl>
              #1 a 0
              #2 b 1
              #3 b 2
              #4 b 2
              #5 b 3





              share|improve this answer















              Base R option is to find groups with all NAs and transform them by changing their val to 0 and select only unique rows so that there is only one row per group. We rbind this dataframe with the groups which are !all_NA.



              all_NA <- with(df1, ave(is.na(val), id, FUN = all))
              rbind(unique(transform(df1[all_NA, ], val = 0)), df1[!all_NA, ])

              # id val
              #1 a 0
              #5 b 1
              #6 b 2
              #7 b 2
              #8 b 3




              dplyr option looks ugly but one way is to make two groups of dataframes one with groups of all NA values and other with groups of all non-NA values. For groups with all NA values we add row with it's id and val as 0 and bind this to the other group.



              library(dplyr)

              bind_rows(df1 %>%
              group_by(id) %>%
              filter(all(!is.na(val))),
              df1 %>%
              group_by(id) %>%
              filter(all(is.na(val))) %>%
              ungroup() %>%
              summarise(id = unique(id),
              val = 0)) %>%
              arrange(id)


              # id val
              # <fct> <dbl>
              #1 a 0
              #2 b 1
              #3 b 2
              #4 b 2
              #5 b 3






              share|improve this answer














              share|improve this answer



              share|improve this answer








              edited Jan 3 at 13:17

























              answered Jan 3 at 12:56









              Ronak ShahRonak Shah

              35k103856




              35k103856























                  1














                  Changed the df to make example more exhaustive -



                  df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
                  library(dplyr)
                  df1 %>%
                  group_by(id) %>%
                  mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%
                  mutate(val=ifelse(is.na(val)&case,0,val)) %>%
                  filter( !(case&row_num!=1) ) %>%
                  select(id, val)


                  Output



                    id      val
                  <fct> <dbl>
                  1 a 0
                  2 b 1
                  3 b 2
                  4 b 2
                  5 b 3
                  6 c NA
                  7 c 2
                  8 c NA
                  9 c 3





                  share|improve this answer




























                    1














                    Changed the df to make example more exhaustive -



                    df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
                    val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
                    library(dplyr)
                    df1 %>%
                    group_by(id) %>%
                    mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%
                    mutate(val=ifelse(is.na(val)&case,0,val)) %>%
                    filter( !(case&row_num!=1) ) %>%
                    select(id, val)


                    Output



                      id      val
                    <fct> <dbl>
                    1 a 0
                    2 b 1
                    3 b 2
                    4 b 2
                    5 b 3
                    6 c NA
                    7 c 2
                    8 c NA
                    9 c 3





                    share|improve this answer


























                      1












                      1








                      1







                      Changed the df to make example more exhaustive -



                      df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
                      val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
                      library(dplyr)
                      df1 %>%
                      group_by(id) %>%
                      mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%
                      mutate(val=ifelse(is.na(val)&case,0,val)) %>%
                      filter( !(case&row_num!=1) ) %>%
                      select(id, val)


                      Output



                        id      val
                      <fct> <dbl>
                      1 a 0
                      2 b 1
                      3 b 2
                      4 b 2
                      5 b 3
                      6 c NA
                      7 c 2
                      8 c NA
                      9 c 3





                      share|improve this answer













                      Changed the df to make example more exhaustive -



                      df1 <- data.frame(id = rep(c("a", "b","c"), each = 4),
                      val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
                      library(dplyr)
                      df1 %>%
                      group_by(id) %>%
                      mutate(case=sum(is.na(val))==n(), row_num=row_number() ) %>%
                      mutate(val=ifelse(is.na(val)&case,0,val)) %>%
                      filter( !(case&row_num!=1) ) %>%
                      select(id, val)


                      Output



                        id      val
                      <fct> <dbl>
                      1 a 0
                      2 b 1
                      3 b 2
                      4 b 2
                      5 b 3
                      6 c NA
                      7 c 2
                      8 c NA
                      9 c 3






                      share|improve this answer












                      share|improve this answer



                      share|improve this answer










                      answered Jan 3 at 13:25









                      Vivek KalyanaranganVivek Kalyanarangan

                      5,0111827




                      5,0111827























                          1














                          Another base approach, one that doesn't maintain the order of the rows and takes advantage of factors remembering lost values:



                          df1 <- na.omit(df1)

                          df1 <- rbind(
                          df1,
                          data.frame(
                          id = levels(df1$id)[!levels(df1$id) %in% df1$id],
                          val = 0)
                          )


                          I do personally prefer the dplyr approach given by Sotos, as I don't like rbind-ing data.frames back together so it's a matter of taste, but this isn't unbearably complicated by my eye. It's easy enough to adapt to a character id column with a unique(df1$id) variable.






                          share|improve this answer




























                            1














                            Another base approach, one that doesn't maintain the order of the rows and takes advantage of factors remembering lost values:



                            df1 <- na.omit(df1)

                            df1 <- rbind(
                            df1,
                            data.frame(
                            id = levels(df1$id)[!levels(df1$id) %in% df1$id],
                            val = 0)
                            )


                            I do personally prefer the dplyr approach given by Sotos, as I don't like rbind-ing data.frames back together so it's a matter of taste, but this isn't unbearably complicated by my eye. It's easy enough to adapt to a character id column with a unique(df1$id) variable.






                            share|improve this answer


























                              1












                              1








                              1







                              Another base approach, one that doesn't maintain the order of the rows and takes advantage of factors remembering lost values:



                              df1 <- na.omit(df1)

                              df1 <- rbind(
                              df1,
                              data.frame(
                              id = levels(df1$id)[!levels(df1$id) %in% df1$id],
                              val = 0)
                              )


                              I do personally prefer the dplyr approach given by Sotos, as I don't like rbind-ing data.frames back together so it's a matter of taste, but this isn't unbearably complicated by my eye. It's easy enough to adapt to a character id column with a unique(df1$id) variable.






                              share|improve this answer













                              Another base approach, one that doesn't maintain the order of the rows and takes advantage of factors remembering lost values:



                              df1 <- na.omit(df1)

                              df1 <- rbind(
                              df1,
                              data.frame(
                              id = levels(df1$id)[!levels(df1$id) %in% df1$id],
                              val = 0)
                              )


                              I do personally prefer the dplyr approach given by Sotos, as I don't like rbind-ing data.frames back together so it's a matter of taste, but this isn't unbearably complicated by my eye. It's easy enough to adapt to a character id column with a unique(df1$id) variable.







                              share|improve this answer












                              share|improve this answer



                              share|improve this answer










                              answered Jan 3 at 16:09









                              CriminallyVulgarCriminallyVulgar

                              363




                              363























                                  1














                                  Here is an option too:



                                  df1 %>% 
                                  mutate_if(is.factor,as.character) %>%
                                  mutate_all(funs(replace(.,is.na(.),0))) %>%
                                  slice(4:nrow(.))


                                  This gives:



                                   id val
                                  1 a 0
                                  2 b 1
                                  3 b 2
                                  4 b 2
                                  5 b 3


                                  Alternative:



                                  df1 %>% 
                                  mutate_if(is.factor,as.character) %>%
                                  mutate_all(funs(replace(.,is.na(.),0))) %>%
                                  unique()


                                  UPDATE based on other requirements:
                                  Some users suggested to test on this dataframe. Of course this answer assumes you'll look at everything by hand. Might be less useful if you have to look at everything by "hand" but here goes:



                                  df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))


                                  df1 %>%
                                  mutate_if(is.factor,as.character) %>%
                                  mutate(val=ifelse(id=="a",0,val)) %>%
                                  slice(4:nrow(.))


                                  This yields:



                                   id val
                                  1 a 0
                                  2 b 1
                                  3 b 2
                                  4 b 2
                                  5 b 3
                                  6 c NA
                                  7 c 2
                                  8 c NA
                                  9 c 3





                                  share|improve this answer





















                                  • 3





                                    where did 4 come from?

                                    – Sotos
                                    Jan 3 at 13:22











                                  • The solution produces four 0s. We're only interested in having 1?

                                    – NelsonGon
                                    Jan 3 at 13:23











                                  • What if one group has 4 and another 3?

                                    – Sotos
                                    Jan 3 at 13:26











                                  • Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!

                                    – NelsonGon
                                    Jan 3 at 13:27











                                  • Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest

                                    – Vivek Kalyanarangan
                                    Jan 3 at 13:28
















                                  1














                                  Here is an option too:



                                  df1 %>% 
                                  mutate_if(is.factor,as.character) %>%
                                  mutate_all(funs(replace(.,is.na(.),0))) %>%
                                  slice(4:nrow(.))


                                  This gives:



                                   id val
                                  1 a 0
                                  2 b 1
                                  3 b 2
                                  4 b 2
                                  5 b 3


                                  Alternative:



                                  df1 %>% 
                                  mutate_if(is.factor,as.character) %>%
                                  mutate_all(funs(replace(.,is.na(.),0))) %>%
                                  unique()


                                  UPDATE based on other requirements:
                                  Some users suggested to test on this dataframe. Of course this answer assumes you'll look at everything by hand. Might be less useful if you have to look at everything by "hand" but here goes:



                                  df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))


                                  df1 %>%
                                  mutate_if(is.factor,as.character) %>%
                                  mutate(val=ifelse(id=="a",0,val)) %>%
                                  slice(4:nrow(.))


                                  This yields:



                                   id val
                                  1 a 0
                                  2 b 1
                                  3 b 2
                                  4 b 2
                                  5 b 3
                                  6 c NA
                                  7 c 2
                                  8 c NA
                                  9 c 3





                                  share|improve this answer





















                                  • 3





                                    where did 4 come from?

                                    – Sotos
                                    Jan 3 at 13:22











                                  • The solution produces four 0s. We're only interested in having 1?

                                    – NelsonGon
                                    Jan 3 at 13:23











                                  • What if one group has 4 and another 3?

                                    – Sotos
                                    Jan 3 at 13:26











                                  • Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!

                                    – NelsonGon
                                    Jan 3 at 13:27











                                  • Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest

                                    – Vivek Kalyanarangan
                                    Jan 3 at 13:28














                                  1












                                  1








                                  1







                                  Here is an option too:



                                  df1 %>% 
                                  mutate_if(is.factor,as.character) %>%
                                  mutate_all(funs(replace(.,is.na(.),0))) %>%
                                  slice(4:nrow(.))


                                  This gives:



                                   id val
                                  1 a 0
                                  2 b 1
                                  3 b 2
                                  4 b 2
                                  5 b 3


                                  Alternative:



                                  df1 %>% 
                                  mutate_if(is.factor,as.character) %>%
                                  mutate_all(funs(replace(.,is.na(.),0))) %>%
                                  unique()


                                  UPDATE based on other requirements:
                                  Some users suggested to test on this dataframe. Of course this answer assumes you'll look at everything by hand. Might be less useful if you have to look at everything by "hand" but here goes:



                                  df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))


                                  df1 %>%
                                  mutate_if(is.factor,as.character) %>%
                                  mutate(val=ifelse(id=="a",0,val)) %>%
                                  slice(4:nrow(.))


                                  This yields:



                                   id val
                                  1 a 0
                                  2 b 1
                                  3 b 2
                                  4 b 2
                                  5 b 3
                                  6 c NA
                                  7 c 2
                                  8 c NA
                                  9 c 3





                                  share|improve this answer















                                  Here is an option too:



                                  df1 %>% 
                                  mutate_if(is.factor,as.character) %>%
                                  mutate_all(funs(replace(.,is.na(.),0))) %>%
                                  slice(4:nrow(.))


                                  This gives:



                                   id val
                                  1 a 0
                                  2 b 1
                                  3 b 2
                                  4 b 2
                                  5 b 3


                                  Alternative:



                                  df1 %>% 
                                  mutate_if(is.factor,as.character) %>%
                                  mutate_all(funs(replace(.,is.na(.),0))) %>%
                                  unique()


                                  UPDATE based on other requirements:
                                  Some users suggested to test on this dataframe. Of course this answer assumes you'll look at everything by hand. Might be less useful if you have to look at everything by "hand" but here goes:



                                  df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))


                                  df1 %>%
                                  mutate_if(is.factor,as.character) %>%
                                  mutate(val=ifelse(id=="a",0,val)) %>%
                                  slice(4:nrow(.))


                                  This yields:



                                   id val
                                  1 a 0
                                  2 b 1
                                  3 b 2
                                  4 b 2
                                  5 b 3
                                  6 c NA
                                  7 c 2
                                  8 c NA
                                  9 c 3






                                  share|improve this answer














                                  share|improve this answer



                                  share|improve this answer








                                  edited Jan 4 at 4:49

























                                  answered Jan 3 at 13:18









                                  NelsonGonNelsonGon

                                  1,240420




                                  1,240420








                                  • 3





                                    where did 4 come from?

                                    – Sotos
                                    Jan 3 at 13:22











                                  • The solution produces four 0s. We're only interested in having 1?

                                    – NelsonGon
                                    Jan 3 at 13:23











                                  • What if one group has 4 and another 3?

                                    – Sotos
                                    Jan 3 at 13:26











                                  • Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!

                                    – NelsonGon
                                    Jan 3 at 13:27











                                  • Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest

                                    – Vivek Kalyanarangan
                                    Jan 3 at 13:28














                                  • 3





                                    where did 4 come from?

                                    – Sotos
                                    Jan 3 at 13:22











                                  • The solution produces four 0s. We're only interested in having 1?

                                    – NelsonGon
                                    Jan 3 at 13:23











                                  • What if one group has 4 and another 3?

                                    – Sotos
                                    Jan 3 at 13:26











                                  • Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!

                                    – NelsonGon
                                    Jan 3 at 13:27











                                  • Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest

                                    – Vivek Kalyanarangan
                                    Jan 3 at 13:28








                                  3




                                  3





                                  where did 4 come from?

                                  – Sotos
                                  Jan 3 at 13:22





                                  where did 4 come from?

                                  – Sotos
                                  Jan 3 at 13:22













                                  The solution produces four 0s. We're only interested in having 1?

                                  – NelsonGon
                                  Jan 3 at 13:23





                                  The solution produces four 0s. We're only interested in having 1?

                                  – NelsonGon
                                  Jan 3 at 13:23













                                  What if one group has 4 and another 3?

                                  – Sotos
                                  Jan 3 at 13:26





                                  What if one group has 4 and another 3?

                                  – Sotos
                                  Jan 3 at 13:26













                                  Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!

                                  – NelsonGon
                                  Jan 3 at 13:27





                                  Sorry I only answered based on the question. Maybe then we could twist things up, not sure though!

                                  – NelsonGon
                                  Jan 3 at 13:27













                                  Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest

                                  – Vivek Kalyanarangan
                                  Jan 3 at 13:28





                                  Consider this example - df1 <- data.frame(id = rep(c("a", "b","c"), each = 4), val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3)) I think here OP wants to remove NA values for A group only, not the rest

                                  – Vivek Kalyanarangan
                                  Jan 3 at 13:28











                                  0














                                  Here is a base R solution.



                                  res <- lapply(split(df1, df1$id), function(DF){
                                  if(anyNA(DF$val)) {
                                  i <- is.na(DF$val)
                                  DF$val[i] <- 0
                                  DF <- rbind(DF[i & !duplicated(DF[i, ]), ], DF[!i, ])
                                  }
                                  DF
                                  })
                                  res <- do.call(rbind, res)
                                  row.names(res) <- NULL
                                  res
                                  # id val
                                  #1 a 0
                                  #2 b 1
                                  #3 b 2
                                  #4 b 2
                                  #5 b 3


                                  Edit.



                                  A dplyr solution could be the following.
                                  It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.



                                  library(dplyr)

                                  na2zero <- function(DF){
                                  DF %>%
                                  group_by(id) %>%
                                  mutate(val = ifelse(is.na(val), 0, val),
                                  crit = val == 0 & duplicated(val)) %>%
                                  filter(!crit) %>%
                                  select(-crit)
                                  }

                                  na2zero(df1)
                                  na2zero(df2)
                                  na2zero(df3)





                                  share|improve this answer


























                                  • Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.

                                    – markus
                                    Jan 3 at 13:21











                                  • @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.

                                    – Rui Barradas
                                    Jan 3 at 14:05











                                  • Fair enough. People are reading the question differently.

                                    – markus
                                    Jan 3 at 14:51
















                                  0














                                  Here is a base R solution.



                                  res <- lapply(split(df1, df1$id), function(DF){
                                  if(anyNA(DF$val)) {
                                  i <- is.na(DF$val)
                                  DF$val[i] <- 0
                                  DF <- rbind(DF[i & !duplicated(DF[i, ]), ], DF[!i, ])
                                  }
                                  DF
                                  })
                                  res <- do.call(rbind, res)
                                  row.names(res) <- NULL
                                  res
                                  # id val
                                  #1 a 0
                                  #2 b 1
                                  #3 b 2
                                  #4 b 2
                                  #5 b 3


                                  Edit.



                                  A dplyr solution could be the following.
                                  It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.



                                  library(dplyr)

                                  na2zero <- function(DF){
                                  DF %>%
                                  group_by(id) %>%
                                  mutate(val = ifelse(is.na(val), 0, val),
                                  crit = val == 0 & duplicated(val)) %>%
                                  filter(!crit) %>%
                                  select(-crit)
                                  }

                                  na2zero(df1)
                                  na2zero(df2)
                                  na2zero(df3)





                                  share|improve this answer


























                                  • Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.

                                    – markus
                                    Jan 3 at 13:21











                                  • @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.

                                    – Rui Barradas
                                    Jan 3 at 14:05











                                  • Fair enough. People are reading the question differently.

                                    – markus
                                    Jan 3 at 14:51














                                  0












                                  0








                                  0







                                  Here is a base R solution.



                                  res <- lapply(split(df1, df1$id), function(DF){
                                  if(anyNA(DF$val)) {
                                  i <- is.na(DF$val)
                                  DF$val[i] <- 0
                                  DF <- rbind(DF[i & !duplicated(DF[i, ]), ], DF[!i, ])
                                  }
                                  DF
                                  })
                                  res <- do.call(rbind, res)
                                  row.names(res) <- NULL
                                  res
                                  # id val
                                  #1 a 0
                                  #2 b 1
                                  #3 b 2
                                  #4 b 2
                                  #5 b 3


                                  Edit.



                                  A dplyr solution could be the following.
                                  It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.



                                  library(dplyr)

                                  na2zero <- function(DF){
                                  DF %>%
                                  group_by(id) %>%
                                  mutate(val = ifelse(is.na(val), 0, val),
                                  crit = val == 0 & duplicated(val)) %>%
                                  filter(!crit) %>%
                                  select(-crit)
                                  }

                                  na2zero(df1)
                                  na2zero(df2)
                                  na2zero(df3)





                                  share|improve this answer















                                  Here is a base R solution.



                                  res <- lapply(split(df1, df1$id), function(DF){
                                  if(anyNA(DF$val)) {
                                  i <- is.na(DF$val)
                                  DF$val[i] <- 0
                                  DF <- rbind(DF[i & !duplicated(DF[i, ]), ], DF[!i, ])
                                  }
                                  DF
                                  })
                                  res <- do.call(rbind, res)
                                  row.names(res) <- NULL
                                  res
                                  # id val
                                  #1 a 0
                                  #2 b 1
                                  #3 b 2
                                  #4 b 2
                                  #5 b 3


                                  Edit.



                                  A dplyr solution could be the following.
                                  It was tested with the original dataset posted by the OP, with the dataset in Vivek Kalyanarangan's answer and with the dataset in markus' comment, renamed df2 and df3, respectively.



                                  library(dplyr)

                                  na2zero <- function(DF){
                                  DF %>%
                                  group_by(id) %>%
                                  mutate(val = ifelse(is.na(val), 0, val),
                                  crit = val == 0 & duplicated(val)) %>%
                                  filter(!crit) %>%
                                  select(-crit)
                                  }

                                  na2zero(df1)
                                  na2zero(df2)
                                  na2zero(df3)






                                  share|improve this answer














                                  share|improve this answer



                                  share|improve this answer








                                  edited Jan 3 at 14:22

























                                  answered Jan 3 at 13:03









                                  Rui BarradasRui Barradas

                                  16.4k51730




                                  16.4k51730













                                  • Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.

                                    – markus
                                    Jan 3 at 13:21











                                  • @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.

                                    – Rui Barradas
                                    Jan 3 at 14:05











                                  • Fair enough. People are reading the question differently.

                                    – markus
                                    Jan 3 at 14:51



















                                  • Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.

                                    – markus
                                    Jan 3 at 13:21











                                  • @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.

                                    – Rui Barradas
                                    Jan 3 at 14:05











                                  • Fair enough. People are reading the question differently.

                                    – markus
                                    Jan 3 at 14:51

















                                  Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.

                                  – markus
                                  Jan 3 at 13:21





                                  Rui, try with df1 <- data.frame(id = rep(c("a", "b"), each = 2), val = c(NA, 1, 2, 3)). Unfortunately your solution doesn't return a data frame with only three rows.

                                  – markus
                                  Jan 3 at 13:21













                                  @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.

                                  – Rui Barradas
                                  Jan 3 at 14:05





                                  @markus No, it doesn't. The NA is replaced by a 0 and the other value of val is not NA so both must be in the output. At least that's how I'm understanding the OP's problem.

                                  – Rui Barradas
                                  Jan 3 at 14:05













                                  Fair enough. People are reading the question differently.

                                  – markus
                                  Jan 3 at 14:51





                                  Fair enough. People are reading the question differently.

                                  – markus
                                  Jan 3 at 14:51











                                  0














                                  One may try this :



                                  df1 = data.frame(id = rep(c("a", "b","c"), each = 4),
                                  val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
                                  df1
                                  # id val
                                  #1 a NA
                                  #2 a NA
                                  #3 a NA
                                  #4 a NA
                                  #5 b 1
                                  #6 b 2
                                  #7 b 2
                                  #8 b 3
                                  #9 c NA
                                  #10 c 2
                                  #11 c NA
                                  #12 c 3


                                  Task is to remove all rows corresponding to any id IFF val for the corresponding id is all NAs and add new row with this id and val = 0.

                                  In this example, id = a.



                                  Note : val for c also has NAs but all the val corresponding to c are not NA therefore we need to remove the corresponding row for c where val = NA.



                                  So lets create another column say, val2 which indicates 0 means its all NAs and 1 otherwise.



                                  library(dplyr)

                                  df1 = df1 %>%
                                  group_by(id) %>%
                                  mutate(val2 = if_else(condition = all(is.na(val)),true = 0, false = 1))
                                  df1

                                  # A tibble: 12 x 3
                                  # Groups: id [3]
                                  # id val val2
                                  # <fct> <dbl> <dbl>
                                  #1 a NA 0
                                  #2 a NA 0
                                  #3 a NA 0
                                  #4 a NA 0
                                  #5 b 1 1
                                  #6 b 2 1
                                  #7 b 2 1
                                  #8 b 3 1
                                  #9 c NA 1
                                  #10 c 2 1
                                  #11 c NA 1
                                  #12 c 3 1


                                  Get the list of ids with corresponding val = NA for all.



                                  all_na = unique(df1$id[df1$val2 == 0])


                                  Then remove theids from the dataframe df1 with val = NA.



                                  df1 = na.omit(df1)
                                  df1
                                  # A tibble: 6 x 3
                                  # Groups: id [2]
                                  # id val val2
                                  # <fct> <dbl> <dbl>
                                  # 1 b 1 1
                                  # 2 b 2 1
                                  # 3 b 2 1
                                  # 4 b 3 1
                                  # 5 c 2 1
                                  # 6 c 3 1


                                  And create a new dataframe with ids in all_na and val = 0



                                  all_na_df = data.frame(id = all_na, val = 0) 
                                  all_na_df
                                  # id val
                                  # 1 a 0


                                  then combine these two dataframes.



                                  df1 = bind_rows(all_na_df, df1[,c('id', 'val')])
                                  df1

                                  # id val
                                  # 1 a 0
                                  # 2 b 1
                                  # 3 b 2
                                  # 4 b 2
                                  # 5 b 3
                                  # 6 c 2
                                  # 7 c 3


                                  Hope this helps and Edits are most welcomed :-)






                                  share|improve this answer






























                                    0














                                    One may try this :



                                    df1 = data.frame(id = rep(c("a", "b","c"), each = 4),
                                    val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
                                    df1
                                    # id val
                                    #1 a NA
                                    #2 a NA
                                    #3 a NA
                                    #4 a NA
                                    #5 b 1
                                    #6 b 2
                                    #7 b 2
                                    #8 b 3
                                    #9 c NA
                                    #10 c 2
                                    #11 c NA
                                    #12 c 3


                                    Task is to remove all rows corresponding to any id IFF val for the corresponding id is all NAs and add new row with this id and val = 0.

                                    In this example, id = a.



                                    Note : val for c also has NAs but all the val corresponding to c are not NA therefore we need to remove the corresponding row for c where val = NA.



                                    So lets create another column say, val2 which indicates 0 means its all NAs and 1 otherwise.



                                    library(dplyr)

                                    df1 = df1 %>%
                                    group_by(id) %>%
                                    mutate(val2 = if_else(condition = all(is.na(val)),true = 0, false = 1))
                                    df1

                                    # A tibble: 12 x 3
                                    # Groups: id [3]
                                    # id val val2
                                    # <fct> <dbl> <dbl>
                                    #1 a NA 0
                                    #2 a NA 0
                                    #3 a NA 0
                                    #4 a NA 0
                                    #5 b 1 1
                                    #6 b 2 1
                                    #7 b 2 1
                                    #8 b 3 1
                                    #9 c NA 1
                                    #10 c 2 1
                                    #11 c NA 1
                                    #12 c 3 1


                                    Get the list of ids with corresponding val = NA for all.



                                    all_na = unique(df1$id[df1$val2 == 0])


                                    Then remove theids from the dataframe df1 with val = NA.



                                    df1 = na.omit(df1)
                                    df1
                                    # A tibble: 6 x 3
                                    # Groups: id [2]
                                    # id val val2
                                    # <fct> <dbl> <dbl>
                                    # 1 b 1 1
                                    # 2 b 2 1
                                    # 3 b 2 1
                                    # 4 b 3 1
                                    # 5 c 2 1
                                    # 6 c 3 1


                                    And create a new dataframe with ids in all_na and val = 0



                                    all_na_df = data.frame(id = all_na, val = 0) 
                                    all_na_df
                                    # id val
                                    # 1 a 0


                                    then combine these two dataframes.



                                    df1 = bind_rows(all_na_df, df1[,c('id', 'val')])
                                    df1

                                    # id val
                                    # 1 a 0
                                    # 2 b 1
                                    # 3 b 2
                                    # 4 b 2
                                    # 5 b 3
                                    # 6 c 2
                                    # 7 c 3


                                    Hope this helps and Edits are most welcomed :-)






                                    share|improve this answer




























                                      0












                                      0








                                      0







                                      One may try this :



                                      df1 = data.frame(id = rep(c("a", "b","c"), each = 4),
                                      val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
                                      df1
                                      # id val
                                      #1 a NA
                                      #2 a NA
                                      #3 a NA
                                      #4 a NA
                                      #5 b 1
                                      #6 b 2
                                      #7 b 2
                                      #8 b 3
                                      #9 c NA
                                      #10 c 2
                                      #11 c NA
                                      #12 c 3


                                      Task is to remove all rows corresponding to any id IFF val for the corresponding id is all NAs and add new row with this id and val = 0.

                                      In this example, id = a.



                                      Note : val for c also has NAs but all the val corresponding to c are not NA therefore we need to remove the corresponding row for c where val = NA.



                                      So lets create another column say, val2 which indicates 0 means its all NAs and 1 otherwise.



                                      library(dplyr)

                                      df1 = df1 %>%
                                      group_by(id) %>%
                                      mutate(val2 = if_else(condition = all(is.na(val)),true = 0, false = 1))
                                      df1

                                      # A tibble: 12 x 3
                                      # Groups: id [3]
                                      # id val val2
                                      # <fct> <dbl> <dbl>
                                      #1 a NA 0
                                      #2 a NA 0
                                      #3 a NA 0
                                      #4 a NA 0
                                      #5 b 1 1
                                      #6 b 2 1
                                      #7 b 2 1
                                      #8 b 3 1
                                      #9 c NA 1
                                      #10 c 2 1
                                      #11 c NA 1
                                      #12 c 3 1


                                      Get the list of ids with corresponding val = NA for all.



                                      all_na = unique(df1$id[df1$val2 == 0])


                                      Then remove theids from the dataframe df1 with val = NA.



                                      df1 = na.omit(df1)
                                      df1
                                      # A tibble: 6 x 3
                                      # Groups: id [2]
                                      # id val val2
                                      # <fct> <dbl> <dbl>
                                      # 1 b 1 1
                                      # 2 b 2 1
                                      # 3 b 2 1
                                      # 4 b 3 1
                                      # 5 c 2 1
                                      # 6 c 3 1


                                      And create a new dataframe with ids in all_na and val = 0



                                      all_na_df = data.frame(id = all_na, val = 0) 
                                      all_na_df
                                      # id val
                                      # 1 a 0


                                      then combine these two dataframes.



                                      df1 = bind_rows(all_na_df, df1[,c('id', 'val')])
                                      df1

                                      # id val
                                      # 1 a 0
                                      # 2 b 1
                                      # 3 b 2
                                      # 4 b 2
                                      # 5 b 3
                                      # 6 c 2
                                      # 7 c 3


                                      Hope this helps and Edits are most welcomed :-)






                                      share|improve this answer















                                      One may try this :



                                      df1 = data.frame(id = rep(c("a", "b","c"), each = 4),
                                      val = c(NA, NA, NA, NA, 1, 2, 2, 3,NA,2,NA,3))
                                      df1
                                      # id val
                                      #1 a NA
                                      #2 a NA
                                      #3 a NA
                                      #4 a NA
                                      #5 b 1
                                      #6 b 2
                                      #7 b 2
                                      #8 b 3
                                      #9 c NA
                                      #10 c 2
                                      #11 c NA
                                      #12 c 3


                                      Task is to remove all rows corresponding to any id IFF val for the corresponding id is all NAs and add new row with this id and val = 0.

                                      In this example, id = a.



                                      Note : val for c also has NAs but all the val corresponding to c are not NA therefore we need to remove the corresponding row for c where val = NA.



                                      So lets create another column say, val2 which indicates 0 means its all NAs and 1 otherwise.



                                      library(dplyr)

                                      df1 = df1 %>%
                                      group_by(id) %>%
                                      mutate(val2 = if_else(condition = all(is.na(val)),true = 0, false = 1))
                                      df1

                                      # A tibble: 12 x 3
                                      # Groups: id [3]
                                      # id val val2
                                      # <fct> <dbl> <dbl>
                                      #1 a NA 0
                                      #2 a NA 0
                                      #3 a NA 0
                                      #4 a NA 0
                                      #5 b 1 1
                                      #6 b 2 1
                                      #7 b 2 1
                                      #8 b 3 1
                                      #9 c NA 1
                                      #10 c 2 1
                                      #11 c NA 1
                                      #12 c 3 1


                                      Get the list of ids with corresponding val = NA for all.



                                      all_na = unique(df1$id[df1$val2 == 0])


                                      Then remove theids from the dataframe df1 with val = NA.



                                      df1 = na.omit(df1)
                                      df1
                                      # A tibble: 6 x 3
                                      # Groups: id [2]
                                      # id val val2
                                      # <fct> <dbl> <dbl>
                                      # 1 b 1 1
                                      # 2 b 2 1
                                      # 3 b 2 1
                                      # 4 b 3 1
                                      # 5 c 2 1
                                      # 6 c 3 1


                                      And create a new dataframe with ids in all_na and val = 0



                                      all_na_df = data.frame(id = all_na, val = 0) 
                                      all_na_df
                                      # id val
                                      # 1 a 0


                                      then combine these two dataframes.



                                      df1 = bind_rows(all_na_df, df1[,c('id', 'val')])
                                      df1

                                      # id val
                                      # 1 a 0
                                      # 2 b 1
                                      # 3 b 2
                                      # 4 b 2
                                      # 5 b 3
                                      # 6 c 2
                                      # 7 c 3


                                      Hope this helps and Edits are most welcomed :-)







                                      share|improve this answer














                                      share|improve this answer



                                      share|improve this answer








                                      edited Jan 8 at 12:37

























                                      answered Jan 8 at 10:48









                                      heisenbug47heisenbug47

                                      459




                                      459






























                                          draft saved

                                          draft discarded




















































                                          Thanks for contributing an answer to Stack Overflow!


                                          • Please be sure to answer the question. Provide details and share your research!

                                          But avoid



                                          • Asking for help, clarification, or responding to other answers.

                                          • Making statements based on opinion; back them up with references or personal experience.


                                          To learn more, see our tips on writing great answers.




                                          draft saved


                                          draft discarded














                                          StackExchange.ready(
                                          function () {
                                          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54022536%2freplace-all-na-values-for-variable-with-one-row-equal-to-0%23new-answer', 'question_page');
                                          }
                                          );

                                          Post as a guest















                                          Required, but never shown





















































                                          Required, but never shown














                                          Required, but never shown












                                          Required, but never shown







                                          Required, but never shown

































                                          Required, but never shown














                                          Required, but never shown












                                          Required, but never shown







                                          Required, but never shown







                                          Popular posts from this blog

                                          android studio warns about leanback feature tag usage required on manifest while using Unity exported app?

                                          SQL update select statement

                                          'app-layout' is not a known element: how to share Component with different Modules