recode/replace multiple values in a shared data column to a single value across data frames





I hope I haven't missed it, but I haven't been able to find a working solution to this problem.
I have a set of data frames with a shared column. These columns contain multiple and varying transcription errors, some of which are shared, others not, for multiple values.
I would like to replace/recode the transcription errors (bad_values) with the correct values (good_values) across all data frames.



I have tried nesting the map*() family of functions across lists of data frames, bad_values, and good_values to do this, among other things. Here is an example:



library(dplyr)
library(purrr)

df1 = data.frame(grp = c("a1","a.","a.",rep("b",7)), measure = rnorm(10))
df2 = data.frame(grp = c(rep("as", 3), "b2",rep("a",22)), measure = rnorm(26))
df3 = data.frame(grp = c(rep("b-",3),rep("bq",2),"a", rep("a.", 3)), measure = 1:9)

df_list = list(df1, df2, df3)
bad_values = list(c("a1","a.","as"), c("b2","b-","bq"))
good_values = list("a", "b")

# first attempt
dfs = map(df_list, function(x) {
  x %>% mutate(grp = plyr::mapvalues(grp, bad_values, rep(good_values,length(bad_values))))
})


I didn't necessarily expect this to work beyond a single good-bad value pair. However, I thought nesting another call to map*() within it might work:



# second attempt
dfs = map(df_list, function(x) {
  x %>% mutate(grp = map2(bad_values, good_values, function(x,y) {
    recode(grp, bad_values = good_values)
  }))
})


I have tried a number of other approaches, none of which have worked.



Ultimately, I would like to go from a set of data frames with errors, as here:



[[1]]
grp measure
1 a1 0.5582253
2 a. 0.3400904
3 a. -0.2200824
4 b -0.7287385
5 b -0.2128275
6 b 1.9030766

[[2]]
grp measure
1 as 1.6148772
2 as 0.1090853
3 as -1.3714180
4 b2 -0.1606979
5 a 1.1726395
6 a -0.3201150

[[3]]
grp measure
1 b- 1
2 b- 2
3 b- 3
4 bq 4
5 bq 5
6 a 6


To a list of 'fixed' data frames, as such:



[[1]]
grp measure
1 a -0.7671052
2 a 0.1781247
3 a -0.7565773
4 b -0.3606900
5 b 1.9264804
6 b 0.9506608

[[2]]
grp measure
1 a 1.45036125
2 a -2.16715639
3 a 0.80105611
4 b 0.24216723
5 a 1.33089426
6 a -0.08388404

[[3]]
grp measure
1 b 1
2 b 2
3 b 3
4 b 4
5 b 5
6 a 6


Any help would be very much appreciated










      r dplyr lapply purrr






      asked Jan 3 at 6:20









      Jim Junker
























          3 Answers














          Here is an option using tidyverse with recode_factor. When there are multiple elements to be changed, create a named list of key/value pairs (each bad value as a name, its good value as the element) and use recode_factor to match the old values and change them to the new levels.

          library(tidyverse)
          # names are the bad values, elements are the corresponding good values
          keyval <- setNames(rep(good_values, lengths(bad_values)), unlist(bad_values))
          out <- map(df_list, ~ .x %>%
                       mutate(grp = recode_factor(grp, !!! keyval)))


          -output



          out
          #[[1]]
          # grp measure
          #1 a -1.63295876
          #2 a 0.03859976
          #3 a -0.46541610
          #4 b -0.72356671
          #5 b -1.11552841
          #6 b 0.99352861
          #....

          #[[2]]
          # grp measure
          #1 a 1.26536789
          #2 a -0.48189740
          #3 a 0.23041056
          #4 b -1.01324689
          #5 a -1.41586086
          #6 a 0.59026463
          #....


          #[[3]]
          # grp measure
          #1 b 1
          #2 b 2
          #3 b 3
          #4 b 4
          #5 b 5
          #6 a 6
          #....


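          NOTE: recode_factor returns a factor. If grp is already a factor (the data.frame default for strings before R 4.0.0), the class of the column is unchanged; if grp is character, it is converted to a factor.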



          str(out)
          #List of 3
          # $ :'data.frame': 10 obs. of 2 variables:
          # ..$ grp : Factor w/ 2 levels "a","b": 1 1 1 2 2 2 2 2 2 2
          # ..$ measure: num [1:10] -1.633 0.0386 -0.4654 -0.7236 -1.1155 ...
          # $ :'data.frame': 26 obs. of 2 variables:
          # ..$ grp : Factor w/ 2 levels "a","b": 1 1 1 2 1 1 1 1 1 1 ...
          # ..$ measure: num [1:26] 1.265 -0.482 0.23 -1.013 -1.416 ...
          # $ :'data.frame': 9 obs. of 2 variables:
          # ..$ grp : Factor w/ 2 levels "a","b": 2 2 2 2 2 1 1 1 1
          # ..$ measure: int [1:9] 1 2 3 4 5 6 7 8 9
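If the column should stay character rather than become a factor, one possible variant is dplyr::recode on a character vector. This is only a sketch: the two small data frames are a reduced stand-in for the question's data, out_chr is a made-up name, and keyval is built here as a named character vector instead of a list.

```r
library(dplyr)

# Reduced stand-in for the question's data (assumption, not the original data)
df_list <- list(
  data.frame(grp = c("a1", "a.", "b"), measure = 1:3, stringsAsFactors = FALSE),
  data.frame(grp = c("as", "b2", "a"), measure = 4:6, stringsAsFactors = FALSE)
)
bad_values  <- list(c("a1", "a.", "as"), c("b2", "b-", "bq"))
good_values <- list("a", "b")

# Named character vector: names are bad values, elements are good values
keyval <- setNames(rep(unlist(good_values), lengths(bad_values)),
                   unlist(bad_values))

# recode() leaves values without a matching name unchanged and returns character
out_chr <- lapply(df_list, function(d) {
  d$grp <- recode(as.character(d$grp), !!!keyval)
  d
})

out_chr[[1]]$grp
# "a" "a" "b"
```

Values such as "a" and "b" that have no entry in keyval pass through untouched, so no separate default is needed.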




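          Once we have the key/value list, it can also be used with base R functions. Values that are already correct ("a", "b") are not names in keyval, so they have to be passed through unchanged.

          kv <- unlist(keyval)
          out1 <- lapply(df_list, transform, grp = ifelse(grp %in% names(kv), kv[as.character(grp)], as.character(grp)))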





            Any reason mapping a case_when statement wouldn't work?



            library(tidyverse)
            df_list %>%
              map(~ mutate_if(.x, is.factor, as.character)) %>% # convert factor to character
              map(~ mutate(.x, grp = case_when(grp %in% bad_values[[1]] ~ good_values[[1]],
                                               grp %in% bad_values[[2]] ~ good_values[[2]],
                                               TRUE ~ grp)))


            I could see it working for your reprex, but possibly not for the larger problem, since every group needs its own hard-coded condition.
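If the number of groups grows, the hard-coded conditions could be replaced by a lookup table. A minimal sketch in base R, assuming the same bad_values and good_values structure as in the question; the names lookup and recode_grp, and the small df_list, are made up for illustration.

```r
bad_values  <- list(c("a1", "a.", "as"), c("b2", "b-", "bq"))
good_values <- list("a", "b")

# One row per bad value, paired with its replacement
lookup <- data.frame(
  bad  = unlist(bad_values),
  good = rep(unlist(good_values), lengths(bad_values)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace known bad values, keep everything else as is
recode_grp <- function(g) {
  g <- as.character(g)
  idx <- match(g, lookup$bad)  # NA where g is not a known bad value
  ifelse(is.na(idx), g, lookup$good[idx])
}

# Reduced stand-in for the question's data (assumption)
df_list <- list(
  data.frame(grp = c("a1", "a.", "b"), measure = 1:3, stringsAsFactors = FALSE),
  data.frame(grp = c("as", "b2", "a"), measure = 4:6, stringsAsFactors = FALSE)
)

fixed <- lapply(df_list, function(d) {
  d$grp <- recode_grp(d$grp)
  d
})
```

Adding a new group then only means adding an entry to bad_values and good_values; the recoding code itself does not change.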






              A base R option if you have a lot of good_values and bad_values and it is not possible to check each one individually.



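              lapply(df_list, function(x) {
                # work on character values so new values are not invalid factor levels
                vec = as.character(x[['grp']])
                # <<- updates vec in the enclosing function, once per bad/good pair
                mapply(function(p, q) vec[vec %in% p] <<- q, bad_values, good_values)
                transform(x, grp = vec)
              })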


              #[[1]]
              # grp measure
              #1 a -0.648146527
              #2 a -0.004722549
              #3 a -0.943451194
              #4 b -0.709509396
              #5 b -0.719434286
              #....

              #[[2]]
              # grp measure
              #1 a 1.03131291
              #2 a -0.85558910
              #3 a -0.05933911
              #4 b 0.67812934
              #5 a 3.23854093
              #6 a 1.31688645
              #7 a 1.87464048
              #8 a 0.90100179
              #....

              #[[3]]
              # grp measure
              #1 b 1
              #2 b 2
              #3 b 3
              #4 b 4
              #5 b 5
              #....


              Here, for every list element, we extract its grp column, replace any bad_values found with the corresponding good_values, and return the corrected data frame.
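The same idea can also be written with an explicit loop, which avoids the <<- assignment. This is a sketch under the assumption that grp can be treated as character; fix_grp is a hypothetical helper name and the data is a reduced version of the question's example.

```r
bad_values  <- list(c("a1", "a.", "as"), c("b2", "b-", "bq"))
good_values <- list("a", "b")

# Hypothetical helper: apply each bad/good pair in turn to one vector
fix_grp <- function(g, bad_values, good_values) {
  g <- as.character(g)  # avoids NA from assigning unknown factor levels
  for (i in seq_along(bad_values)) {
    g[g %in% bad_values[[i]]] <- good_values[[i]]
  }
  g
}

# Reduced stand-in for the question's data (assumption)
df_list <- list(
  data.frame(grp = c("a1", "a.", "b"), measure = 1:3, stringsAsFactors = FALSE),
  data.frame(grp = c("b-", "bq", "a"), measure = 4:6, stringsAsFactors = FALSE)
)

fixed <- lapply(df_list, function(d) {
  d$grp <- fix_grp(d$grp, bad_values, good_values)
  d
})
```

Because the loop only touches positions that match a bad value, rows that are already correct are left alone.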






                    recode_factor answer: answered Jan 3 at 7:23 by akrun (edited Jan 3 at 7:38)
                            case_when answer: answered Jan 3 at 6:52 by zack
                                1














                                A base R option if you have lot of good_values and bad_values and it is not possible to check each one individually.



                                lapply(df_list, function(x) {
                                vec = x[['grp']]
                                mapply(function(p, q) vec[vec %in% p] <<- q ,bad_values, good_values)
                                transform(x, grp = vec)
                                })


                                #[[1]]
                                # grp measure
                                #1 a -0.648146527
                                #2 a -0.004722549
                                #3 a -0.943451194
                                #4 b -0.709509396
                                #5 b -0.719434286
                                #....

                                #[[2]]
                                # grp measure
                                #1 a 1.03131291
                                #2 a -0.85558910
                                #3 a -0.05933911
                                #4 b 0.67812934
                                #5 a 3.23854093
                                #6 a 1.31688645
                                #7 a 1.87464048
                                #8 a 0.90100179
                                #....

                                #[[3]]
                                # grp measure
                                #1 b 1
                                #2 b 2
                                #3 b 3
                                #4 b 4
                                #5 b 5
                                #....


                                Here, for every list element, we extract its grp column, replace any bad_values found in it with the corresponding good_values, and return the corrected data frame.
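The same idea can also be written without `<<-`, which some readers may find easier to follow. The following is a minimal sketch, not the answer's original code: `recode_grp` is a hypothetical helper name, and the small data frames are made up for illustration.

```r
# Small stand-in data (hypothetical, modelled on the question's example)
df_list <- list(
  data.frame(grp = c("a1", "a.", "b"), measure = 1:3, stringsAsFactors = FALSE),
  data.frame(grp = c("b-", "bq", "a"), measure = 4:6, stringsAsFactors = FALSE)
)
bad_values  <- list(c("a1", "a.", "as"), c("b2", "b-", "bq"))
good_values <- list("a", "b")

# Hypothetical helper: replace each set of bad values with its good value
recode_grp <- function(x, bad, good) {
  vec <- as.character(x[["grp"]])
  for (i in seq_along(bad)) {
    # entries matching the i-th set of bad values get the i-th good value
    vec[vec %in% bad[[i]]] <- good[[i]]
  }
  x[["grp"]] <- vec
  x
}

fixed <- lapply(df_list, recode_grp, bad = bad_values, good = good_values)
```

Because the loop only assigns to a local variable, the helper has no side effects outside its own body.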



















































































                                    answered Jan 3 at 7:11









                                    Ronak Shah





























