lapply on list of dataframes not working the same as FUN applied to dfs individually












example data



metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)


I have 20 data frames, each named in the following format, where X is a number from 1 to 9:



metro_20XX_X



I am trying to extract the middle section of the name (the year) into a new column. I wrote a function called addYear that works when applied to each data frame individually.



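library(dplyr)  # provides %>% and mutate()

addYear <- function(metro){
  # capture the name of the object passed in, e.g. "metro_2005_1"
  metro_name <- deparse(substitute(metro))
  # characters 7-10 of that name hold the year
  metro <- metro %>% mutate(Year = substr(metro_name,7,10))
  return(metro)
}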

example <- addYear(metro_2005_1)

str(example)

'data.frame': 5 obs. of 3 variables:
$ col1: int 1 2 3 4 5
$ col2: int 6 7 8 9 10
$ Year: chr "2005" "2005" "2005" "2005" ...


I added all 20 of my data frames to a list called metro_append_year and tried to apply addYear to every element using lapply. However, when I inspect result, the Year column has been created in each data frame but contains only empty strings.



metro_append_year <- list(metro_2005_1, metro_2006_1)

result <- lapply(metro_append_year,addYear)

str(result[[1]])
'data.frame': 5 obs. of 3 variables:
$ col1: int 1 2 3 4 5
$ col2: int 6 7 8 9 10
$ Year: chr "" "" "" "" ...









  • Welcome to Stack Overflow! Please provide a reproducible example in R. The link I provided will tell you how. Moreover, please take the tour and visit How to Ask. Cheers.

    – M-M
    Jan 2 at 20:29











  • You are checking one individual data frame (not all). Try lapply(result, str) and tell us whether the empty Year column occurs across all data frames.

    – Parfait
    Jan 2 at 20:38











  • I edited the post to include a reproducible example, borrowing from akrun's answer for the example data.

    – Alex Talbott
    Jan 2 at 21:06











  • I checked and the year is missing across all dfs.

    – Alex Talbott
    Jan 2 at 21:07
















asked Jan 2 at 20:26 by Alex Talbott; edited Jan 5 at 22:50 by anothermh








2 Answers














We could pass the data and the name of the list element as two separate arguments, which makes the function simpler:



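library(dplyr)  # provides %>% and mutate()

addYear <- function(data, name){
  data %>%
    mutate(Year = substr(name,7,10))
}

# iterate over the names so both the name and the data frame are available
lapply(names(metro_append_year), function(nm) addYear(metro_append_year[[nm]], nm))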


data



metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)
metro_append_year <- mget(ls(pattern = '^metro_\\d{4}'))
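One detail worth noting: lapply over names(...) returns an unnamed list. The following is a minimal base R sketch, not part of the original answer, that keeps the original names on the result. The helper add_year is a hypothetical stand-in for the dplyr version above, and the list is built by hand with explicit names rather than with mget.

```r
metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)

# Build the list with explicit names so each element carries its original name.
metro_append_year <- list(metro_2005_1 = metro_2005_1,
                          metro_2006_1 = metro_2006_1)

# Hypothetical base R stand-in for addYear(): the length-1 year string is
# recycled to every row of the data frame.
add_year <- function(data, name) {
  data$Year <- substr(name, 7, 10)
  data
}

# Iterate over the names, then put the names back on the resulting list.
result <- setNames(
  lapply(names(metro_append_year),
         function(nm) add_year(metro_append_year[[nm]], nm)),
  names(metro_append_year)
)

str(result$metro_2006_1)
```

With the names preserved, each data frame can still be looked up by its original name, for example result$metro_2005_1.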





  • Thank you, this worked. I am still researching why. So far it looks like I needed to use lapply in the function(nm) form in order to reference the names of the list elements. From Advanced R, there are three basic ways to use lapply(): lapply(xs, function(x) {}), lapply(seq_along(xs), function(i) {}), and lapply(names(xs), function(nm) {}). "Typically you’d use the first form because lapply() takes care of saving the output for you. However, if you need to know position or name of the element you’re working with, you should use the 2nd or 3rd form."

    – Alex Talbott
    Jan 2 at 21:17













  • @AlexTalbott The issue is in getting the individual object. In your post, the single argument both evaluates to the object and supplies the name from which the substring is extracted. Here, the object sits in a list of data.frames, so indexing the list by name to extract the object is easier than doing some complicated steps.

    – akrun
    Jan 3 at 5:00
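To make the point in these comments concrete, here is a small sketch (not from the thread) of what deparse(substitute()) sees in each situation. The helper show_arg is hypothetical. In current versions of R, lapply calls the function on an internal indexing expression, so the captured text is that expression and not the name of the original object.

```r
# Hypothetical helper: returns the expression its argument was written as.
show_arg <- function(x) deparse(substitute(x))

metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)

# Called directly, the argument expression is the object's name.
direct <- show_arg(metro_2005_1)

# Called through lapply, the argument expression is lapply's internal
# indexing call, not the name of the original object.
via_lapply <- lapply(list(metro_2005_1), show_arg)[[1]]

print(direct)
print(via_lapply)
print(nchar(via_lapply))          # shorter than 7 characters
print(substr(via_lapply, 7, 10))  # so characters 7-10 are empty

# Iterating over names keeps the name available as an ordinary string.
dfs <- list(metro_2005_1 = metro_2005_1)
years <- lapply(names(dfs), function(nm) substr(nm, 7, 10))
print(years[[1]])
```

This would explain why the Year column is created but holds empty strings: substr() is applied to a short internal expression instead of to "metro_2005_1".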



















Since you are an R newbie, consider a base R solution that collects the objects into a list with mget and iterates elementwise with Map (a wrapper for mapply) through the list names and the corresponding values. Possibly the passing of names for unquoted column aliases is the issue with your dplyr call.



The within and transform functions mirror dplyr::mutate: you can assign column(s) in place and the modified object is returned:



# ALL METRO DATA FRAMES (anchored pattern so that other objects whose names contain "metro" are skipped)
metro_dfs <- mget(ls(pattern="^metro_\\d{4}_\\d$"))

metro_dfs <- Map(function(name, df) within(df, Year <- substr(name,7,10)),
                 names(metro_dfs), metro_dfs)


Alternatively:



metro_dfs <- mapply(function(name, df) transform(df, Year = substr(name,7,10)),
                    names(metro_dfs), metro_dfs, SIMPLIFY=FALSE)
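As a hedged illustration of the mget plus Map approach, the sketch below runs it inside a scratch environment (the name env is an assumption made for this example) so that the effect of the ls pattern is easy to see. A loose pattern such as "metro" also matches unrelated objects like metro_append_year, while an anchored pattern picks up only the data frames.

```r
# Scratch environment so the example does not depend on the global workspace.
env <- new.env()
assign("metro_2005_1", data.frame(col1 = 1:5, col2 = 6:10), envir = env)
assign("metro_2006_1", data.frame(col1 = 1:3, col2 = 4:6), envir = env)
# An unrelated object whose name also contains "metro".
assign("metro_append_year", list(), envir = env)

loose  <- ls(envir = env, pattern = "metro")
strict <- ls(envir = env, pattern = "^metro_[0-9]{4}_[0-9]$")
print(loose)   # includes metro_append_year
print(strict)  # only the two data frames

# Collect the matching objects into a named list, then add Year from each name.
metro_dfs <- mget(strict, envir = env)
metro_dfs <- Map(function(name, df) within(df, Year <- substr(name, 7, 10)),
                 names(metro_dfs), metro_dfs)

str(metro_dfs$metro_2006_1)
```

Because the first argument to Map is a character vector, the result keeps the object names as list names.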





The dplyr answer above was posted by akrun (answered Jan 2 at 20:39, edited Jan 2 at 20:44).













The base R answer above was posted by Parfait (answered Jan 2 at 21:56).





























