lapply on list of dataframes not working the same as FUN applied to dfs individually
Example data:

    metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
    metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)

I have 20 data frames, each named in the following format, where X is a digit from 1 to 9:

    metro_20XX_X

I am trying to extract the middle section (the year) into a new column. I wrote a function called addYear that works when applied to each data frame individually:

    library(dplyr)  # provides %>% and mutate()

    addYear <- function(metro){
      # capture the name of the object passed in, e.g. "metro_2005_1"
      metro_name <- deparse(substitute(metro))
      metro <- metro %>% mutate(Year = substr(metro_name, 7, 10))
      return(metro)
    }
    example <- addYear(metro_2005_1)
    str(example)
    'data.frame': 5 obs. of 3 variables:
    $ col1: int 1 2 3 4 5
    $ col2: int 6 7 8 9 10
    $ Year: chr "2005" "2005" "2005" "2005" ...

I added all 20 of my data frames to a list called metro_append_year and tried to apply addYear to every one of them using lapply. However, when I inspect result, the Year column is created in each data frame but is empty:

    metro_append_year <- list(metro_2005_1, metro_2006_1)
    result <- lapply(metro_append_year, addYear)
    str(result[[1]])
    'data.frame': 5 obs. of 3 variables:
    $ col1: int 1 2 3 4 5
    $ col2: int 6 7 8 9 10
    $ Year: chr "" "" "" "" ...
Tags: r
asked Jan 2 at 20:26 by Alex Talbott
edited Jan 5 at 22:50 by anothermh
Welcome to Stack Overflow! Please provide a reproducible example in R. The link I provided will tell you how. Moreover, please take the tour and visit How to Ask. Cheers.
– M-M
Jan 2 at 20:29
You are checking one individual data frame (not all). Try lapply(result, str) and tell us if the Year situation occurs across all dfs.
– Parfait
Jan 2 at 20:38
I edited the post to include a reproducible example, borrowing from akrun's answer for the example data.
– Alex Talbott
Jan 2 at 21:06
I checked and the year is missing across all dfs.
– Alex Talbott
Jan 2 at 21:07
2 Answers
We can pass the data and the name of the list element as two arguments, which makes this easier:

    library(dplyr)

    addYear <- function(data, name){
      data %>%
        mutate(Year = substr(name, 7, 10))
    }
    # metro_append_year must be a named list (see Data)
    lapply(names(metro_append_year), function(nm) addYear(metro_append_year[[nm]], nm))

Data:

    metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
    metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)
    # mget() returns a list named after the matching objects
    metro_append_year <- mget(ls(pattern = '^metro_\\d{4}'))
answered Jan 2 at 20:39 by akrun
edited Jan 2 at 20:44
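A note for later readers: lapply() over names(metro_append_year) returns an unnamed list, so the data frame names are lost in the result. Below is a minimal sketch of one way to keep them, assuming only base R. Here transform() stands in for dplyr::mutate() so that no packages are needed, and add_year is a hypothetical name chosen for illustration.

```r
# Example data from the question
metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)
metro_2006_1 <- data.frame(col1 = 1:3, col2 = 4:6)

# A named list: the names carry the year
metro_append_year <- list(metro_2005_1 = metro_2005_1,
                          metro_2006_1 = metro_2006_1)

# Hypothetical helper: base R transform() stands in for dplyr::mutate()
add_year <- function(data, name) {
  transform(data, Year = substr(name, 7, 10))
}

# Iterate over the names, then restore them on the result
result <- lapply(names(metro_append_year),
                 function(nm) add_year(metro_append_year[[nm]], nm))
result <- setNames(result, names(metro_append_year))

str(result$metro_2006_1)
```

With the names restored, each data frame can be pulled out by name (result$metro_2005_1) rather than by position.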
Thank you, this worked. I am still researching why. So far it looks like I needed to use lapply in the function(nm) form in order to reference the names of the list elements. From Advanced R, there are three basic ways to use lapply(): lapply(xs, function(x) {}), lapply(seq_along(xs), function(i) {}), and lapply(names(xs), function(nm) {}). "Typically you'd use the first form because lapply() takes care of saving the output for you. However, if you need to know the position or name of the element you're working with, you should use the 2nd or 3rd form."
– Alex Talbott
Jan 2 at 21:17
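The three forms quoted in this comment can be sketched on a small named list. This is only an illustration: xs and its contents are made up, and sum() and paste() are arbitrary stand-ins for real work.

```r
# Hypothetical named list for illustration
xs <- list(a = 1:3, b = 4:6)

# 1. Over the elements: the function sees only the value
by_value <- lapply(xs, function(x) sum(x))

# 2. Over positions: the function can also use the index i
by_index <- lapply(seq_along(xs), function(i) paste(i, sum(xs[[i]])))

# 3. Over names: the function can also use the name nm
by_name <- lapply(names(xs), function(nm) paste(nm, sum(xs[[nm]])))

str(by_value)
str(by_index)
str(by_name)
```

Only the first form carries the names of xs over to the result. The second and third forms return unnamed lists, because they iterate over an index vector or a character vector rather than over xs itself.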
@AlexTalbott There are issues in getting the individual object. In your post, the single argument is used both to evaluate the object and to extract the substring from its name. Here, the object is inside a list of data.frames. So indexing the list by name to extract the object is the easiest way, instead of doing some complicated steps.
– akrun
Jan 3 at 5:00
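To see why the original addYear produces empty strings under lapply, it may help to look at what deparse(substitute(...)) actually captures. In recent versions of R, lapply appears to call the function on its internal expression X[[i]], so the captured text is that expression rather than the object's name. A minimal sketch, where show_name is a hypothetical helper:

```r
# Hypothetical helper: returns the text of the expression passed as its argument
show_name <- function(metro) deparse(substitute(metro))

metro_2005_1 <- data.frame(col1 = 1:5, col2 = 6:10)

# Called directly, the argument expression is the object's name
direct <- show_name(metro_2005_1)

# Called through lapply, the argument expression is lapply's internal indexing call
via_lapply <- lapply(list(metro_2005_1), show_name)[[1]]

# The captured text is shorter than 7 characters, so positions 7 to 10 are empty
year_direct <- substr(direct, 7, 10)
year_lapply <- substr(via_lapply, 7, 10)

print(c(direct = direct, via_lapply = via_lapply))
print(c(year_direct = year_direct, year_lapply = year_lapply))
```

This matches the symptom in the question: the Year column is created, but substr() has nothing to extract. It also shows that the original list never needed names for the failure to occur, since the object name is not visible inside lapply either way.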
Since you are new to R, consider a base R solution that collects the objects into a list with mget and iterates elementwise with Map (a wrapper to mapply) through the list names and corresponding values. Possibly the passing of names for unquoted column aliases is the issue with your dplyr call.

The within or transform functions mirror dplyr::mutate, where you can assign column(s) in place and return the object:

    # ALL METRO DATA FRAMES
    # (anchored pattern so helper lists such as metro_append_year or metro_dfs are not picked up)
    metro_dfs <- mget(ls(pattern = "^metro_\\d{4}"))

    metro_dfs <- Map(function(name, df) within(df, Year <- substr(name, 7, 10)),
                     names(metro_dfs), metro_dfs)

Alternatively:

    metro_dfs <- mapply(function(name, df) transform(df, Year = substr(name, 7, 10)),
                        names(metro_dfs), metro_dfs, SIMPLIFY=FALSE)
answered Jan 2 at 21:56 by Parfait
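One possible refinement, which is not part of either answer: the fixed positions 7 to 10 only work while every name starts with exactly "metro_". Below is a sketch that pulls the four digits out with a regular expression instead, assuming the names follow the metro_YYYY_N pattern from the question. year_from_name is a hypothetical helper, and the list is built by hand here so the example is self-contained.

```r
# Named list built by hand from the question's example data
metro_dfs <- list(
  metro_2005_1 = data.frame(col1 = 1:5, col2 = 6:10),
  metro_2006_1 = data.frame(col1 = 1:3, col2 = 4:6)
)

# Hypothetical helper: capture the four digits that follow "metro_"
year_from_name <- function(name) {
  sub("^metro_([0-9]{4})_.*$", "\\1", name)
}

# Map() walks the names and the data frames in parallel and keeps the names
metro_dfs <- Map(function(name, df) {
  df$Year <- year_from_name(name)
  df
}, names(metro_dfs), metro_dfs)

str(metro_dfs$metro_2006_1)
```

If a name does not match the pattern, sub() returns it unchanged, so a stray object would show up as an obviously wrong Year value rather than a silently truncated one.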