Using apply() function to iterate over different data types doesn't work

I want to write a function that dynamically picks a correlation method depending on the scale of measurement of each feature (continuous, dichotomous, ordinal). The label is always continuous. My idea was to use apply() to iterate over every feature (i.e. every column), check its scale of measurement (numeric, factor with two levels, factor with more than two levels), and then call the appropriate correlation function. Unfortunately, my code seems to convert every feature into a character vector, and as a consequence the condition in the if statement is false for every column. I don't know why this happens. How can I prevent my features from being converted to character vectors?

library(psych)  # provides corr.test()

set.seed(42)
foo <- sample(c("x", "y"), 200, replace = TRUE, prob = c(0.7, 0.3))
bar <- sample(c(1, 2, 3, 4, 5), 200, replace = TRUE, prob = c(0.5, 0.05, 0.1, 0.1, 0.25))
y <- sample(c(1, 2, 3, 4, 5), 200, replace = TRUE, prob = c(0.25, 0.1, 0.1, 0.05, 0.5))
data <- data.frame(foo, bar, y)
features <- data[, !names(data) %in% 'y']

dyn.corr <- function(x, y) {
  # print out the structure of every column (str() already prints)
  str(x)

  # if the feature is numeric and has more than two outcomes, use corr.test
  if (is.numeric(x) && length(unique(x)) > 2) {
    result <- corr.test(x, y)[['r']]
  } else {
    result <- "else"
  }
  result
}

# this is the call that turns every column into a character vector
result <- apply(features, 2, dyn.corr, y)

In general, please don't include rm(list = ls()) in your questions. I copy/pasted your code into R and almost ran it before noticing that line. I would have been very sad to lose some of the things I had been working on.

– Gregor
Nov 20 '18 at 16:06

Tags: r, if-statement, apply, correlation

edited Nov 20 '18 at 16:41 by Johannes Wiesner
asked Nov 20 '18 at 16:02 by Johannes Wiesner

1 Answer

apply is built for matrices. When you call apply on a data frame, the first thing that happens is that the data frame is coerced to a matrix. A matrix can hold only one data type, so all columns of your data are converted to the most general type among them, which here is character.
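
The coercion can be seen directly in a small sketch. The data frame df below is hypothetical example data, and stringsAsFactors = TRUE is set explicitly (an assumption) so that foo is a factor regardless of the R version:

```r
# Hypothetical example data: one factor column, one numeric column
df <- data.frame(foo = c("x", "y", "x"), bar = c(1, 2, 3),
                 stringsAsFactors = TRUE)

# Column types as stored in the data frame
str(df)

# apply() works on as.matrix(df); with a non-numeric column present,
# the whole matrix becomes character
m <- as.matrix(df)
print(typeof(m))

# Each column therefore reaches the applied function as a character vector
seen_by_apply <- apply(df, 2, class)
print(seen_by_apply)
```

Inside the applied function, is.numeric(x) is then FALSE even for bar, which matches the behaviour described in the question.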



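Use sapply or lapply to work with columns of a data frame. A data frame is a list of columns, so these functions pass each column to your function with its type intact.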



This should work fine (I tried to test it, but I don't know which package to load to get the corr.test function):



result <- sapply(features, dyn.corr, y)
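
A self-contained sketch of the same idea, which should run without extra packages. It swaps corr.test for base R's cor() purely so the example has no dependencies. The name dyn_corr_base and the NA placeholder for the non-numeric branch are hypothetical, not part of the original code:

```r
set.seed(42)
foo <- sample(c("x", "y"), 200, replace = TRUE, prob = c(0.7, 0.3))
bar <- sample(1:5, 200, replace = TRUE, prob = c(0.5, 0.05, 0.1, 0.1, 0.25))
y   <- sample(1:5, 200, replace = TRUE, prob = c(0.25, 0.1, 0.1, 0.05, 0.5))

# stringsAsFactors = TRUE is an assumption, so that foo is a two-level factor
features <- data.frame(foo, bar, stringsAsFactors = TRUE)

# Hypothetical variant of dyn.corr that uses base cor() instead of corr.test()
dyn_corr_base <- function(x, y) {
  if (is.numeric(x) && length(unique(x)) > 2) {
    cor(x, y)   # Pearson correlation for numeric features
  } else {
    NA_real_    # placeholder for the dichotomous / ordinal branches
  }
}

# sapply() hands each column over with its original type
result <- sapply(features, dyn_corr_base, y)
print(result)
```

Because sapply iterates over the columns as a list, bar stays numeric and takes the correlation branch, while foo stays a factor and falls through to the placeholder, where a method suited to dichotomous features could be added.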





Thanks a lot, that solved my problem!

– Johannes Wiesner
Nov 20 '18 at 16:42

answered Nov 20 '18 at 16:06 by Gregor
