flexible column comparison in data table r












1















I am trying to filter by comparing groups of columns with each other in a dynamic fashion. Suppose I have the data base.



###########
#Setup data
###########

set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]


The data table:



   n prod1vint1 prod1vint2 prod1vint3 prod1vint1prm prod1vint2prm prod1vint3prm
1: 1 2 9 0 0 9 0
2: 2 7 4 8 7 1 8
3: 3 5 10 9 5 10 9
4: 4 1 8 1 1 8 1
5: 5 6 0 0 6 0 1
6: 6 8 7 0 8 7 0
7: 7 0 0 6 0 0 2


By dynamic, I mean Tp can be an arbitrary integer.



I want to filter the following way:



For each t, e.g. prod1vint{t}, I want to compare to look at the "prm" version of it and check if it's not zero. If it is not zero, then I only want to keep rows for which all t'>t vint are less than or equal to the pre-prime values and all lower elements (t'<t) are the SAME, e.g.



For each t.., if prod1vint{t}!=0, then



1. prod1vint{t'}prm <= prod1vint{t'} for t'>t
2. prod1vint{t'}prm == prod1vint{t'} for t'<t


For example, the following output should be displayed:



   n   prod1vint1 prod1vint2 prod1vint3 prod1vint1prm prod1vint2prm prod1vint3prm
1: 3 5 10 9 5 10 9
2: 4 1 8 1 1 8 1
3: 6 8 7 0 8 7 0
4: 7 0 0 6 0 0 2


(In case the X-Y problem...this might help... I am trying to make sure each vector (prod1vint1,prod1vint2,prod1vint3) is LIFO decreasing to it's prime. Ignore this bit if it doesn't help. My attempted solution involves coding various conditions such as the one above, which I am stuck on.)










share|improve this question




















  • 1





    i think you should reshape your data with melt to facilitate this.

    – MichaelChirico
    Jan 1 at 23:17











  • Not sure how that helps

    – wolfsatthedoor
    Jan 2 at 0:01
















1















I am trying to filter by comparing groups of columns with each other in a dynamic fashion. Suppose I have the data base.



###########
#Setup data
###########

set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]


The data table:



   n prod1vint1 prod1vint2 prod1vint3 prod1vint1prm prod1vint2prm prod1vint3prm
1: 1 2 9 0 0 9 0
2: 2 7 4 8 7 1 8
3: 3 5 10 9 5 10 9
4: 4 1 8 1 1 8 1
5: 5 6 0 0 6 0 1
6: 6 8 7 0 8 7 0
7: 7 0 0 6 0 0 2


By dynamic, I mean Tp can be an arbitrary integer.



I want to filter the following way:



For each t, e.g. prod1vint{t}, I want to compare to look at the "prm" version of it and check if it's not zero. If it is not zero, then I only want to keep rows for which all t'>t vint are less than or equal to the pre-prime values and all lower elements (t'<t) are the SAME, e.g.



For each t.., if prod1vint{t}!=0, then



1. prod1vint{t'}prm <= prod1vint{t'} for t'>t
2. prod1vint{t'}prm == prod1vint{t'} for t'<t


For example, the following output should be displayed:



   n   prod1vint1 prod1vint2 prod1vint3 prod1vint1prm prod1vint2prm prod1vint3prm
1: 3 5 10 9 5 10 9
2: 4 1 8 1 1 8 1
3: 6 8 7 0 8 7 0
4: 7 0 0 6 0 0 2


(In case the X-Y problem...this might help... I am trying to make sure each vector (prod1vint1,prod1vint2,prod1vint3) is LIFO decreasing to it's prime. Ignore this bit if it doesn't help. My attempted solution involves coding various conditions such as the one above, which I am stuck on.)










share|improve this question




















  • 1





    i think you should reshape your data with melt to facilitate this.

    – MichaelChirico
    Jan 1 at 23:17











  • Not sure how that helps

    – wolfsatthedoor
    Jan 2 at 0:01














1












1








1


1






I am trying to filter by comparing groups of columns with each other in a dynamic fashion. Suppose I have the data base.



###########
#Setup data
###########

set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]


The data table:



   n prod1vint1 prod1vint2 prod1vint3 prod1vint1prm prod1vint2prm prod1vint3prm
1: 1 2 9 0 0 9 0
2: 2 7 4 8 7 1 8
3: 3 5 10 9 5 10 9
4: 4 1 8 1 1 8 1
5: 5 6 0 0 6 0 1
6: 6 8 7 0 8 7 0
7: 7 0 0 6 0 0 2


By dynamic, I mean Tp can be an arbitrary integer.



I want to filter the following way:



For each t, e.g. prod1vint{t}, I want to compare to look at the "prm" version of it and check if it's not zero. If it is not zero, then I only want to keep rows for which all t'>t vint are less than or equal to the pre-prime values and all lower elements (t'<t) are the SAME, e.g.



For each t.., if prod1vint{t}!=0, then



1. prod1vint{t'}prm <= prod1vint{t'} for t'>t
2. prod1vint{t'}prm == prod1vint{t'} for t'<t


For example, the following output should be displayed:



   n   prod1vint1 prod1vint2 prod1vint3 prod1vint1prm prod1vint2prm prod1vint3prm
1: 3 5 10 9 5 10 9
2: 4 1 8 1 1 8 1
3: 6 8 7 0 8 7 0
4: 7 0 0 6 0 0 2


(In case the X-Y problem...this might help... I am trying to make sure each vector (prod1vint1,prod1vint2,prod1vint3) is LIFO decreasing to it's prime. Ignore this bit if it doesn't help. My attempted solution involves coding various conditions such as the one above, which I am stuck on.)










share|improve this question
















I am trying to filter by comparing groups of columns with each other in a dynamic fashion. Suppose I have the data base.



###########
#Setup data
###########

set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]


The data table:



   n prod1vint1 prod1vint2 prod1vint3 prod1vint1prm prod1vint2prm prod1vint3prm
1: 1 2 9 0 0 9 0
2: 2 7 4 8 7 1 8
3: 3 5 10 9 5 10 9
4: 4 1 8 1 1 8 1
5: 5 6 0 0 6 0 1
6: 6 8 7 0 8 7 0
7: 7 0 0 6 0 0 2


By dynamic, I mean Tp can be an arbitrary integer.



I want to filter the following way:



For each t, e.g. prod1vint{t}, I want to compare to look at the "prm" version of it and check if it's not zero. If it is not zero, then I only want to keep rows for which all t'>t vint are less than or equal to the pre-prime values and all lower elements (t'<t) are the SAME, e.g.



For each t.., if prod1vint{t}!=0, then



1. prod1vint{t'}prm <= prod1vint{t'} for t'>t
2. prod1vint{t'}prm == prod1vint{t'} for t'<t


For example, the following output should be displayed:



   n   prod1vint1 prod1vint2 prod1vint3 prod1vint1prm prod1vint2prm prod1vint3prm
1: 3 5 10 9 5 10 9
2: 4 1 8 1 1 8 1
3: 6 8 7 0 8 7 0
4: 7 0 0 6 0 0 2


(In case the X-Y problem...this might help... I am trying to make sure each vector (prod1vint1,prod1vint2,prod1vint3) is LIFO decreasing to it's prime. Ignore this bit if it doesn't help. My attempted solution involves coding various conditions such as the one above, which I am stuck on.)







r function data.table






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jan 1 at 22:58







wolfsatthedoor

















asked Jan 1 at 22:34









wolfsatthedoorwolfsatthedoor

1,627102959




1,627102959








  • 1





    i think you should reshape your data with melt to facilitate this.

    – MichaelChirico
    Jan 1 at 23:17











  • Not sure how that helps

    – wolfsatthedoor
    Jan 2 at 0:01














  • 1





    i think you should reshape your data with melt to facilitate this.

    – MichaelChirico
    Jan 1 at 23:17











  • Not sure how that helps

    – wolfsatthedoor
    Jan 2 at 0:01








1




1





i think you should reshape your data with melt to facilitate this.

– MichaelChirico
Jan 1 at 23:17





i think you should reshape your data with melt to facilitate this.

– MichaelChirico
Jan 1 at 23:17













Not sure how that helps

– wolfsatthedoor
Jan 2 at 0:01





Not sure how that helps

– wolfsatthedoor
Jan 2 at 0:01












1 Answer
1






active

oldest

votes


















0














Using melt as the comments suggest I would do it like this:



# this part is as replied from the question
set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]

# NEW CODE
fill.melt <- reshape2::melt(fill, id.vars = c('n'))

fill.melt$intpart <- sapply(fill.melt$variable,
function (x)
{stringr::str_extract(gsub('prod1','',x),
'\d')})


fill.melt$prmpart <- ifelse(grepl('prm', fill.melt$variable), 'prm','noprm')
fill.cast <- reshape2::dcast(fill.melt, n+intpart ~ prmpart , value.var = 'value')
fill.cast <- as.data.table(fill.cast)

t=3


tmp <- fill.cast[
((intpart >= t & prm <= noprm) | (intpart < t & prm == noprm)),]

ns <- unique(tmp$n)[table(tmp$n) == t]

fill[n %in% ns,]





share|improve this answer
























  • do you know that data.table has much faster and more functional melt/dcast functions?

    – eddi
    Jan 2 at 20:20











  • Absolutely, I am very fan of data.table, this is just a solution

    – karen
    Jan 3 at 8:11











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53999471%2fflexible-column-comparison-in-data-table-r%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









0














Using melt as the comments suggest I would do it like this:



# this part is as replied from the question
set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]

# NEW CODE
fill.melt <- reshape2::melt(fill, id.vars = c('n'))

fill.melt$intpart <- sapply(fill.melt$variable,
function (x)
{stringr::str_extract(gsub('prod1','',x),
'\d')})


fill.melt$prmpart <- ifelse(grepl('prm', fill.melt$variable), 'prm','noprm')
fill.cast <- reshape2::dcast(fill.melt, n+intpart ~ prmpart , value.var = 'value')
fill.cast <- as.data.table(fill.cast)

t=3


tmp <- fill.cast[
((intpart >= t & prm <= noprm) | (intpart < t & prm == noprm)),]

ns <- unique(tmp$n)[table(tmp$n) == t]

fill[n %in% ns,]





share|improve this answer
























  • do you know that data.table has much faster and more functional melt/dcast functions?

    – eddi
    Jan 2 at 20:20











  • Absolutely, I am very fan of data.table, this is just a solution

    – karen
    Jan 3 at 8:11
















0














Using melt as the comments suggest I would do it like this:



# this part is as replied from the question
set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]

# NEW CODE
fill.melt <- reshape2::melt(fill, id.vars = c('n'))

fill.melt$intpart <- sapply(fill.melt$variable,
function (x)
{stringr::str_extract(gsub('prod1','',x),
'\d')})


fill.melt$prmpart <- ifelse(grepl('prm', fill.melt$variable), 'prm','noprm')
fill.cast <- reshape2::dcast(fill.melt, n+intpart ~ prmpart , value.var = 'value')
fill.cast <- as.data.table(fill.cast)

t=3


tmp <- fill.cast[
((intpart >= t & prm <= noprm) | (intpart < t & prm == noprm)),]

ns <- unique(tmp$n)[table(tmp$n) == t]

fill[n %in% ns,]





share|improve this answer
























  • do you know that data.table has much faster and more functional melt/dcast functions?

    – eddi
    Jan 2 at 20:20











  • Absolutely, I am very fan of data.table, this is just a solution

    – karen
    Jan 3 at 8:11














0












0








0







Using melt as the comments suggest I would do it like this:



# this part is as replied from the question
set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]

# NEW CODE
fill.melt <- reshape2::melt(fill, id.vars = c('n'))

fill.melt$intpart <- sapply(fill.melt$variable,
function (x)
{stringr::str_extract(gsub('prod1','',x),
'\d')})


fill.melt$prmpart <- ifelse(grepl('prm', fill.melt$variable), 'prm','noprm')
fill.cast <- reshape2::dcast(fill.melt, n+intpart ~ prmpart , value.var = 'value')
fill.cast <- as.data.table(fill.cast)

t=3


tmp <- fill.cast[
((intpart >= t & prm <= noprm) | (intpart < t & prm == noprm)),]

ns <- unique(tmp$n)[table(tmp$n) == t]

fill[n %in% ns,]





share|improve this answer













Using melt as the comments suggest I would do it like this:



# this part is as replied from the question
set.seed(2)
fill = data.table(n=1:7)
Tp=3

for(t in 1:Tp){
set(x = fill, j = paste0('prod1vint',t), value = sample(0:10,7))
}

fill[1,paste0('prod1vint',3):=0]
fill[5,paste0('prod1vint',2):=0]
fill[5,paste0('prod1vint',3):=0]

for(t in 1:Tp){
fill[,paste0('prod1vint',t,'prm'):=get(paste0('prod1vint',t))]
}


fill[1,paste0('prod1vint',1,'prm'):=0]
fill[2,paste0('prod1vint',2,'prm'):=1]
fill[5,paste0('prod1vint',3,'prm'):=1]
fill[7,paste0('prod1vint',3,'prm'):=2]

# NEW CODE
fill.melt <- reshape2::melt(fill, id.vars = c('n'))

fill.melt$intpart <- sapply(fill.melt$variable,
function (x)
{stringr::str_extract(gsub('prod1','',x),
'\d')})


fill.melt$prmpart <- ifelse(grepl('prm', fill.melt$variable), 'prm','noprm')
fill.cast <- reshape2::dcast(fill.melt, n+intpart ~ prmpart , value.var = 'value')
fill.cast <- as.data.table(fill.cast)

t=3


tmp <- fill.cast[
((intpart >= t & prm <= noprm) | (intpart < t & prm == noprm)),]

ns <- unique(tmp$n)[table(tmp$n) == t]

fill[n %in% ns,]






share|improve this answer












share|improve this answer



share|improve this answer










answered Jan 2 at 16:31









karenkaren

487215




487215













  • do you know that data.table has much faster and more functional melt/dcast functions?

    – eddi
    Jan 2 at 20:20











  • Absolutely, I am very fan of data.table, this is just a solution

    – karen
    Jan 3 at 8:11



















  • do you know that data.table has much faster and more functional melt/dcast functions?

    – eddi
    Jan 2 at 20:20











  • Absolutely, I am very fan of data.table, this is just a solution

    – karen
    Jan 3 at 8:11

















do you know that data.table has much faster and more functional melt/dcast functions?

– eddi
Jan 2 at 20:20





do you know that data.table has much faster and more functional melt/dcast functions?

– eddi
Jan 2 at 20:20













Absolutely, I am very fan of data.table, this is just a solution

– karen
Jan 3 at 8:11





Absolutely, I am very fan of data.table, this is just a solution

– karen
Jan 3 at 8:11




















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53999471%2fflexible-column-comparison-in-data-table-r%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

MongoDB - Not Authorized To Execute Command

How to fix TextFormField cause rebuild widget in Flutter

in spring boot 2.1 many test slices are not allowed anymore due to multiple @BootstrapWith