subset using percentile for gridded data

I have gridded data with 24249 observations (grid cells) and 963 variables holding daily maximum temperatures (K). I am looking for a way in R to select all days with maximum temperatures higher than the 90th percentile.

> dim(DailyT)
[1] 24249   963

> DailyT[1:4,1:7]
     x    y  1988-05-01 1988-05-02 1988-05-03 1988-05-04 1988-05-05
1 34.000 33   291.7603   291.8044   291.6158   292.9659   293.7032
2 34.125 33   291.7240   291.7951   291.5439   292.9451   293.7017
3 34.250 33   291.6884   291.7866   291.4721   292.9250   293.7001
4 34.375 33   291.6521   291.7781   291.4010   292.9049   293.6986

I tried this, but it did not work:

df<- DailyT[DailyT[,3:963] <= quantile(DailyT[,3:963],.9, na.rm = T, type = 6) ]

asked Nov 22 '18 at 8:55

Ali

337

Maybe you find this helpful.

– A. Suliman
Nov 22 '18 at 9:07

1 Answer

First, you need an id column to identify the rows later. Then, calculate the 90% quantile of all temperature values. Finally, subset the data to the rows in which any cell reaches or exceeds q.

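DailyT <- cbind(id = rownames(DailyT), DailyT)  # to identify rows later

q <- quantile(as.matrix(DailyT[, -(1:3)]), .9, na.rm = TRUE, type = 6)  # about 293.7015 for the sample data

# keep rows where any temperature cell (columns 4 onward, i.e. without id, x, y) is at or above q
DailyT.q <- DailyT[which(sapply(seq_len(nrow(DailyT)), function(i) any(DailyT[i, -(1:3)] >= q))), ]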

Yields

> DailyT.q
  id      x  y X1988.05.01 X1988.05.02 X1988.05.03 X1988.05.04 X1988.05.05
1  1 34.000 33    291.7603    291.8044    291.6158    292.9659    293.7032
2  2 34.125 33    291.7240    291.7951    291.5439    292.9451    293.7017
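With 24249 rows, the row-by-row sapply() loop may be slow. A minimal vectorised sketch of the same global-threshold subset follows, assuming the sample data from the Data section (rebuilt here without the id column so the block runs on its own). The names temps and n_exceed are only illustrative.

```r
# Sample data as given in the Data section (no id column here)
DailyT <- data.frame(
  x = c(34, 34.125, 34.25, 34.375),
  y = c(33L, 33L, 33L, 33L),
  X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521),
  X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781),
  X1988.05.03 = c(291.6158, 291.5439, 291.4721, 291.401),
  X1988.05.04 = c(292.9659, 292.9451, 292.925, 292.9049),
  X1988.05.05 = c(293.7032, 293.7017, 293.7001, 293.6986)
)

# Temperature values only: drop the x and y coordinate columns
temps <- as.matrix(DailyT[, -(1:2)])

# One threshold over all grid cells and all days
q <- quantile(temps, 0.9, na.rm = TRUE, type = 6)

# Number of days per row at or above the threshold; NA comparisons are skipped
n_exceed <- rowSums(temps >= q, na.rm = TRUE)

# Keep the rows with at least one such day
DailyT.q <- DailyT[n_exceed > 0, ]
```

On the sample data this keeps rows 1 and 2, the same rows as the sapply() version. The comparison on a numeric matrix is done in one pass, which is why it tends to scale better than indexing the data frame once per row.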

Edit:
To get the quantile row-wise, use apply():

q90 <- apply(DailyT[, 4:8], MARGIN = 1, quantile, .9, na.rm = TRUE, type = 6)

> data.frame(DailyT, q90=q90)
  id      x  y X1988.05.01 X1988.05.02 X1988.05.03 X1988.05.04 X1988.05.05      q90
1  1 34.000 33    291.7603    291.8044    291.6158    292.9659    293.7032 293.7032
2  2 34.125 33    291.7240    291.7951    291.5439    292.9451    293.7017 293.7017
3  3 34.250 33    291.6884    291.7866    291.4721    292.9250    293.7001 293.7001
4  4 34.375 33    291.6521    291.7781    291.4010    292.9049    293.6986 293.6986
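Once each row has its own q90, the days that reach it can be picked out per grid cell. The following is only a sketch under some assumptions: it rebuilds the sample data without the id column, uses >= as in the first part of the answer (swap in ">" for strictly higher), and the names temps, hot, idx and hot_days are made up for illustration.

```r
# Sample data as given in the Data section (no id column here)
DailyT <- data.frame(
  x = c(34, 34.125, 34.25, 34.375),
  y = c(33L, 33L, 33L, 33L),
  X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521),
  X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781),
  X1988.05.03 = c(291.6158, 291.5439, 291.4721, 291.401),
  X1988.05.04 = c(292.9659, 292.9451, 292.925, 292.9049),
  X1988.05.05 = c(293.7032, 293.7017, 293.7001, 293.6986)
)

# Temperature values only: drop the x and y coordinate columns
temps <- as.matrix(DailyT[, -(1:2)])

# 90th percentile of each row (each grid cell over all days)
q90 <- apply(temps, 1, quantile, probs = 0.9, na.rm = TRUE, type = 6)

# Logical matrix: TRUE where a day is at or above that row's own threshold
hot <- sweep(temps, 1, q90, ">=")

# Row/column positions of the TRUE cells; NA cells are skipped by which()
idx <- which(hot, arr.ind = TRUE)

# Long table: one line per (grid cell, hot day)
hot_days <- data.frame(
  x    = DailyT$x[idx[, "row"]],
  y    = DailyT$y[idx[, "row"]],
  day  = colnames(temps)[idx[, "col"]],
  tmax = temps[idx]
)
```

With only five days per row, type = 6 puts each row's threshold at the row maximum, so on the sample data every grid cell is flagged on X1988.05.05 only. With the full 961 days the threshold falls below the maximum and many more days per cell should appear.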

Data

> dput(DailyT)
structure(list(x = c(34, 34.125, 34.25, 34.375), y = c(33L, 33L, 33L, 33L),
    X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521),
    X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781),
    X1988.05.03 = c(291.6158, 291.5439, 291.4721, 291.401),
    X1988.05.04 = c(292.9659, 292.9451, 292.925, 292.9049),
    X1988.05.05 = c(293.7032, 293.7017, 293.7001, 293.6986)),
    class = "data.frame", row.names = c(NA, -4L))

edited Nov 24 '18 at 11:24

answered Nov 22 '18 at 9:21

jay.sf

5,51631739

Thanks, I need to calculate the 90% quantile of each row not of all data.

– Ali
Nov 24 '18 at 11:09

Aha, please see my edit.

– jay.sf
Nov 24 '18 at 11:24

Worked.... Many thanks

– Ali
Nov 24 '18 at 13:32

Very good! - Please mark the question as answered when you're satisfied with the given answer and win +2 reputation. This stops people spending time on answering a question that has already been answered.

– jay.sf
Nov 24 '18 at 13:58
