Create a matrix out of a dataframe
I am very new to R and script writing in general. Please be patient if this is a very basic question. My search for a solution has not been successful.
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
I want to change the df to a matrix which has the airport.departure as rows and the airport.arrival as columns with the cumulated amount for a given month/years in the cells of the matrix.
r dataframe matrix
add a comment |
I am very new to R and script writing in general. Please be patient if this is a very basic question. My search for a solution has not been successful.
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
I want to change the df to a matrix which has the airport.departure as rows and the airport.arrival as columns with the cumulated amount for a given month/years in the cells of the matrix.
r dataframe matrix
1
Your title is bad for my OCD :)
– Sotos
Nov 19 '18 at 15:50
Sorry. I did not notice it. Now it gives me slight pain too...
– Confusulum
Nov 19 '18 at 15:56
hehehe, you don't expect it to be such a big deal until you 'see' it
– Sotos
Nov 19 '18 at 15:57
Good thing it´s possible here to change the title. It would be mean to trigger OCDs all over the world. ;)
– Confusulum
Nov 19 '18 at 16:16
add a comment |
I am very new to R and script writing in general. Please be patient if this is a very basic question. My search for a solution has not been successful.
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
I want to change the df to a matrix which has the airport.departure as rows and the airport.arrival as columns with the cumulated amount for a given month/years in the cells of the matrix.
r dataframe matrix
I am very new to R and script writing in general. Please be patient if this is a very basic question. My search for a solution has not been successful.
date.depature <- c("2016.06.16", "2016.11.16", "2017.01.05", "2017.01.12", "2017.02.25")
airport.departure <- c("CDG", "QNY", "QXO", "CDG", "QNY")
airport.arrival <- c("SYD", "CDG", "QNY", "SYD", "QXO")
amount <- c("1", "3", "1", "10", "5")
df <- data.frame(date.depature, airport.departure, airport.arrival, amount)
I want to change the df to a matrix which has the airport.departure as rows and the airport.arrival as columns with the cumulated amount for a given month/years in the cells of the matrix.
r dataframe matrix
r dataframe matrix
edited Nov 19 '18 at 16:07
asked Nov 19 '18 at 15:44
Confusulum
104
104
1
Your title is bad for my OCD :)
– Sotos
Nov 19 '18 at 15:50
Sorry. I did not notice it. Now it gives me slight pain too...
– Confusulum
Nov 19 '18 at 15:56
hehehe, you don't expect it to be such a big deal until you 'see' it
– Sotos
Nov 19 '18 at 15:57
Good thing it´s possible here to change the title. It would be mean to trigger OCDs all over the world. ;)
– Confusulum
Nov 19 '18 at 16:16
add a comment |
1
Your title is bad for my OCD :)
– Sotos
Nov 19 '18 at 15:50
Sorry. I did not notice it. Now it gives me slight pain too...
– Confusulum
Nov 19 '18 at 15:56
hehehe, you don't expect it to be such a big deal until you 'see' it
– Sotos
Nov 19 '18 at 15:57
Good thing it´s possible here to change the title. It would be mean to trigger OCDs all over the world. ;)
– Confusulum
Nov 19 '18 at 16:16
1
1
Your title is bad for my OCD :)
– Sotos
Nov 19 '18 at 15:50
Your title is bad for my OCD :)
– Sotos
Nov 19 '18 at 15:50
Sorry. I did not notice it. Now it gives me slight pain too...
– Confusulum
Nov 19 '18 at 15:56
Sorry. I did not notice it. Now it gives me slight pain too...
– Confusulum
Nov 19 '18 at 15:56
hehehe, you don't expect it to be such a big deal until you 'see' it
– Sotos
Nov 19 '18 at 15:57
hehehe, you don't expect it to be such a big deal until you 'see' it
– Sotos
Nov 19 '18 at 15:57
Good thing it´s possible here to change the title. It would be mean to trigger OCDs all over the world. ;)
– Confusulum
Nov 19 '18 at 16:16
Good thing it´s possible here to change the title. It would be mean to trigger OCDs all over the world. ;)
– Confusulum
Nov 19 '18 at 16:16
add a comment |
2 Answers
2
active
oldest
votes
You are looking for xtabs
, i.e.
xtabs(amount ~ airport.arrival + airport.departure, df)
which gives,
airport.departure
airport.arrival CDG QNY QXO
CDG 0 3 0
QNY 0 0 1
QXO 0 5 0
SYD 11 0 0
P.S
As @Andre Elrico mentions, for some reason you declared your amount
variable as a string. You need to convert to integer prior to calculating the sums
Proposed Solution:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
2
since amount is string:xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
– Andre Elrico
Nov 19 '18 at 15:51
if you further want to work on this: consider usingas.data.frame.matrix(xtabs( ... ))
– Andre Elrico
Nov 19 '18 at 16:02
The "for some reason" is that I barely what I am doing. So it just happend without thinking. I changed it in the real dataframe. I will look into it. Thank you for the advice & help from both of you.
– Confusulum
Nov 19 '18 at 16:19
add a comment |
using dplyr package:
library(dplyr)
df %>% mutate(month.departure =substr(date.depature, 1, 7)) %>%
group_by(airport.departure, airport.arrival, month.departure) %>%
summarize(total = sum(as.numeric(as.character(amount))))
- mutate adds a column that gives the month/year (without the date, because you asked to group by month)
- group_by says you want to see statistics grouped according to these variables
- summarize computes the statistic you're looking for, in this case the total amount per 'group.' The
as.numeric(as.character())
is included because 'amount' started as a factor type, which can't be summed [theas.character()
is needed because as.numeric would have unexpected behavior if called directly on the factor].total =
gives an arbitrary column name to include in the output table; you could leave it out and this will still work. - the %>% is used to pipe output from one command as input to the next command in dplyr
Output:
(format of output table is different than you described, but it contains all the info you're looking for. and dplyr is a great package to start learning for this sort of data manipulation!)
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53378149%2fcreate-a-matrix-out-of-a-dataframe%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
You are looking for xtabs
, i.e.
xtabs(amount ~ airport.arrival + airport.departure, df)
which gives,
airport.departure
airport.arrival CDG QNY QXO
CDG 0 3 0
QNY 0 0 1
QXO 0 5 0
SYD 11 0 0
P.S
As @Andre Elrico mentions, for some reason you declared your amount
variable as a string. You need to convert to integer prior to calculating the sums
Proposed Solution:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
2
since amount is string:xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
– Andre Elrico
Nov 19 '18 at 15:51
if you further want to work on this: consider usingas.data.frame.matrix(xtabs( ... ))
– Andre Elrico
Nov 19 '18 at 16:02
The "for some reason" is that I barely what I am doing. So it just happend without thinking. I changed it in the real dataframe. I will look into it. Thank you for the advice & help from both of you.
– Confusulum
Nov 19 '18 at 16:19
add a comment |
You are looking for xtabs
, i.e.
xtabs(amount ~ airport.arrival + airport.departure, df)
which gives,
airport.departure
airport.arrival CDG QNY QXO
CDG 0 3 0
QNY 0 0 1
QXO 0 5 0
SYD 11 0 0
P.S
As @Andre Elrico mentions, for some reason you declared your amount
variable as a string. You need to convert to integer prior to calculating the sums
Proposed Solution:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
2
since amount is string:xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
– Andre Elrico
Nov 19 '18 at 15:51
if you further want to work on this: consider usingas.data.frame.matrix(xtabs( ... ))
– Andre Elrico
Nov 19 '18 at 16:02
The "for some reason" is that I barely what I am doing. So it just happend without thinking. I changed it in the real dataframe. I will look into it. Thank you for the advice & help from both of you.
– Confusulum
Nov 19 '18 at 16:19
add a comment |
You are looking for xtabs
, i.e.
xtabs(amount ~ airport.arrival + airport.departure, df)
which gives,
airport.departure
airport.arrival CDG QNY QXO
CDG 0 3 0
QNY 0 0 1
QXO 0 5 0
SYD 11 0 0
P.S
As @Andre Elrico mentions, for some reason you declared your amount
variable as a string. You need to convert to integer prior to calculating the sums
Proposed Solution:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
You are looking for xtabs
, i.e.
xtabs(amount ~ airport.arrival + airport.departure, df)
which gives,
airport.departure
airport.arrival CDG QNY QXO
CDG 0 3 0
QNY 0 0 1
QXO 0 5 0
SYD 11 0 0
P.S
As @Andre Elrico mentions, for some reason you declared your amount
variable as a string. You need to convert to integer prior to calculating the sums
Proposed Solution:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
edited Nov 19 '18 at 15:55
answered Nov 19 '18 at 15:49
Sotos
28.3k51640
28.3k51640
2
since amount is string:xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
– Andre Elrico
Nov 19 '18 at 15:51
if you further want to work on this: consider usingas.data.frame.matrix(xtabs( ... ))
– Andre Elrico
Nov 19 '18 at 16:02
The "for some reason" is that I barely what I am doing. So it just happend without thinking. I changed it in the real dataframe. I will look into it. Thank you for the advice & help from both of you.
– Confusulum
Nov 19 '18 at 16:19
add a comment |
2
since amount is string:xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
– Andre Elrico
Nov 19 '18 at 15:51
if you further want to work on this: consider usingas.data.frame.matrix(xtabs( ... ))
– Andre Elrico
Nov 19 '18 at 16:02
The "for some reason" is that I barely what I am doing. So it just happend without thinking. I changed it in the real dataframe. I will look into it. Thank you for the advice & help from both of you.
– Confusulum
Nov 19 '18 at 16:19
2
2
since amount is string:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
– Andre Elrico
Nov 19 '18 at 15:51
since amount is string:
xtabs(as.integer(amount) ~ airport.arrival + airport.departure, df)
– Andre Elrico
Nov 19 '18 at 15:51
if you further want to work on this: consider using
as.data.frame.matrix(xtabs( ... ))
– Andre Elrico
Nov 19 '18 at 16:02
if you further want to work on this: consider using
as.data.frame.matrix(xtabs( ... ))
– Andre Elrico
Nov 19 '18 at 16:02
The "for some reason" is that I barely what I am doing. So it just happend without thinking. I changed it in the real dataframe. I will look into it. Thank you for the advice & help from both of you.
– Confusulum
Nov 19 '18 at 16:19
The "for some reason" is that I barely what I am doing. So it just happend without thinking. I changed it in the real dataframe. I will look into it. Thank you for the advice & help from both of you.
– Confusulum
Nov 19 '18 at 16:19
add a comment |
using dplyr package:
library(dplyr)
df %>% mutate(month.departure =substr(date.depature, 1, 7)) %>%
group_by(airport.departure, airport.arrival, month.departure) %>%
summarize(total = sum(as.numeric(as.character(amount))))
- mutate adds a column that gives the month/year (without the date, because you asked to group by month)
- group_by says you want to see statistics grouped according to these variables
- summarize computes the statistic you're looking for, in this case the total amount per 'group.' The
as.numeric(as.character())
is included because 'amount' started as a factor type, which can't be summed [theas.character()
is needed because as.numeric would have unexpected behavior if called directly on the factor].total =
gives an arbitrary column name to include in the output table; you could leave it out and this will still work. - the %>% is used to pipe output from one command as input to the next command in dplyr
Output:
(format of output table is different than you described, but it contains all the info you're looking for. and dplyr is a great package to start learning for this sort of data manipulation!)
add a comment |
using dplyr package:
library(dplyr)
df %>% mutate(month.departure =substr(date.depature, 1, 7)) %>%
group_by(airport.departure, airport.arrival, month.departure) %>%
summarize(total = sum(as.numeric(as.character(amount))))
- mutate adds a column that gives the month/year (without the date, because you asked to group by month)
- group_by says you want to see statistics grouped according to these variables
- summarize computes the statistic you're looking for, in this case the total amount per 'group.' The
as.numeric(as.character())
is included because 'amount' started as a factor type, which can't be summed [theas.character()
is needed because as.numeric would have unexpected behavior if called directly on the factor].total =
gives an arbitrary column name to include in the output table; you could leave it out and this will still work. - the %>% is used to pipe output from one command as input to the next command in dplyr
Output:
(format of output table is different than you described, but it contains all the info you're looking for. and dplyr is a great package to start learning for this sort of data manipulation!)
add a comment |
using dplyr package:
library(dplyr)
df %>% mutate(month.departure =substr(date.depature, 1, 7)) %>%
group_by(airport.departure, airport.arrival, month.departure) %>%
summarize(total = sum(as.numeric(as.character(amount))))
- mutate adds a column that gives the month/year (without the date, because you asked to group by month)
- group_by says you want to see statistics grouped according to these variables
- summarize computes the statistic you're looking for, in this case the total amount per 'group.' The
as.numeric(as.character())
is included because 'amount' started as a factor type, which can't be summed [theas.character()
is needed because as.numeric would have unexpected behavior if called directly on the factor].total =
gives an arbitrary column name to include in the output table; you could leave it out and this will still work. - the %>% is used to pipe output from one command as input to the next command in dplyr
Output:
(format of output table is different than you described, but it contains all the info you're looking for. and dplyr is a great package to start learning for this sort of data manipulation!)
using dplyr package:
library(dplyr)
df %>% mutate(month.departure =substr(date.depature, 1, 7)) %>%
group_by(airport.departure, airport.arrival, month.departure) %>%
summarize(total = sum(as.numeric(as.character(amount))))
- mutate adds a column that gives the month/year (without the date, because you asked to group by month)
- group_by says you want to see statistics grouped according to these variables
- summarize computes the statistic you're looking for, in this case the total amount per 'group.' The
as.numeric(as.character())
is included because 'amount' started as a factor type, which can't be summed [theas.character()
is needed because as.numeric would have unexpected behavior if called directly on the factor].total =
gives an arbitrary column name to include in the output table; you could leave it out and this will still work. - the %>% is used to pipe output from one command as input to the next command in dplyr
Output:
(format of output table is different than you described, but it contains all the info you're looking for. and dplyr is a great package to start learning for this sort of data manipulation!)
edited Nov 20 '18 at 20:19
answered Nov 19 '18 at 15:53
R-Peys
435
435
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53378149%2fcreate-a-matrix-out-of-a-dataframe%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Your title is bad for my OCD :)
– Sotos
Nov 19 '18 at 15:50
Sorry. I did not notice it. Now it gives me slight pain too...
– Confusulum
Nov 19 '18 at 15:56
hehehe, you don't expect it to be such a big deal until you 'see' it
– Sotos
Nov 19 '18 at 15:57
Good thing it´s possible here to change the title. It would be mean to trigger OCDs all over the world. ;)
– Confusulum
Nov 19 '18 at 16:16