Row-wise dataframe creation using SQL-style INSERT for improved legibility

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}

I have some reference tables in a script in which some values are occasionally updated manually.

With the usual column-wise dataframe definition, it can become troublesome to identify the correct index to change in larger tables: for example, finding the 15th value in each column to change.

Whilst I know these could be defined externally (e.g. a csv), I was wondering whether they could be more easily defined directly in the script for maximum visibility to other users.

A short example using column-wise data.frame creation:

data.frame(id = 1:3, 

           start = as.Date(c("2018-01-01", "2018-03-02", "2018-05-14")), 

           end = c(as.Date("2018-06-30"), Sys.Date(), Sys.Date()))

And using row-wise data.frame creation which still requires explicit variable names in each data.frame (otherwise column names may not match for rbind):

rbind(

  data.frame(id = 1, start = as.Date("2018-01-01"), end = as.Date("2018-06-30")),

  data.frame(id = 2, start = as.Date("2018-03-02"), end = Sys.Date()),

  data.frame(id = 3, start = as.Date("2018-05-14"), end = Sys.Date())

)

How this may be achieved in sql (clearest legibility in my opinion):

CREATE TABLE test (

  id int,

  start_date date,

  end_date date

);



INSERT INTO test

VALUES (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', SYSDATE),

       (3, '2018-05-14', SYSDATE);

I'd also be interested to hear if this is bad practice in general and whether just storing every reference table in a CSV is the best way to go.

edited Jan 3 at 13:09

massisenergy

5971519

asked Jan 3 at 11:58

jogall

422417

add a comment |

I have some reference tables in a script in which some values are occasionally updated manually.

With the usual column-wise dataframe definition, it can become troublesome to identify the correct index to change in larger tables: for example, finding the 15th value in each column to change.

Whilst I know these could be defined externally (e.g. a csv), I was wondering whether they could be more easily defined directly in the script for maximum visibility to other users.

A short example using column-wise data.frame creation:

data.frame(id = 1:3, 

           start = as.Date(c("2018-01-01", "2018-03-02", "2018-05-14")), 

           end = c(as.Date("2018-06-30"), Sys.Date(), Sys.Date()))

And using row-wise data.frame creation which still requires explicit variable names in each data.frame (otherwise column names may not match for rbind):

rbind(

  data.frame(id = 1, start = as.Date("2018-01-01"), end = as.Date("2018-06-30")),

  data.frame(id = 2, start = as.Date("2018-03-02"), end = Sys.Date()),

  data.frame(id = 3, start = as.Date("2018-05-14"), end = Sys.Date())

)

How this may be achieved in sql (clearest legibility in my opinion):

CREATE TABLE test (

  id int,

  start_date date,

  end_date date

);



INSERT INTO test

VALUES (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', SYSDATE),

       (3, '2018-05-14', SYSDATE);

I'd also be interested to hear if this is bad practice in general and whether just storing every reference table in a CSV is the best way to go.

edited Jan 3 at 13:09

massisenergy

5971519

asked Jan 3 at 11:58

jogall

422417

add a comment |

I have some reference tables in a script in which some values are occasionally updated manually.

With the usual column-wise dataframe definition, it can become troublesome to identify the correct index to change in larger tables: for example, finding the 15th value in each column to change.

Whilst I know these could be defined externally (e.g. a csv), I was wondering whether they could be more easily defined directly in the script for maximum visibility to other users.

A short example using column-wise data.frame creation:

data.frame(id = 1:3, 

           start = as.Date(c("2018-01-01", "2018-03-02", "2018-05-14")), 

           end = c(as.Date("2018-06-30"), Sys.Date(), Sys.Date()))

And using row-wise data.frame creation which still requires explicit variable names in each data.frame (otherwise column names may not match for rbind):

rbind(

  data.frame(id = 1, start = as.Date("2018-01-01"), end = as.Date("2018-06-30")),

  data.frame(id = 2, start = as.Date("2018-03-02"), end = Sys.Date()),

  data.frame(id = 3, start = as.Date("2018-05-14"), end = Sys.Date())

)

How this may be achieved in sql (clearest legibility in my opinion):

CREATE TABLE test (

  id int,

  start_date date,

  end_date date

);



INSERT INTO test

VALUES (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', SYSDATE),

       (3, '2018-05-14', SYSDATE);

I'd also be interested to hear if this is bad practice in general and whether just storing every reference table in a CSV is the best way to go.

edited Jan 3 at 13:09

massisenergy

5971519

asked Jan 3 at 11:58

jogall

422417

I have some reference tables in a script in which some values are occasionally updated manually.

With the usual column-wise dataframe definition, it can become troublesome to identify the correct index to change in larger tables: for example, finding the 15th value in each column to change.

Whilst I know these could be defined externally (e.g. a csv), I was wondering whether they could be more easily defined directly in the script for maximum visibility to other users.

A short example using column-wise data.frame creation:

data.frame(id = 1:3, 

           start = as.Date(c("2018-01-01", "2018-03-02", "2018-05-14")), 

           end = c(as.Date("2018-06-30"), Sys.Date(), Sys.Date()))

And using row-wise data.frame creation which still requires explicit variable names in each data.frame (otherwise column names may not match for rbind):

rbind(

  data.frame(id = 1, start = as.Date("2018-01-01"), end = as.Date("2018-06-30")),

  data.frame(id = 2, start = as.Date("2018-03-02"), end = Sys.Date()),

  data.frame(id = 3, start = as.Date("2018-05-14"), end = Sys.Date())

)

How this may be achieved in sql (clearest legibility in my opinion):

CREATE TABLE test (

  id int,

  start_date date,

  end_date date

);



INSERT INTO test

VALUES (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', SYSDATE),

       (3, '2018-05-14', SYSDATE);

I'd also be interested to hear if this is bad practice in general and whether just storing every reference table in a CSV is the best way to go.

sql r dataframe

edited Jan 3 at 13:09

massisenergy

5971519

asked Jan 3 at 11:58

jogall

422417

edited Jan 3 at 13:09

massisenergy

5971519

asked Jan 3 at 11:58

jogall

422417

edited Jan 3 at 13:09

massisenergy

5971519

edited Jan 3 at 13:09

massisenergy

5971519

edited Jan 3 at 13:09

massisenergy

5971519

asked Jan 3 at 11:58

jogall

422417

asked Jan 3 at 11:58

jogall

422417

asked Jan 3 at 11:58

jogall

422417

add a comment |

1 Answer
1

active

oldest

votes

1) sqldf package You can do this:

library(sqldf)



SYSDATE <- format(Sys.Date())

fn$sqldf("with test(id, start_date, end_date) as 

  (

     values

       (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', '$SYSDATE'),

       (3, '2018-05-14', '$SYSDATE')

  )

  select * from test")

2) tibble package Another possibility is:

library(tibble)



SYSDATE <- format(Sys.Date())

test <- tribble(

  ~id, ~start_date, ~end_date,

  1, '2018-01-01', '2018-06-30',

  2, '2018-03-02', SYSDATE,

  3, '2018-05-14', SYSDATE)

3) Base R Using only base R we can do:

Lines <- "

id start_date end_date  

1 2018-01-01 2018-06-30

2 2018-03-02 2019-01-03

3 2018-05-14 2019-01-03

"

read.table(text = Lines, header = TRUE, as.is = TRUE)

edited Jan 3 at 13:14

answered Jan 3 at 12:58

G. Grothendieck

154k11137244

Never used the tribble function before but that looks perfect. Thanks for all the suggestions!

– jogall
Jan 3 at 13:22

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54021858%2frow-wise-dataframe-creation-using-sql-style-insert-for-improved-legibility%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

1) sqldf package You can do this:

library(sqldf)



SYSDATE <- format(Sys.Date())

fn$sqldf("with test(id, start_date, end_date) as 

  (

     values

       (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', '$SYSDATE'),

       (3, '2018-05-14', '$SYSDATE')

  )

  select * from test")

2) tibble package Another possibility is:

library(tibble)



SYSDATE <- format(Sys.Date())

test <- tribble(

  ~id, ~start_date, ~end_date,

  1, '2018-01-01', '2018-06-30',

  2, '2018-03-02', SYSDATE,

  3, '2018-05-14', SYSDATE)

3) Base R Using only base R we can do:

Lines <- "

id start_date end_date  

1 2018-01-01 2018-06-30

2 2018-03-02 2019-01-03

3 2018-05-14 2019-01-03

"

read.table(text = Lines, header = TRUE, as.is = TRUE)

edited Jan 3 at 13:14

answered Jan 3 at 12:58

G. Grothendieck

154k11137244

Never used the tribble function before but that looks perfect. Thanks for all the suggestions!

– jogall
Jan 3 at 13:22

add a comment |

1) sqldf package You can do this:

library(sqldf)



SYSDATE <- format(Sys.Date())

fn$sqldf("with test(id, start_date, end_date) as 

  (

     values

       (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', '$SYSDATE'),

       (3, '2018-05-14', '$SYSDATE')

  )

  select * from test")

2) tibble package Another possibility is:

library(tibble)



SYSDATE <- format(Sys.Date())

test <- tribble(

  ~id, ~start_date, ~end_date,

  1, '2018-01-01', '2018-06-30',

  2, '2018-03-02', SYSDATE,

  3, '2018-05-14', SYSDATE)

3) Base R Using only base R we can do:

Lines <- "

id start_date end_date  

1 2018-01-01 2018-06-30

2 2018-03-02 2019-01-03

3 2018-05-14 2019-01-03

"

read.table(text = Lines, header = TRUE, as.is = TRUE)

edited Jan 3 at 13:14

answered Jan 3 at 12:58

G. Grothendieck

154k11137244

Never used the tribble function before but that looks perfect. Thanks for all the suggestions!

– jogall
Jan 3 at 13:22

add a comment |

1) sqldf package You can do this:

library(sqldf)



SYSDATE <- format(Sys.Date())

fn$sqldf("with test(id, start_date, end_date) as 

  (

     values

       (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', '$SYSDATE'),

       (3, '2018-05-14', '$SYSDATE')

  )

  select * from test")

2) tibble package Another possibility is:

library(tibble)



SYSDATE <- format(Sys.Date())

test <- tribble(

  ~id, ~start_date, ~end_date,

  1, '2018-01-01', '2018-06-30',

  2, '2018-03-02', SYSDATE,

  3, '2018-05-14', SYSDATE)

3) Base R Using only base R we can do:

Lines <- "

id start_date end_date  

1 2018-01-01 2018-06-30

2 2018-03-02 2019-01-03

3 2018-05-14 2019-01-03

"

read.table(text = Lines, header = TRUE, as.is = TRUE)

edited Jan 3 at 13:14

answered Jan 3 at 12:58

G. Grothendieck

154k11137244

1) sqldf package You can do this:

library(sqldf)



SYSDATE <- format(Sys.Date())

fn$sqldf("with test(id, start_date, end_date) as 

  (

     values

       (1, '2018-01-01', '2018-06-30'),

       (2, '2018-03-02', '$SYSDATE'),

       (3, '2018-05-14', '$SYSDATE')

  )

  select * from test")

2) tibble package Another possibility is:

library(tibble)



SYSDATE <- format(Sys.Date())

test <- tribble(

  ~id, ~start_date, ~end_date,

  1, '2018-01-01', '2018-06-30',

  2, '2018-03-02', SYSDATE,

  3, '2018-05-14', SYSDATE)

3) Base R Using only base R we can do:

Lines <- "

id start_date end_date  

1 2018-01-01 2018-06-30

2 2018-03-02 2019-01-03

3 2018-05-14 2019-01-03

"

read.table(text = Lines, header = TRUE, as.is = TRUE)

edited Jan 3 at 13:14

answered Jan 3 at 12:58

G. Grothendieck

154k11137244

edited Jan 3 at 13:14

answered Jan 3 at 12:58

G. Grothendieck

154k11137244

answered Jan 3 at 12:58

G. Grothendieck

154k11137244

answered Jan 3 at 12:58

G. Grothendieck

154k11137244

Never used the tribble function before but that looks perfect. Thanks for all the suggestions!

– jogall
Jan 3 at 13:22

add a comment |

Never used the tribble function before but that looks perfect. Thanks for all the suggestions!

– jogall
Jan 3 at 13:22

Never used the tribble function before but that looks perfect. Thanks for all the suggestions!

– jogall
Jan 3 at 13:22

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

Search This Blog

Ufyukyu