Using dplyr within a user-defined function to summarise data then plot it












I am trying to write a user-defined function that takes several arguments, summarises the data with dplyr, and then plots the result with ggplot2.

Here is some sample data, together with the dplyr summary and plot I am trying to produce:



library(dplyr)
library(ggplot2)

df <- data.frame(
  Year = c("2006", "2006", "2006", "2007", "2007", "2007", "2008", "2009", "2010", "2010", "2009", "2009"),
  JudicialOrientation = c("Defense", "Plaintiff", "Plaintiff", "Neutral", "Defense", "Plaintiff", "Defense", "Plaintiff", "Neutral", "Neutral", "Plaintiff", "Defense"),
  Loss = c(100000, 100, 2500, 100000, 25000, 0, 7500, 5200, 900, 100, 0, 50)
)

# Mean loss per year and judicial orientation
df1 <- df %>%
  group_by(Year, JudicialOrientation) %>%
  summarise(MeanLoss = mean(Loss))

ggplot(df1, aes(x = JudicialOrientation, y = MeanLoss, color = Year, group = Year)) +
  geom_line() +
  geom_point()


I am now trying to wrap this in a user-defined function so that I can pass different variables and get similar results.



Here is my attempt so far:



ConsistencyPlot <- function(df, var1, timevar, lossvar) {

  df1 <- df %>%
    group_by_(df[timevar], df[var1]) %>%
    summarise_(MeanLoss = mean(df[lossvar]))

  ggplot(df1, aes(x = var1, y = MeanLoss, color = timevar, group = timevar)) +
    geom_line() +
    geom_point()

}

ConsistencyPlot(df, "JudicialOrientation", "Year", "Loss")


I am replicating the same logic, passing in df as my data frame, var1 as JudicialOrientation, timevar as Year, and lossvar as the column of Loss values that I want averaged by summarise. However, I cannot get the same results, so I feel I am missing something about how these functions behave inside a closure.










      r ggplot2 dplyr aggregate






edited Nov 21 '18 at 17:59 by Tjebo
asked Nov 21 '18 at 14:30 by Coldchain9
























          1 Answer

First of all, inside dplyr functions you don't need to index the data frame to refer to a variable, as in df[, timevar]; just use the variable name. Besides that, df[timevar] does not give you the column itself: it returns a one-column data frame, which is not what group_by_() or mean() expect.
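As a quick illustration of the indexing point, here is a minimal base R sketch. The small `df` below is made up for the example (it is not the question's data). It suggests that single-bracket indexing by column name returns a data frame, while double brackets return the underlying vector:

```r
# Toy data, invented for this illustration only
df <- data.frame(Year = c("2006", "2007"), Loss = c(100, 300))
lossvar <- "Loss"

as_df  <- df[lossvar]    # single brackets: a one-column data frame
as_vec <- df[[lossvar]]  # double brackets: the numeric vector itself

class(as_df)
class(as_vec)

# mean() on a data frame warns and returns NA; on the vector it works
bad_mean  <- suppressWarnings(mean(as_df))
good_mean <- mean(as_vec)
```

This is why wrapping `df[lossvar]` in `mean()` cannot reproduce the grouped means, independently of the evaluation issue discussed next.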



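As for the function itself, the problem is one of evaluation: dplyr verbs and aes() quote their arguments, so a column name received as a function argument has to be captured first and then unquoted where it is used.

The following structure works: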



ConsistencyPlot <- function(df, var1, timevar, lossvar) {
  # Capture the bare column names supplied by the caller as quosures
  var1 <- enquo(var1)
  timevar <- enquo(timevar)
  lossvar <- enquo(lossvar)

  # Unquote with !! so dplyr sees the captured column names
  df1 <- df %>%
    group_by(!!timevar, !!var1) %>%
    summarise(MeanLoss = mean(!!lossvar))

  # aes() accepts !! as well
  ggplot(df1, aes(x = !!var1, y = MeanLoss, color = !!timevar, group = !!timevar)) +
    geom_line() +
    geom_point()
}


Note that the parameters are captured with enquo() and then unquoted with !! inside the dplyr and ggplot2 calls (unquoting inside aes() requires ggplot2 3.0.0 or later). This lets you pass the column names bare, without quotation marks:

ConsistencyPlot(df, JudicialOrientation, Year, Loss)
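As an alternative that is not part of the original answer, later releases of rlang (0.4.0 and up) provide the embrace operator `{{ }}`, which combines the enquo() and !! steps. The sketch below assumes a recent dplyr (1.0.0 or later, for the `.groups` argument) and ggplot2; the name `consistency_plot_embrace` is made up for this example:

```r
library(dplyr)
library(ggplot2)

df <- data.frame(
  Year = c("2006", "2006", "2006", "2007", "2007", "2007",
           "2008", "2009", "2010", "2010", "2009", "2009"),
  JudicialOrientation = c("Defense", "Plaintiff", "Plaintiff", "Neutral",
                          "Defense", "Plaintiff", "Defense", "Plaintiff",
                          "Neutral", "Neutral", "Plaintiff", "Defense"),
  Loss = c(100000, 100, 2500, 100000, 25000, 0, 7500, 5200, 900, 100, 0, 50)
)

# Hypothetical variant of the answer's function using {{ }} instead of
# enquo() followed by !!
consistency_plot_embrace <- function(df, var1, timevar, lossvar) {
  df1 <- df %>%
    group_by({{ timevar }}, {{ var1 }}) %>%
    # .groups = "drop" assumes dplyr >= 1.0.0
    summarise(MeanLoss = mean({{ lossvar }}), .groups = "drop")

  ggplot(df1, aes(x = {{ var1 }}, y = MeanLoss,
                  color = {{ timevar }}, group = {{ timevar }})) +
    geom_line() +
    geom_point()
}

p <- consistency_plot_embrace(df, JudicialOrientation, Year, Loss)
```

The call site is the same as before: column names are passed bare, and printing `p` draws the plot.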


          I hope you find it useful.
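If you would rather keep the original call style with quoted column names, one possible sketch (again not from the answer above, and assuming dplyr 1.0.0 or later and ggplot2 3.0.0 or later) uses `across(all_of())` for grouping and the `.data` pronoun elsewhere. The name `consistency_plot_chr` is invented for this example:

```r
library(dplyr)
library(ggplot2)

df <- data.frame(
  Year = c("2006", "2006", "2006", "2007", "2007", "2007",
           "2008", "2009", "2010", "2010", "2009", "2009"),
  JudicialOrientation = c("Defense", "Plaintiff", "Plaintiff", "Neutral",
                          "Defense", "Plaintiff", "Defense", "Plaintiff",
                          "Neutral", "Neutral", "Plaintiff", "Defense"),
  Loss = c(100000, 100, 2500, 100000, 25000, 0, 7500, 5200, 900, 100, 0, 50)
)

# Hypothetical variant that takes column names as character strings
consistency_plot_chr <- function(df, var1, timevar, lossvar) {
  df1 <- df %>%
    # all_of() selects the columns named by the strings
    group_by(across(all_of(c(timevar, var1)))) %>%
    # .data[[...]] looks up a column of the current data by name
    summarise(MeanLoss = mean(.data[[lossvar]]), .groups = "drop")

  ggplot(df1, aes(x = .data[[var1]], y = MeanLoss,
                  color = .data[[timevar]], group = .data[[timevar]])) +
    geom_line() +
    geom_point()
}

p_chr <- consistency_plot_chr(df, "JudicialOrientation", "Year", "Loss")
```

Which style to prefer is a design choice: bare names read like ordinary dplyr code, while strings are convenient when the column names come from a vector or a configuration file.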






edited Nov 21 '18 at 15:32
answered Nov 21 '18 at 15:12 by Bruno Pinheiro













• I realized I was referencing column names wrong right after I posted this question. Can you explain to me what !! is doing? This is exactly what I wanted. Thank you very much.

  – Coldchain9
  Nov 21 '18 at 15:19

• It is an unquoting operator. See ?"!!".

  – Anonymous coward
  Nov 21 '18 at 15:20

• It's exactly what @Anonymouscoward said. For a deeper explanation, take a look here. Happy to help.

  – Bruno Pinheiro
  Nov 21 '18 at 15:29

• I guess I am just trying to figure out why the quoting-unquoting approach works when simply passing the arguments in unquoted form does not. The function can't recognize the variables if I pass them in directly, so why does it work when they are captured with enquo() and then immediately unquoted with !!? I read ?enquo and, if I understand correctly, the quosure keeps the original environment while !! just removes the quoting for evaluation purposes?

  – Coldchain9
  Nov 21 '18 at 16:12

• @Coldchain9: see this for further explanation stackoverflow.com/questions/51738267/…

  – Tung
  Nov 21 '18 at 16:36
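The mechanics asked about in these comments can be inspected directly with rlang. The following is only a rough sketch on made-up data; the helper names `capture` and `naive` are invented for illustration. It suggests that enquo() stores the caller's expression unevaluated, and that !! inserts that stored expression into another quoted call, where it is later evaluated against the data:

```r
library(rlang)

# Toy data, invented for this illustration only
df <- data.frame(Loss = c(100, 200, 300))

# Without capturing, R evaluates the argument as an ordinary variable
# and fails, because there is no object called Loss outside the data frame
naive <- function(data, x) mean(x)
naive_result <- tryCatch(naive(df, Loss), error = function(e) conditionMessage(e))

# enquo() captures the expression (plus its environment) instead of evaluating it
capture <- function(x) enquo(x)
q <- capture(Loss)

is_quosure(q)   # the captured object is a quosure
as_label(q)     # the expression it holds, as text

# Evaluating the quosure with the data frame as context finds the column
values <- eval_tidy(q, data = df)

# !! inserts the captured expression into a larger quoted expression
spliced <- expr(mean(!!q))
spliced_mean <- eval_tidy(spliced, data = df)
```

dplyr verbs and aes() perform the last two steps internally, which is why `mean(!!lossvar)` inside summarise() behaves as if `mean(Loss)` had been typed there.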


















