Python 3.x: Perform analysis on dictionary of dataframes in loops












0















I have a dataframe (df) with the columns ["Home", "Season", "Date", "Consumption", "Temp"]. I am trying to perform calculations on this dataframe for each combination of "Home" and "Season", using the "Temp" and "Consumption" columns.



In[56]: df['Home'].unique().tolist()
Out[56]: [1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]

In[57]: df['Season'].unique().tolist()
Out[57]: ['Spring', 'Summer', 'Autumn', 'Winter']


Here is what is done so far:



series = {}
for i in df['Home'].unique().tolist():
    for j in df["Season"].unique().tolist():
        series[i, j] = df[(df["Home"] == i) & (df["Consumption"] >= 0) & (df["Season"] == j)]
for key, value in series.items():
    value["Corr"] = value["Temp"].corr(value["Consumption"])


Here is the dictionary of dataframes named "series" that the loop produces:



[Image: the dictionary named "series" produced by the loop]



I expected the last loop to give me a dictionary of dataframes, each with a new "Corr" column holding the correlation between "Temp" and "Consumption". Instead, it gives a single dataframe for the last home in the iteration, i.e. 23.



I simply want to add a sixth column named "Corr" to every dataframe in the dictionary, containing the correlation between "Temp" and "Consumption". Can you help me with this? I think I am missing the use of the keys in the last loop. Thanks in advance!
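For reference, the loop-based idea described above can be sketched as follows. This is only a minimal sketch under assumptions, not the asker's code: the function name `split_with_corr` and the sample data are hypothetical. It iterates over `groupby` and uses `assign`, so each dictionary entry is a new dataframe rather than a slice of `df`.

```python
import pandas as pd


def split_with_corr(df):
    """Return {(home, season): dataframe with an added "Corr" column}."""
    # Drop rows with negative consumption, as in the question.
    valid = df[df["Consumption"] >= 0]
    frames = {}
    # Grouping by two columns yields (home, season) tuples as keys.
    for (home, season), group in valid.groupby(["Home", "Season"]):
        # assign returns a new dataframe; the scalar is broadcast to all rows.
        frames[home, season] = group.assign(
            Corr=group["Temp"].corr(group["Consumption"])
        )
    return frames


# Hypothetical sample data, only for illustration.
sample = pd.DataFrame({
    "Home": [1, 1, 1, 2, 2, 2],
    "Season": ["Spring"] * 6,
    "Temp": [10.0, 12.0, 14.0, 10.0, 12.0, 14.0],
    "Consumption": [5.0, 6.0, 7.0, 9.0, 8.0, 7.0],
})
frames = split_with_corr(sample)
```

Each value in `frames` keeps all original columns and carries one constant "Corr" value per (Home, Season) pair.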



































  • Could you add a small sample input and the expected output? It will make the problem clearer.

    – Daniel Mesejo
    Jan 3 at 9:52











  • done @DanielMesejo

    – PratikSharma
    Jan 3 at 10:02











  • Please post your output as text in code format. The screenshot does not make it clear.

    – Abdur Rehman
    Jan 3 at 10:11


















python python-3.x pandas loops














edited Jan 3 at 10:27 by PratikSharma

asked Jan 3 at 9:50 by PratikSharma

























2 Answers


















1














All of those loops are entirely unnecessary! Simply call:



df.groupby(['Home', 'Season'])[['Consumption', 'Temp']].corr()


(thanks @jezrael for the correction)
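To illustrate what this call returns, here is a minimal sketch on made-up data (the sample values and the names `matrices` and `corr_per_group` are assumptions for illustration). The result holds one 2x2 correlation matrix per (Home, Season) group, so a single Temp-vs-Consumption value per group has to be picked out of it:

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({
    "Home": [1, 1, 1, 2, 2, 2],
    "Season": ["Spring"] * 6,
    "Temp": [10.0, 12.0, 14.0, 10.0, 12.0, 14.0],
    "Consumption": [5.0, 6.0, 7.0, 9.0, 8.0, 7.0],
})

# One 2x2 correlation matrix per group, stacked under a three-level index
# (Home, Season, column name).
matrices = df.groupby(["Home", "Season"])[["Consumption", "Temp"]].corr()

# Select the "Temp" row of every matrix and read its "Consumption" entry:
# one correlation value per (Home, Season) pair.
corr_per_group = matrices.xs("Temp", level=2)["Consumption"].rename("Corr")
```

`corr_per_group` is a Series indexed by (Home, Season), which can then be merged back onto the original dataframe if the other columns are needed.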
































  • That would remove the Date column, which I need for further steps. Is there any way around that? It would also create two correlation columns: (1) Consumption with Temp and (2) Temp with Consumption.

    – PratikSharma
    Jan 3 at 12:39













  • I think it would help if you give an example of the output dataframe you'd like to get

    – Josh Friedlander
    Jan 3 at 12:48



















0














One of the answers to "How to find the correlation between a group of values in a pandas dataframe column" helped, and it avoids all the unnecessary loops. Thanks @jezrael and @JoshFriedlander for suggesting the groupby method. Upvoted.



Posting the solution here:



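import pandas as pd

# Keep only rows with non-negative consumption
df = df[df["Consumption"] >= 0]

# Correlation of Temp with Consumption for each (Home, Season) group
corrs = (
    df[["Home", "Season", "Temp"]]
    .groupby(["Home", "Season"])
    .corrwith(df["Consumption"])
    .rename(columns={"Temp": "Corr"})
    .reset_index()
)

# Attach the per-group correlation to every row of the original dataframe
df = pd.merge(df, corrs, how="left", on=["Home", "Season"])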
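The same result can probably also be obtained without `corrwith`, by computing the correlation inside each group with `apply`. The following is a minimal sketch under assumptions: the function name `add_group_corr` and the sample data are hypothetical and not part of the original solution.

```python
import pandas as pd


def add_group_corr(df):
    """Add a "Corr" column: Temp-vs-Consumption correlation per (Home, Season)."""
    # Keep only rows with non-negative consumption.
    valid = df[df["Consumption"] >= 0].copy()
    # One scalar per group; the result is a Series indexed by (Home, Season).
    corr = (
        valid.groupby(["Home", "Season"])[["Temp", "Consumption"]]
        .apply(lambda g: g["Temp"].corr(g["Consumption"]))
        .rename("Corr")
        .reset_index()
    )
    # A left merge keeps every row and column (including Date) of the filtered data.
    return valid.merge(corr, how="left", on=["Home", "Season"])


# Hypothetical sample data, only for illustration.
sample = pd.DataFrame({
    "Home": [1, 1, 1, 2, 2, 2, 2],
    "Season": ["Spring"] * 7,
    "Date": pd.date_range("2019-01-01", periods=7),
    "Temp": [10.0, 12.0, 14.0, 10.0, 12.0, 14.0, 16.0],
    "Consumption": [5.0, 6.0, 7.0, 9.0, 8.0, 7.0, -1.0],
})
result = add_group_corr(sample)
```

Here `result` has the five original columns plus "Corr", and the row with negative consumption is dropped before the correlation is computed.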





The first answer above (groupby with corr) was posted by Josh Friedlander: answered Jan 3 at 11:50, edited Jan 3 at 12:04.





















The second answer above (groupby with corrwith and merge) was posted by PratikSharma: answered Jan 4 at 7:17, edited Jan 4 at 7:23.





























