How to assign different row's value to new column












4















I'm trying to add a column, 'C_End', to a DataFrame in Pandas that looks something like this:



df = pd.DataFrame({'ID':[123,123,123,456,456,789],
'C_ID':[8,10,35,36,40,7],
'C_Type':['New','Renew','Renew','New','Term','New'],
'Rank':[1,2,3,1,2,1]})


The new column needs to be the next 'C_Type' for each ID based on 'Rank', resulting in a DataFrame that looks like this:



    ID  C_ID C_Type Rank  C_End  
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None


Essentially, I want to find the row where ID = ID and Rank = Rank+1 and assign C_Type to new column C_End. I've tried creating a function and using Apply (below), but that took forever and eventually gave me an error. I'm still new to Pandas and Python in general, but I feel like there has to be an easy solution that I'm not seeing.



def get_next_c_type(row):
return df.loc[(df['id'] == row['id']) & (df['rank'] == row['rank'] + 1),'c_type']

df['c_end'] = df.apply(get_next_c_type, axis = 1)









share|improve this question



























    4















    I'm trying to add a column, 'C_End', to a DataFrame in Pandas that looks something like this:



    df = pd.DataFrame({'ID':[123,123,123,456,456,789],
    'C_ID':[8,10,35,36,40,7],
    'C_Type':['New','Renew','Renew','New','Term','New'],
    'Rank':[1,2,3,1,2,1]})


    The new column needs to be the next 'C_Type' for each ID based on 'Rank', resulting in a DataFrame that looks like this:



        ID  C_ID C_Type Rank  C_End  
    0 123 8 New 1 Renew
    1 123 10 Renew 2 Renew
    2 123 35 Renew 3 None
    3 456 36 New 1 Term
    4 456 40 Term 2 None
    5 789 7 New 1 None


    Essentially, I want to find the row where ID = ID and Rank = Rank+1 and assign C_Type to new column C_End. I've tried creating a function and using Apply (below), but that took forever and eventually gave me an error. I'm still new to Pandas and Python in general, but I feel like there has to be an easy solution that I'm not seeing.



    def get_next_c_type(row):
    return df.loc[(df['id'] == row['id']) & (df['rank'] == row['rank'] + 1),'c_type']

    df['c_end'] = df.apply(get_next_c_type, axis = 1)









    share|improve this question

























      4












      4








      4








      I'm trying to add a column, 'C_End', to a DataFrame in Pandas that looks something like this:



      df = pd.DataFrame({'ID':[123,123,123,456,456,789],
      'C_ID':[8,10,35,36,40,7],
      'C_Type':['New','Renew','Renew','New','Term','New'],
      'Rank':[1,2,3,1,2,1]})


      The new column needs to be the next 'C_Type' for each ID based on 'Rank', resulting in a DataFrame that looks like this:



          ID  C_ID C_Type Rank  C_End  
      0 123 8 New 1 Renew
      1 123 10 Renew 2 Renew
      2 123 35 Renew 3 None
      3 456 36 New 1 Term
      4 456 40 Term 2 None
      5 789 7 New 1 None


      Essentially, I want to find the row where ID = ID and Rank = Rank+1 and assign C_Type to new column C_End. I've tried creating a function and using Apply (below), but that took forever and eventually gave me an error. I'm still new to Pandas and Python in general, but I feel like there has to be an easy solution that I'm not seeing.



      def get_next_c_type(row):
      return df.loc[(df['id'] == row['id']) & (df['rank'] == row['rank'] + 1),'c_type']

      df['c_end'] = df.apply(get_next_c_type, axis = 1)









      share|improve this question














      I'm trying to add a column, 'C_End', to a DataFrame in Pandas that looks something like this:



      df = pd.DataFrame({'ID':[123,123,123,456,456,789],
      'C_ID':[8,10,35,36,40,7],
      'C_Type':['New','Renew','Renew','New','Term','New'],
      'Rank':[1,2,3,1,2,1]})


      The new column needs to be the next 'C_Type' for each ID based on 'Rank', resulting in a DataFrame that looks like this:



          ID  C_ID C_Type Rank  C_End  
      0 123 8 New 1 Renew
      1 123 10 Renew 2 Renew
      2 123 35 Renew 3 None
      3 456 36 New 1 Term
      4 456 40 Term 2 None
      5 789 7 New 1 None


      Essentially, I want to find the row where ID = ID and Rank = Rank+1 and assign C_Type to new column C_End. I've tried creating a function and using Apply (below), but that took forever and eventually gave me an error. I'm still new to Pandas and Python in general, but I feel like there has to be an easy solution that I'm not seeing.



      def get_next_c_type(row):
      return df.loc[(df['id'] == row['id']) & (df['rank'] == row['rank'] + 1),'c_type']

      df['c_end'] = df.apply(get_next_c_type, axis = 1)






      python pandas dataframe






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Jan 2 at 22:49









      AndrewAndrew

      234




      234
























          2 Answers
          2






          active

          oldest

          votes


















          3














          Try:



          df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)


          Or as @W-B suggest:



          df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)


          Output:



              ID  C_ID C_Type  Rank  C_End
          0 123 8 New 1 Renew
          1 123 10 Renew 2 Renew
          2 123 35 Renew 3 NaN
          3 456 36 New 1 Term
          4 456 40 Term 2 NaN
          5 789 7 New 1 NaN





          share|improve this answer


























          • Why not just df.groupby('ID').C_Type.shift(-1)

            – Wen-Ben
            Jan 2 at 23:16











          • I think this does not satisfy ID = ID and Rank = Rank+1

            – yatu
            Jan 2 at 23:21













          • @W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However, df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1) will work also. :) Thanks.

            – Scott Boston
            Jan 3 at 14:10



















          1














          Here's one way using np.where:



          dfs = df.shift(-1)
          m1 = df.ID == dfs.ID
          m2 = df.Rank + 1 == dfs.Rank
          df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)

          ID C_ID C_Type Rank C_End
          0 123 8 New 1 Renew
          1 123 10 Renew 2 Renew
          2 123 35 Renew 3 None
          3 456 36 New 1 Term
          4 456 40 Term 2 None
          5 789 7 New 1 None





          share|improve this answer
























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54014193%2fhow-to-assign-different-rows-value-to-new-column%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            3














            Try:



            df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)


            Or as @W-B suggest:



            df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)


            Output:



                ID  C_ID C_Type  Rank  C_End
            0 123 8 New 1 Renew
            1 123 10 Renew 2 Renew
            2 123 35 Renew 3 NaN
            3 456 36 New 1 Term
            4 456 40 Term 2 NaN
            5 789 7 New 1 NaN





            share|improve this answer


























            • Why not just df.groupby('ID').C_Type.shift(-1)

              – Wen-Ben
              Jan 2 at 23:16











            • I think this does not satisfy ID = ID and Rank = Rank+1

              – yatu
              Jan 2 at 23:21













            • @W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However, df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1) will work also. :) Thanks.

              – Scott Boston
              Jan 3 at 14:10
















            3














            Try:



            df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)


            Or as @W-B suggest:



            df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)


            Output:



                ID  C_ID C_Type  Rank  C_End
            0 123 8 New 1 Renew
            1 123 10 Renew 2 Renew
            2 123 35 Renew 3 NaN
            3 456 36 New 1 Term
            4 456 40 Term 2 NaN
            5 789 7 New 1 NaN





            share|improve this answer


























            • Why not just df.groupby('ID').C_Type.shift(-1)

              – Wen-Ben
              Jan 2 at 23:16











            • I think this does not satisfy ID = ID and Rank = Rank+1

              – yatu
              Jan 2 at 23:21













            • @W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However, df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1) will work also. :) Thanks.

              – Scott Boston
              Jan 3 at 14:10














            3












            3








            3







            Try:



            df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)


            Or as @W-B suggest:



            df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)


            Output:



                ID  C_ID C_Type  Rank  C_End
            0 123 8 New 1 Renew
            1 123 10 Renew 2 Renew
            2 123 35 Renew 3 NaN
            3 456 36 New 1 Term
            4 456 40 Term 2 NaN
            5 789 7 New 1 NaN





            share|improve this answer















            Try:



            df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)


            Or as @W-B suggest:



            df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)


            Output:



                ID  C_ID C_Type  Rank  C_End
            0 123 8 New 1 Renew
            1 123 10 Renew 2 Renew
            2 123 35 Renew 3 NaN
            3 456 36 New 1 Term
            4 456 40 Term 2 NaN
            5 789 7 New 1 NaN






            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Jan 3 at 14:11

























            answered Jan 2 at 22:59









            Scott BostonScott Boston

            57.7k73258




            57.7k73258













            • Why not just df.groupby('ID').C_Type.shift(-1)

              – Wen-Ben
              Jan 2 at 23:16











            • I think this does not satisfy ID = ID and Rank = Rank+1

              – yatu
              Jan 2 at 23:21













            • @W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However, df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1) will work also. :) Thanks.

              – Scott Boston
              Jan 3 at 14:10



















            • Why not just df.groupby('ID').C_Type.shift(-1)

              – Wen-Ben
              Jan 2 at 23:16











            • I think this does not satisfy ID = ID and Rank = Rank+1

              – yatu
              Jan 2 at 23:21













            • @W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However, df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1) will work also. :) Thanks.

              – Scott Boston
              Jan 3 at 14:10

















            Why not just df.groupby('ID').C_Type.shift(-1)

            – Wen-Ben
            Jan 2 at 23:16





            Why not just df.groupby('ID').C_Type.shift(-1)

            – Wen-Ben
            Jan 2 at 23:16













            I think this does not satisfy ID = ID and Rank = Rank+1

            – yatu
            Jan 2 at 23:21







            I think this does not satisfy ID = ID and Rank = Rank+1

            – yatu
            Jan 2 at 23:21















            @W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However, df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1) will work also. :) Thanks.

            – Scott Boston
            Jan 3 at 14:10





            @W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However, df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1) will work also. :) Thanks.

            – Scott Boston
            Jan 3 at 14:10













            1














            Here's one way using np.where:



            dfs = df.shift(-1)
            m1 = df.ID == dfs.ID
            m2 = df.Rank + 1 == dfs.Rank
            df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)

            ID C_ID C_Type Rank C_End
            0 123 8 New 1 Renew
            1 123 10 Renew 2 Renew
            2 123 35 Renew 3 None
            3 456 36 New 1 Term
            4 456 40 Term 2 None
            5 789 7 New 1 None





            share|improve this answer




























              1














              Here's one way using np.where:



              dfs = df.shift(-1)
              m1 = df.ID == dfs.ID
              m2 = df.Rank + 1 == dfs.Rank
              df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)

              ID C_ID C_Type Rank C_End
              0 123 8 New 1 Renew
              1 123 10 Renew 2 Renew
              2 123 35 Renew 3 None
              3 456 36 New 1 Term
              4 456 40 Term 2 None
              5 789 7 New 1 None





              share|improve this answer


























                1












                1








                1







                Here's one way using np.where:



                dfs = df.shift(-1)
                m1 = df.ID == dfs.ID
                m2 = df.Rank + 1 == dfs.Rank
                df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)

                ID C_ID C_Type Rank C_End
                0 123 8 New 1 Renew
                1 123 10 Renew 2 Renew
                2 123 35 Renew 3 None
                3 456 36 New 1 Term
                4 456 40 Term 2 None
                5 789 7 New 1 None





                share|improve this answer













                Here's one way using np.where:



                dfs = df.shift(-1)
                m1 = df.ID == dfs.ID
                m2 = df.Rank + 1 == dfs.Rank
                df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)

                ID C_ID C_Type Rank C_End
                0 123 8 New 1 Renew
                1 123 10 Renew 2 Renew
                2 123 35 Renew 3 None
                3 456 36 New 1 Term
                4 456 40 Term 2 None
                5 789 7 New 1 None






                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Jan 2 at 23:14









                yatuyatu

                15.6k41542




                15.6k41542






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Stack Overflow!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54014193%2fhow-to-assign-different-rows-value-to-new-column%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    MongoDB - Not Authorized To Execute Command

                    How to fix TextFormField cause rebuild widget in Flutter

                    Npm cannot find a required file even through it is in the searched directory