How to assign different row's value to new column
I'm trying to add a column, 'C_End', to a DataFrame in Pandas that looks something like this:
df = pd.DataFrame({'ID':[123,123,123,456,456,789],
'C_ID':[8,10,35,36,40,7],
'C_Type':['New','Renew','Renew','New','Term','New'],
'Rank':[1,2,3,1,2,1]})
The new column needs to be the next 'C_Type' for each ID based on 'Rank', resulting in a DataFrame that looks like this:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
Essentially, I want to find the row where ID = ID and Rank = Rank+1 and assign C_Type to new column C_End. I've tried creating a function and using Apply (below), but that took forever and eventually gave me an error. I'm still new to Pandas and Python in general, but I feel like there has to be an easy solution that I'm not seeing.
def get_next_c_type(row):
return df.loc[(df['id'] == row['id']) & (df['rank'] == row['rank'] + 1),'c_type']
df['c_end'] = df.apply(get_next_c_type, axis = 1)
python pandas dataframe
add a comment |
I'm trying to add a column, 'C_End', to a DataFrame in Pandas that looks something like this:
df = pd.DataFrame({'ID':[123,123,123,456,456,789],
'C_ID':[8,10,35,36,40,7],
'C_Type':['New','Renew','Renew','New','Term','New'],
'Rank':[1,2,3,1,2,1]})
The new column needs to be the next 'C_Type' for each ID based on 'Rank', resulting in a DataFrame that looks like this:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
Essentially, I want to find the row where ID = ID and Rank = Rank+1 and assign C_Type to new column C_End. I've tried creating a function and using Apply (below), but that took forever and eventually gave me an error. I'm still new to Pandas and Python in general, but I feel like there has to be an easy solution that I'm not seeing.
def get_next_c_type(row):
return df.loc[(df['id'] == row['id']) & (df['rank'] == row['rank'] + 1),'c_type']
df['c_end'] = df.apply(get_next_c_type, axis = 1)
python pandas dataframe
add a comment |
I'm trying to add a column, 'C_End', to a DataFrame in Pandas that looks something like this:
df = pd.DataFrame({'ID':[123,123,123,456,456,789],
'C_ID':[8,10,35,36,40,7],
'C_Type':['New','Renew','Renew','New','Term','New'],
'Rank':[1,2,3,1,2,1]})
The new column needs to be the next 'C_Type' for each ID based on 'Rank', resulting in a DataFrame that looks like this:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
Essentially, I want to find the row where ID = ID and Rank = Rank+1 and assign C_Type to new column C_End. I've tried creating a function and using Apply (below), but that took forever and eventually gave me an error. I'm still new to Pandas and Python in general, but I feel like there has to be an easy solution that I'm not seeing.
def get_next_c_type(row):
return df.loc[(df['id'] == row['id']) & (df['rank'] == row['rank'] + 1),'c_type']
df['c_end'] = df.apply(get_next_c_type, axis = 1)
python pandas dataframe
I'm trying to add a column, 'C_End', to a DataFrame in Pandas that looks something like this:
df = pd.DataFrame({'ID':[123,123,123,456,456,789],
'C_ID':[8,10,35,36,40,7],
'C_Type':['New','Renew','Renew','New','Term','New'],
'Rank':[1,2,3,1,2,1]})
The new column needs to be the next 'C_Type' for each ID based on 'Rank', resulting in a DataFrame that looks like this:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
Essentially, I want to find the row where ID = ID and Rank = Rank+1 and assign C_Type to new column C_End. I've tried creating a function and using Apply (below), but that took forever and eventually gave me an error. I'm still new to Pandas and Python in general, but I feel like there has to be an easy solution that I'm not seeing.
def get_next_c_type(row):
return df.loc[(df['id'] == row['id']) & (df['rank'] == row['rank'] + 1),'c_type']
df['c_end'] = df.apply(get_next_c_type, axis = 1)
python pandas dataframe
python pandas dataframe
asked Jan 2 at 22:49
AndrewAndrew
234
234
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
Try:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)
Or as @W-B suggest:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
Output:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 NaN
3 456 36 New 1 Term
4 456 40 Term 2 NaN
5 789 7 New 1 NaN
Why not justdf.groupby('ID').C_Type.shift(-1)
– Wen-Ben
Jan 2 at 23:16
I think this does not satisfyID = ID and Rank = Rank+1
– yatu
Jan 2 at 23:21
@W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However,df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
will work also. :) Thanks.
– Scott Boston
Jan 3 at 14:10
add a comment |
Here's one way using np.where
:
dfs = df.shift(-1)
m1 = df.ID == dfs.ID
m2 = df.Rank + 1 == dfs.Rank
df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54014193%2fhow-to-assign-different-rows-value-to-new-column%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Try:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)
Or as @W-B suggest:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
Output:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 NaN
3 456 36 New 1 Term
4 456 40 Term 2 NaN
5 789 7 New 1 NaN
Why not justdf.groupby('ID').C_Type.shift(-1)
– Wen-Ben
Jan 2 at 23:16
I think this does not satisfyID = ID and Rank = Rank+1
– yatu
Jan 2 at 23:21
@W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However,df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
will work also. :) Thanks.
– Scott Boston
Jan 3 at 14:10
add a comment |
Try:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)
Or as @W-B suggest:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
Output:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 NaN
3 456 36 New 1 Term
4 456 40 Term 2 NaN
5 789 7 New 1 NaN
Why not justdf.groupby('ID').C_Type.shift(-1)
– Wen-Ben
Jan 2 at 23:16
I think this does not satisfyID = ID and Rank = Rank+1
– yatu
Jan 2 at 23:21
@W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However,df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
will work also. :) Thanks.
– Scott Boston
Jan 3 at 14:10
add a comment |
Try:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)
Or as @W-B suggest:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
Output:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 NaN
3 456 36 New 1 Term
4 456 40 Term 2 NaN
5 789 7 New 1 NaN
Try:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].transform('shift',-1)
Or as @W-B suggest:
df['C_End'] = df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
Output:
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 NaN
3 456 36 New 1 Term
4 456 40 Term 2 NaN
5 789 7 New 1 NaN
edited Jan 3 at 14:11
answered Jan 2 at 22:59


Scott BostonScott Boston
57.7k73258
57.7k73258
Why not justdf.groupby('ID').C_Type.shift(-1)
– Wen-Ben
Jan 2 at 23:16
I think this does not satisfyID = ID and Rank = Rank+1
– yatu
Jan 2 at 23:21
@W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However,df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
will work also. :) Thanks.
– Scott Boston
Jan 3 at 14:10
add a comment |
Why not justdf.groupby('ID').C_Type.shift(-1)
– Wen-Ben
Jan 2 at 23:16
I think this does not satisfyID = ID and Rank = Rank+1
– yatu
Jan 2 at 23:21
@W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However,df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
will work also. :) Thanks.
– Scott Boston
Jan 3 at 14:10
Why not just
df.groupby('ID').C_Type.shift(-1)
– Wen-Ben
Jan 2 at 23:16
Why not just
df.groupby('ID').C_Type.shift(-1)
– Wen-Ben
Jan 2 at 23:16
I think this does not satisfy
ID = ID and Rank = Rank+1
– yatu
Jan 2 at 23:21
I think this does not satisfy
ID = ID and Rank = Rank+1
– yatu
Jan 2 at 23:21
@W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However,
df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
will work also. :) Thanks.– Scott Boston
Jan 3 at 14:10
@W-B I do think we should include the sort_values by rank just to ensure the condition of getting the C_Type for the next rank. However,
df.sort_values('Rank').groupby('ID')['C_Type'].shift(-1)
will work also. :) Thanks.– Scott Boston
Jan 3 at 14:10
add a comment |
Here's one way using np.where
:
dfs = df.shift(-1)
m1 = df.ID == dfs.ID
m2 = df.Rank + 1 == dfs.Rank
df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
add a comment |
Here's one way using np.where
:
dfs = df.shift(-1)
m1 = df.ID == dfs.ID
m2 = df.Rank + 1 == dfs.Rank
df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
add a comment |
Here's one way using np.where
:
dfs = df.shift(-1)
m1 = df.ID == dfs.ID
m2 = df.Rank + 1 == dfs.Rank
df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
Here's one way using np.where
:
dfs = df.shift(-1)
m1 = df.ID == dfs.ID
m2 = df.Rank + 1 == dfs.Rank
df.loc[:, 'C_End'] = np.where(m1 & m2, dfs.C_Type, None)
ID C_ID C_Type Rank C_End
0 123 8 New 1 Renew
1 123 10 Renew 2 Renew
2 123 35 Renew 3 None
3 456 36 New 1 Term
4 456 40 Term 2 None
5 789 7 New 1 None
answered Jan 2 at 23:14


yatuyatu
15.6k41542
15.6k41542
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54014193%2fhow-to-assign-different-rows-value-to-new-column%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown