Populate Pandas DataFrame using a dictionary based on a condition
I have a DataFrame
>> test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4], 'C': [np.nan, np.nan, np.nan, np.nan], 'D': [np.nan, np.nan, np.nan, np.nan]})
A B C D
0 a 1
1 b 2
2 b 3
3 b 4
I also have a dictionary, where b
in input_b
signifies that I'm only modifying rows where row.A = b
.
>> input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
How do I populate the DataFrame with values from the dictionary to get
A B C D
0 a 1
1 b 2 Moon Elephant
2 b 3
3 b 4 Sun Mouse
python pandas dataframe assign
add a comment |
I have a DataFrame
>> test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4], 'C': [np.nan, np.nan, np.nan, np.nan], 'D': [np.nan, np.nan, np.nan, np.nan]})
A B C D
0 a 1
1 b 2
2 b 3
3 b 4
I also have a dictionary, where b
in input_b
signifies that I'm only modifying rows where row.A = b
.
>> input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
How do I populate the DataFrame with values from the dictionary to get
A B C D
0 a 1
1 b 2 Moon Elephant
2 b 3
3 b 4 Sun Mouse
python pandas dataframe assign
I understand how to do this by first turning the dictionary into a dataframe and then merge'ing it with the first. However, this gets very slow if the first dataframe (test) is much larger than its portion that I wish to modify.
– GingerBadger
Jan 2 at 23:55
1
I think transforming your dictionary into a dataframe usingpd.from_dict(your_dictionary)
and then merging withinput_b
seems to me like the best solution. I would be surprised if there is a better solution.
– KenHBS
Jan 3 at 0:30
add a comment |
I have a DataFrame
>> test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4], 'C': [np.nan, np.nan, np.nan, np.nan], 'D': [np.nan, np.nan, np.nan, np.nan]})
A B C D
0 a 1
1 b 2
2 b 3
3 b 4
I also have a dictionary, where b
in input_b
signifies that I'm only modifying rows where row.A = b
.
>> input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
How do I populate the DataFrame with values from the dictionary to get
A B C D
0 a 1
1 b 2 Moon Elephant
2 b 3
3 b 4 Sun Mouse
python pandas dataframe assign
I have a DataFrame
>> test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4], 'C': [np.nan, np.nan, np.nan, np.nan], 'D': [np.nan, np.nan, np.nan, np.nan]})
A B C D
0 a 1
1 b 2
2 b 3
3 b 4
I also have a dictionary, where b
in input_b
signifies that I'm only modifying rows where row.A = b
.
>> input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
How do I populate the DataFrame with values from the dictionary to get
A B C D
0 a 1
1 b 2 Moon Elephant
2 b 3
3 b 4 Sun Mouse
python pandas dataframe assign
python pandas dataframe assign
edited Jan 3 at 0:46
GingerBadger
asked Jan 2 at 23:52
GingerBadgerGingerBadger
1166
1166
I understand how to do this by first turning the dictionary into a dataframe and then merge'ing it with the first. However, this gets very slow if the first dataframe (test) is much larger than its portion that I wish to modify.
– GingerBadger
Jan 2 at 23:55
1
I think transforming your dictionary into a dataframe usingpd.from_dict(your_dictionary)
and then merging withinput_b
seems to me like the best solution. I would be surprised if there is a better solution.
– KenHBS
Jan 3 at 0:30
add a comment |
I understand how to do this by first turning the dictionary into a dataframe and then merge'ing it with the first. However, this gets very slow if the first dataframe (test) is much larger than its portion that I wish to modify.
– GingerBadger
Jan 2 at 23:55
1
I think transforming your dictionary into a dataframe usingpd.from_dict(your_dictionary)
and then merging withinput_b
seems to me like the best solution. I would be surprised if there is a better solution.
– KenHBS
Jan 3 at 0:30
I understand how to do this by first turning the dictionary into a dataframe and then merge'ing it with the first. However, this gets very slow if the first dataframe (test) is much larger than its portion that I wish to modify.
– GingerBadger
Jan 2 at 23:55
I understand how to do this by first turning the dictionary into a dataframe and then merge'ing it with the first. However, this gets very slow if the first dataframe (test) is much larger than its portion that I wish to modify.
– GingerBadger
Jan 2 at 23:55
1
1
I think transforming your dictionary into a dataframe using
pd.from_dict(your_dictionary)
and then merging with input_b
seems to me like the best solution. I would be surprised if there is a better solution.– KenHBS
Jan 3 at 0:30
I think transforming your dictionary into a dataframe using
pd.from_dict(your_dictionary)
and then merging with input_b
seems to me like the best solution. I would be surprised if there is a better solution.– KenHBS
Jan 3 at 0:30
add a comment |
4 Answers
4
active
oldest
votes
This may not be the most efficient solution, but from what I understand it got the job done:
import pandas as pd
import numpy as np
test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4],
'C': [np.nan, np.nan, np.nan, np.nan],
'D': [np.nan, np.nan, np.nan, np.nan]})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
for key, value in input_b.items():
test.loc[test['B'] == key, ['C', 'D']] = value
print(test)
Yields:
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
This will get slower if the dictionary input_b
gets too large (too many rows are being updated, too many iterations in the for loop), but should be relatively fast with small input_b
's even with large test
dataframes.
This answer also assumes the keys in the input_b
dictionary refer to the values of the B
column in the original dataframe, and will add repeated values in the C
and D
columns for repeated values in the B
column.
add a comment |
Using update
test=test.set_index('B')
test.update(pd.DataFrame(input_b,index=['C','D']).T)
test=test.reset_index()
test
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
add a comment |
You can use loc
indexing after setting your index to B
:
test = test.set_index('B')
test.loc[input_b, ['C', 'D']] = list(input_b.values())
test = test.reset_index()
print(test)
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
add a comment |
Using apply
test['C'] = test['B'].map(input_b).apply(lambda x: x[0] if type(x)==list else x)
test['D'] = test['B'].map(input_b).apply(lambda x: x[1] if type(x)==list else x)
yields
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54014687%2fpopulate-pandas-dataframe-using-a-dictionary-based-on-a-condition%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
This may not be the most efficient solution, but from what I understand it got the job done:
import pandas as pd
import numpy as np
test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4],
'C': [np.nan, np.nan, np.nan, np.nan],
'D': [np.nan, np.nan, np.nan, np.nan]})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
for key, value in input_b.items():
test.loc[test['B'] == key, ['C', 'D']] = value
print(test)
Yields:
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
This will get slower if the dictionary input_b
gets too large (too many rows are being updated, too many iterations in the for loop), but should be relatively fast with small input_b
's even with large test
dataframes.
This answer also assumes the keys in the input_b
dictionary refer to the values of the B
column in the original dataframe, and will add repeated values in the C
and D
columns for repeated values in the B
column.
add a comment |
This may not be the most efficient solution, but from what I understand it got the job done:
import pandas as pd
import numpy as np
test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4],
'C': [np.nan, np.nan, np.nan, np.nan],
'D': [np.nan, np.nan, np.nan, np.nan]})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
for key, value in input_b.items():
test.loc[test['B'] == key, ['C', 'D']] = value
print(test)
Yields:
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
This will get slower if the dictionary input_b
gets too large (too many rows are being updated, too many iterations in the for loop), but should be relatively fast with small input_b
's even with large test
dataframes.
This answer also assumes the keys in the input_b
dictionary refer to the values of the B
column in the original dataframe, and will add repeated values in the C
and D
columns for repeated values in the B
column.
add a comment |
This may not be the most efficient solution, but from what I understand it got the job done:
import pandas as pd
import numpy as np
test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4],
'C': [np.nan, np.nan, np.nan, np.nan],
'D': [np.nan, np.nan, np.nan, np.nan]})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
for key, value in input_b.items():
test.loc[test['B'] == key, ['C', 'D']] = value
print(test)
Yields:
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
This will get slower if the dictionary input_b
gets too large (too many rows are being updated, too many iterations in the for loop), but should be relatively fast with small input_b
's even with large test
dataframes.
This answer also assumes the keys in the input_b
dictionary refer to the values of the B
column in the original dataframe, and will add repeated values in the C
and D
columns for repeated values in the B
column.
This may not be the most efficient solution, but from what I understand it got the job done:
import pandas as pd
import numpy as np
test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4],
'C': [np.nan, np.nan, np.nan, np.nan],
'D': [np.nan, np.nan, np.nan, np.nan]})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
for key, value in input_b.items():
test.loc[test['B'] == key, ['C', 'D']] = value
print(test)
Yields:
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
This will get slower if the dictionary input_b
gets too large (too many rows are being updated, too many iterations in the for loop), but should be relatively fast with small input_b
's even with large test
dataframes.
This answer also assumes the keys in the input_b
dictionary refer to the values of the B
column in the original dataframe, and will add repeated values in the C
and D
columns for repeated values in the B
column.
edited Jan 3 at 0:15
answered Jan 3 at 0:00


PasaPasa
35238
35238
add a comment |
add a comment |
Using update
test=test.set_index('B')
test.update(pd.DataFrame(input_b,index=['C','D']).T)
test=test.reset_index()
test
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
add a comment |
Using update
test=test.set_index('B')
test.update(pd.DataFrame(input_b,index=['C','D']).T)
test=test.reset_index()
test
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
add a comment |
Using update
test=test.set_index('B')
test.update(pd.DataFrame(input_b,index=['C','D']).T)
test=test.reset_index()
test
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
Using update
test=test.set_index('B')
test.update(pd.DataFrame(input_b,index=['C','D']).T)
test=test.reset_index()
test
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
answered Jan 3 at 0:42


Wen-BenWen-Ben
122k83671
122k83671
add a comment |
add a comment |
You can use loc
indexing after setting your index to B
:
test = test.set_index('B')
test.loc[input_b, ['C', 'D']] = list(input_b.values())
test = test.reset_index()
print(test)
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
add a comment |
You can use loc
indexing after setting your index to B
:
test = test.set_index('B')
test.loc[input_b, ['C', 'D']] = list(input_b.values())
test = test.reset_index()
print(test)
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
add a comment |
You can use loc
indexing after setting your index to B
:
test = test.set_index('B')
test.loc[input_b, ['C', 'D']] = list(input_b.values())
test = test.reset_index()
print(test)
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
You can use loc
indexing after setting your index to B
:
test = test.set_index('B')
test.loc[input_b, ['C', 'D']] = list(input_b.values())
test = test.reset_index()
print(test)
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
answered Jan 3 at 0:54


jppjpp
102k2166116
102k2166116
add a comment |
add a comment |
Using apply
test['C'] = test['B'].map(input_b).apply(lambda x: x[0] if type(x)==list else x)
test['D'] = test['B'].map(input_b).apply(lambda x: x[1] if type(x)==list else x)
yields
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
add a comment |
Using apply
test['C'] = test['B'].map(input_b).apply(lambda x: x[0] if type(x)==list else x)
test['D'] = test['B'].map(input_b).apply(lambda x: x[1] if type(x)==list else x)
yields
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
add a comment |
Using apply
test['C'] = test['B'].map(input_b).apply(lambda x: x[0] if type(x)==list else x)
test['D'] = test['B'].map(input_b).apply(lambda x: x[1] if type(x)==list else x)
yields
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
Using apply
test['C'] = test['B'].map(input_b).apply(lambda x: x[0] if type(x)==list else x)
test['D'] = test['B'].map(input_b).apply(lambda x: x[1] if type(x)==list else x)
yields
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
answered Jan 3 at 1:14
raninjanraninjan
1315
1315
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54014687%2fpopulate-pandas-dataframe-using-a-dictionary-based-on-a-condition%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
I understand how to do this by first turning the dictionary into a dataframe and then merge'ing it with the first. However, this gets very slow if the first dataframe (test) is much larger than its portion that I wish to modify.
– GingerBadger
Jan 2 at 23:55
1
I think transforming your dictionary into a dataframe using
pd.from_dict(your_dictionary)
and then merging withinput_b
seems to me like the best solution. I would be surprised if there is a better solution.– KenHBS
Jan 3 at 0:30