Populating a data frame from a separate table using loc
import pandas as pd

data1={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5]}
data2={'TERR':[1,2,3,4,5],'CHH':[0,.15,.65,.35,.20],'FSH':[0,.15,.25,.35,.20]}
output={'Policy Number':['FSH1235456','FSH7643643','CHH123124','CHH123145252','CHH124124'],'State':['FL','TX','GA','TX','TX'],'TERR':[1,2,3,4,5],'Test':[0,.15,0,0,0]}
df1=pd.DataFrame(data1)
df2=pd.DataFrame(data2)
df3=pd.DataFrame(output)
Test data above.
I am trying to create a new column in df1, call it df1['Test'], which contains the values of df2['FSH'] when all of the following criteria are met:
1. The State is 'TX'
2. The Policy Number contains 'FSH'
3. The value of df1['TERR'] equals the value of df2['TERR']
See df3 above for the expected output.
What I tried was the following:
if df1.State.any() == "TX":
    if df1["Policy Number"].str.contains("FSH").any():
        for i in df1["TERR"]:
            df1['% TERR'] = df2.loc[[i], ["FSH"]]
However, my output is riddled with NaN values, along with a single correct value.
I checked that the correct i values were being fed into df2 via
print(df2.loc[[i], ["FSH"]])
and it prints correctly.
Any thoughts?
Tags: python, pandas
asked Nov 19 '18 at 18:27 – Bjc51192
4 Answers
I don't know if this is the best or fastest solution, but one option is to merge your two dataframes, filter on your conditions, and then update:
new = df1.merge(df2, on='TERR')
mask = new[((new['State']=='TX') & (new['Policy Number'].str.contains('FSH')))]
df1['Test'] = 0
df1['Test'].update(mask['FSH'])
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
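As a side note on why update works here: Series.update aligns on index labels, and because the one-to-one merge on TERR keeps the same 0-4 RangeIndex as df1, the filtered rows land back in the right positions. A minimal equivalent sketch under that same alignment assumption, using where instead of a separate update step:
merged = df1.merge(df2, on='TERR')   # keeps df1's 0-4 RangeIndex (one row per TERR here)
keep = (merged['State'] == 'TX') & merged['Policy Number'].str.contains('FSH')
df1['Test'] = merged['FSH'].where(keep, 0)   # FSH rate where both conditions hold, else 0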
You can use numpy.where by passing the conditions:
import numpy as np

cond1 = (df1['State'] == 'TX')
cond2 = (df1['Policy Number'].str.contains('FSH'))
cond3 = (df1["TERR"] == df2['TERR'])
df1['Test'] = np.where(cond1 & cond2 & cond3, df2['FSH'], 0)
Policy Number State TERR Test
0 FSH1235456 FL 1 0.00
1 FSH7643643 TX 2 0.15
2 CHH123124 GA 3 0.00
3 CHH123145252 TX 4 0.00
4 CHH124124 TX 5 0.00
This is a great solution; however, how would you handle the case where the shapes do not match exactly?
– Bjc51192
Nov 19 '18 at 20:32
You can use map when the shapes of the dataframes do not match, but in that case the mapping is possible only based on the TERR columns. The other two conditions are still index-based.
– Vaishali
Nov 19 '18 at 20:37
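Building on that comment, a minimal sketch of the map-based variant (assuming the same df1 and df2 from the question) looks the FSH rate up by TERR, so the two frames no longer need to be the same length or share an index:
import numpy as np

lookup = df2.set_index('TERR')['FSH']   # TERR -> FSH rate
cond = (df1['State'] == 'TX') & df1['Policy Number'].str.contains('FSH')
df1['Test'] = np.where(cond, df1['TERR'].map(lookup), 0)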
Are you just trying to get the data from df2 into df1? If so, you could reshape df2 using melt and then do a merge.
df1['policy_prefix'] = df1['Policy Number'].str[:3]
df2 = df2.melt(id_vars='TERR', value_vars=['CHH', 'FSH'],
               value_name='Test',
               var_name='policy_prefix')
df1 = df1.merge(df2, on=['policy_prefix', 'TERR'])
If you only want this to apply to rows where the State is 'TX', then you could set the other values to null after the merge:
import numpy as np
df1.loc[df1.State!='TX', 'Test'] = np.nan
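One caveat: with this approach the non-qualifying rows end up as NaN rather than the zeros shown in df3. A hedged variation on the same melt-and-merge idea, starting again from the original df1 and df2, zeroes out every row that fails either condition:
df1['policy_prefix'] = df1['Policy Number'].str[:3]
rates = df2.melt(id_vars='TERR', var_name='policy_prefix', value_name='Test')
df1 = df1.merge(rates, on=['policy_prefix', 'TERR'], how='left')
# zero out rows that are not in TX or whose policy prefix is not FSH
df1.loc[(df1['State'] != 'TX') | (df1['policy_prefix'] != 'FSH'), 'Test'] = 0
df1 = df1.drop(columns='policy_prefix')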
Here is your solution:
# ... initialize df1 and df2 here
df3 = df1.join(df2.FSH) # Merge df1 and df2 into a single dataframe
df3 = df3.rename({"FSH": "TEST"}, axis=1) # Change column name
def set_tx_fsh(row):
    if row.State == "TX" and "FSH" in row["Policy Number"]:
        return row.TEST
    else:
        return 0
df3.TEST = df3.apply(set_tx_fsh, axis=1) # Set values in "TEST" column based on your condition
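Note that join lines the rows up by index, which only works because df1 and df2 happen to share the same default 0-4 RangeIndex. A hedged, index-independent variant of the same idea (reusing set_tx_fsh from above) merges on TERR instead:
df3 = df1.merge(df2[['TERR', 'FSH']], on='TERR', how='left')   # align on TERR, not on row index
df3 = df3.rename({"FSH": "TEST"}, axis=1)
df3.TEST = df3.apply(set_tx_fsh, axis=1)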