crosstab in Pandas DataFrame

I created a DataFrame

      A1  A2  A3  A4
0   cccc  xx   6   5
1   aaaa  yy   8   0
2   aaaa  xx  15   0
3   bbbb  xx  21   4
4   bbbb  xx  26   0
5   cccc  yy  33   2
6   aaaa  xx  44   1
7   cccc  xx  48   2
8   aaaa  yy  58   0
9   cccc  yy  59   5
10  bbbb  yy  77   0
11  bbbb  yy  99   0

Using crosstab() with the command below, I created a new DataFrame:

df5 = pd.crosstab(df4['A1'], df4['A2'], margins=False, values=df4['A3'],
                  dropna=False, aggfunc='mean').reset_index().fillna(0)

This works properly and gives the following output:

A2    A1    xx    yy
0   aaaa  29.5  33.0
1   bbbb  23.5  88.0
2   cccc  27.0  46.0
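For anyone who wants to reproduce this, here is a minimal sketch of the setup. The dict-based construction of df4 is an assumption, since the post only shows the printed frame; the crosstab call is the one from the question.

```python
import pandas as pd

# Assumed construction of df4; the post only shows its printed form.
df4 = pd.DataFrame({
    "A1": ["cccc", "aaaa", "aaaa", "bbbb", "bbbb", "cccc",
           "aaaa", "cccc", "aaaa", "cccc", "bbbb", "bbbb"],
    "A2": ["xx", "yy", "xx", "xx", "xx", "yy",
           "xx", "xx", "yy", "yy", "yy", "yy"],
    "A3": [6, 8, 15, 21, 26, 33, 44, 48, 58, 59, 77, 99],
    "A4": [5, 0, 0, 4, 0, 2, 1, 2, 0, 5, 0, 0],
})

# Mean of A3 for every (A1, A2) combination, as in the question.
df5 = pd.crosstab(df4["A1"], df4["A2"], margins=False, values=df4["A3"],
                  dropna=False, aggfunc="mean").reset_index().fillna(0)

print(df5)
```

Each cell of df5 is the mean of A3 over the rows sharing that A1/A2 pair, for example (15 + 44) / 2 = 29.5 for aaaa/xx.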

Now I want to store these mean values in the DataFrame df4.

How can I do that? Wherever A4 contains 0 in df4, I want to replace it with the corresponding mean from the crosstab(). The output I want is as follows:

      A1  A2  A3    A4
0   aaaa  xx  15  29.5
1   aaaa  xx  44   1.0
2   aaaa  yy   8  33.0
3   aaaa  yy  58  33.0
4   bbbb  xx  21   4.0
5   bbbb  xx  26  23.5
6   bbbb  yy  77  88.0
7   bbbb  yy  99  88.0
8   cccc  xx   6   5.0
9   cccc  xx  48   2.0

edited Nov 19 '18 at 15:34 by jpp
asked Nov 19 '18 at 12:54 by Anuprita

Can you create a minimal, complete, and verifiable example?
– jezrael
Nov 19 '18 at 12:55

How do you go from the input to the output? What is the new output calculating?
– Franco Piccolo
Nov 19 '18 at 14:07

In some rows A4 contains 0. I want to replace it with the mean value I got from the crosstab.
– Anuprita
Nov 19 '18 at 15:19

python pandas pandas-groupby


1 Answer

`mask` + `groupby` + `transform`

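Ignoring the unnecessary reordering and removal of some rows in your desired output, you can use `mask` with `groupby` and `transform`. `transform('mean')` returns the group mean of A3 aligned to the original rows, so it can directly replace the zeros in A4: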

# Mean of A3 within each (A1, A2) group, one value per original row
group_mean = df4.groupby(['A1', 'A2'])['A3'].transform('mean')

# Replace A4 with the group mean only where A4 is 0
df4['A4'] = df4['A4'].mask(df4['A4'] == 0, group_mean)

print(df4)

      A1  A2  A3    A4
0   cccc  xx   6   5.0
1   aaaa  yy   8  33.0
2   aaaa  xx  15  29.5
3   bbbb  xx  21   4.0
4   bbbb  xx  26  23.5
5   cccc  yy  33   2.0
6   aaaa  xx  44   1.0
7   cccc  xx  48   2.0
8   aaaa  yy  58  33.0
9   cccc  yy  59   5.0
10  bbbb  yy  77  88.0
11  bbbb  yy  99  88.0
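If you would rather reuse the crosstab itself, one possible route is to melt it to long form and merge it back onto df4. The sketch below is self-contained and checks that both routes give the same per-row means. The construction of df4 and the names `long_means`, `A3_mean`, `from_crosstab` and `filled` are assumptions introduced for illustration; they are not from the question.

```python
import pandas as pd

# Assumed construction of df4; the question only shows its printed form.
df4 = pd.DataFrame({
    "A1": ["cccc", "aaaa", "aaaa", "bbbb", "bbbb", "cccc",
           "aaaa", "cccc", "aaaa", "cccc", "bbbb", "bbbb"],
    "A2": ["xx", "yy", "xx", "xx", "xx", "yy",
           "xx", "xx", "yy", "yy", "yy", "yy"],
    "A3": [6, 8, 15, 21, 26, 33, 44, 48, 58, 59, 77, 99],
    "A4": [5, 0, 0, 4, 0, 2, 1, 2, 0, 5, 0, 0],
})

# Route 1: per-row group mean straight from groupby + transform.
group_mean = df4.groupby(["A1", "A2"])["A3"].transform("mean")

# Route 2: build the crosstab, melt it to one row per (A1, A2) pair,
# then left-merge so every row of df4 picks up its group's mean.
means = pd.crosstab(df4["A1"], df4["A2"], values=df4["A3"], aggfunc="mean")
long_means = means.reset_index().melt(
    id_vars="A1", var_name="A2", value_name="A3_mean"
)
from_crosstab = df4.merge(long_means, on=["A1", "A2"], how="left")["A3_mean"]

# Both routes agree row by row (compared by position).
assert (group_mean.to_numpy() == from_crosstab.to_numpy()).all()

# Replace A4 with the group mean only where A4 is 0.
filled = df4["A4"].mask(df4["A4"] == 0, group_mean)
print(df4.assign(A4=filled))
```

The `transform` route is shorter and keeps df4's index, which is why it is usually the more convenient choice; the merge route is only worth it when the crosstab table is needed anyway.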

answered Nov 19 '18 at 15:31 by jpp
