IndexError when replacing missing values with mode using groupby in pandas

I have a dataset which requires missing value treatment.

Column                  Missing Values
Complaint_ID            0
Date_received           0
Transaction_Type        0
Complaint_reason        0
Company_response        22506
Date_sent_to_company    0
Complaint_Status        0
Consumer_disputes       7698

The problem arises when I try to replace the missing values with the per-group mode, grouping by another column:

Code:

data11["Company_response"] = data11.groupby("Complaint_reason").transform(lambda x: x.fillna(x.mode()[0]))["Company_response"]
data11["Consumer_disputes"] = data11.groupby("Transaction_Type").transform(lambda x: x.fillna(x.mode()[0]))["Consumer_disputes"]

I get the following error:

Stacktrace

Traceback (most recent call last):
  File "<ipython-input-89-8de6a010a299>", line 1, in <module>
    data11["Company_response"] = data11.groupby("Complaint_reason").transform(lambda x: x.fillna(x.mode()[0]))["Company_response"]
  File "C:\Anaconda3\lib\site-packages\pandas\core\groupby.py", line 3741, in transform
    return self._transform_general(func, *args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\pandas\core\groupby.py", line 3699, in _transform_general
    res = path(group)
  File "C:\Anaconda3\lib\site-packages\pandas\core\groupby.py", line 3783, in <lambda>
    lambda x: func(x, *args, **kwargs), axis=self.axis)
  File "C:\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4360, in apply
    ignore_failures=ignore_failures)
  File "C:\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4456, in _apply_standard
    results[i] = func(v)
  File "C:\Anaconda3\lib\site-packages\pandas\core\groupby.py", line 3783, in <lambda>
    lambda x: func(x, *args, **kwargs), axis=self.axis)
  File "<ipython-input-89-8de6a010a299>", line 1, in <lambda>
    data11["Company_response"] = data11.groupby("Complaint_reason").transform(lambda x: x.fillna(x.mode()[0]))["Company_response"]
  File "C:\Anaconda3\lib\site-packages\pandas\core\series.py", line 601, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2434, in get_value
    return libts.get_value_box(s, key)
  File "pandas\_libs\tslib.pyx", line 923, in pandas._libs.tslib.get_value_box (pandas\_libs\tslib.c:18843)
  File "pandas\_libs\tslib.pyx", line 939, in pandas._libs.tslib.get_value_box (pandas\_libs\tslib.c:18560)
IndexError: ('index out of bounds', 'occurred at index Consumer_disputes')

I have checked the length of the dataframe and of all its columns, and it is the same: 43266.

I have also found a similar question, but it does not have a correct answer: Click here

Please help resolve the error.

IndexError: ('index out of bounds', 'occurred at index Consumer_disputes')

Here is a snapshot of the dataset if it helps in any way: Dataset Snapshot

I am currently using the code below successfully. It fills the missing values, but with the overall column mode rather than the per-group mode, so it does not quite serve my purpose.

data11['Company_response'].fillna(data11['Company_response'].mode()[0], inplace=True)
data11['Consumer_disputes'].fillna(data11['Consumer_disputes'].mode()[0], inplace=True)

Edit1: (Attaching Sample)

Input Given:
InputImage

Expected Output:
OutputImage

You can see that the missing Company_response values for Tr-1 and Tr-3 are filled with the mode within their Complaint_reason group.
Similarly, the missing Consumer_disputes value for Tr-5 is filled with the mode within its Transaction_Type group.

The snippet below contains the dataframe and the failing code, for anyone who wants to reproduce the problem.

Replication Code

import pandas as pd
import numpy as np

data11 = pd.DataFrame({
    'Complaint_ID': ['Tr-1', 'Tr-2', 'Tr-3', 'Tr-4', 'Tr-5', 'Tr-6'],
    'Transaction_Type': ['Mortgage', 'Credit card', 'Bank account or service', 'Debt collection', 'Credit card', 'Mortgage'],
    'Complaint_reason': ['Loan servicing, payments, escrow account', 'Incorrect information on credit report', "Cont'd attempts collect debt not owed", "Cont'd attempts collect debt not owed", 'Payoff process', 'Loan servicing, payments, escrow account'],
    'Company_response': [np.nan, 'Company chooses not to provide a public response', np.nan, 'Company believes it acted appropriately as authorized by contract or law', 'Company has responded to the consumer and the CFPB and chooses not to provide a public response', 'Company disputes the facts presented in the complaint'],
    'Consumer_disputes': ['Yes', 'No', 'No', 'No', np.nan, 'Yes'],
})

# Count the missing values per column
data11.isnull().sum()

# These two lines raise the IndexError shown above
data11["Company_response"] = data11.groupby("Complaint_reason").transform(lambda x: x.fillna(x.mode()[0]))["Company_response"]
data11["Consumer_disputes"] = data11.groupby("Transaction_Type").transform(lambda x: x.fillna(x.mode()[0]))["Consumer_disputes"]

edited Jan 1 at 17:46

asked Jan 1 at 10:12

Ashu Grover


the question literally died last time, i edited it, left comments but no one answered for almost 6 days, so unfortunately i had to post it again as i do not have any bounties to offer, so guys if you find it interesting and are unable to solve it, please upvote the question so that it might interest others as well...

– Ashu Grover
Jan 1 at 11:10

Could you add a small input sample and the expected output

– Daniel Mesejo
Jan 1 at 11:36

the question did not "literally die" - this is a metaphor. it figuratively died!

– Josh Friedlander
Jan 1 at 12:33

@JoshFriedlander haha... yes Josh... got a bit carried away i guess...

– Ashu Grover
Jan 1 at 12:40

:) as for your question - it would help if you could post like 5 rows of your data, or made-up equivalents - that screenshot is the right idea but text is much easier to work with than an image

– Josh Friedlander
Jan 1 at 12:45


python pandas dataframe pandas-groupby missing-data

3 Answers

Try:

data11["Company_response"] = data11.groupby("Complaint_reason")['Company_response'].transform(lambda x: x.fillna(x.mode()[0]))
data11["Consumer_disputes"] = data11.groupby("Transaction_Type")['Consumer_disputes'].transform(lambda x: x.fillna(x.mode()[0]))
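
To illustrate why selecting the column before transform helps, here is a minimal sketch on a shortened stand-in for the question's sample (the abbreviated string values, the name df and the helper fill_mode are assumptions made for this example). A likely reason the original version fails is that, without a column selection, transform runs the lambda on every remaining column of each group, so the 'Payoff process' group, whose only Consumer_disputes value is missing, yields an empty mode. That would match the 'occurred at index Consumer_disputes' part of the message.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's replication data (values abbreviated).
df = pd.DataFrame({
    "Complaint_ID": ["Tr-1", "Tr-2", "Tr-3", "Tr-4", "Tr-5", "Tr-6"],
    "Transaction_Type": ["Mortgage", "Credit card", "Bank account",
                         "Debt collection", "Credit card", "Mortgage"],
    "Complaint_reason": ["Loan servicing", "Incorrect info", "Debt not owed",
                         "Debt not owed", "Payoff process", "Loan servicing"],
    "Company_response": [np.nan, "No public response", np.nan,
                         "Acted appropriately", "Responded to CFPB",
                         "Disputes the facts"],
    "Consumer_disputes": ["Yes", "No", "No", "No", np.nan, "Yes"],
})


def fill_mode(s):
    """Fill missing values in one group's column with that group's mode."""
    # .iloc[0] takes the first mode by position; it still fails on an empty mode.
    return s.fillna(s.mode().iloc[0])


# Select the target column first, so only that column is passed to fill_mode.
df["Company_response"] = (
    df.groupby("Complaint_reason")["Company_response"].transform(fill_mode)
)
df["Consumer_disputes"] = (
    df.groupby("Transaction_Type")["Consumer_disputes"].transform(fill_mode)
)

print(df[["Complaint_ID", "Company_response", "Consumer_disputes"]])
```

Note that this version would still raise if some group had no non-missing value at all in the selected column; the sample data happens not to contain such a group.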

edited Jan 2 at 16:34

Ashu Grover


answered Jan 1 at 20:09

Scott Boston


Thanks Scott... :)

– Ashu Grover
Jan 2 at 16:36

@AshuGrover You're welcome. Happy coding. Thanks for editing my solution to match your needs.

– Scott Boston
Jan 2 at 16:37


The error is raised because, for at least one of the groups, one of the transformed columns contains only np.nan values. In that case the mode is empty: pd.Series([np.nan]).mode() returns an empty Series, so taking its first value raises an error.

So you could use something like transform(lambda x: x.fillna(x.mode()[0] if not x.mode().empty else "Empty")).
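
The empty-mode case described above can be checked directly. The sketch below is only an illustration: the toy frame demo, the helper fill_with_mode and its fallback parameter are hypothetical names introduced here, and "Empty" is the placeholder suggested in this answer.

```python
import numpy as np
import pandas as pd


def fill_with_mode(s, fallback=None):
    """Fill NaN in s with its most frequent value, tolerating all-NaN input."""
    modes = s.mode()  # NaN is dropped by default, so this can be empty
    if modes.empty:
        # No non-missing value to take a mode from: use the fallback, if any.
        return s if fallback is None else s.fillna(fallback)
    return s.fillna(modes.iloc[0])


# Toy data: group "b" has no non-missing value in "val".
demo = pd.DataFrame({
    "key": ["a", "a", "a", "b"],
    "val": ["x", "x", np.nan, np.nan],
})

# The mode of an all-NaN Series is an empty Series.
all_nan_mode = pd.Series([np.nan]).mode()
print(all_nan_mode.empty)

filled = demo.groupby("key")["val"].transform(
    lambda s: fill_with_mode(s, fallback="Empty")
)
print(filled.tolist())
```

Whether a placeholder such as "Empty" or the overall column mode is the better fallback depends on the data; the helper leaves that choice to the caller.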

edited Jan 1 at 19:10

answered Jan 1 at 16:42

Mikhail Berlinkov


On running this I get the exact error mentioned in the question. Mikhail can you please replicate it on your local (I have given the code in the question for replication) , I am really stuck on this since long now..

– Ashu Grover
Jan 1 at 16:58

Could you provide a full stacktrace when you run it along with a self-sufficient code to create an input data for that?

– Mikhail Berlinkov
Jan 1 at 17:03

Mikhail I have mentioned the self sufficient code with input and the stacktrace both in the question itself..

– Ashu Grover
Jan 1 at 17:19

I meant the stacktrace when you run data11.groupby("Transaction_Type").transform(lambda x: x.fillna(x.mode()[0])). I want to be sure it's indeed the same which is very unlikely. Also, I didn't find code snippet to create an input dataframe. I can't take it from screenshot of an excel file.

– Mikhail Berlinkov
Jan 1 at 17:22

Mikhail please observe carefully I have mentioned the Replication Code (in bold letters below the excel screenshots) with the input dataframe... and the stacktrace I get when I run data11.groupby("Transaction_Type").transform(lambda x: x.fillna(x.mode()[0])) is very long it cannot be put in the comments section, so let me highlight the stacktrace also in the question itself in bolds. Note: the stacktrace is exactly same for the code you have asked me to run.

– Ashu Grover
Jan 1 at 17:36


@Mikhail Berlinkov is almost certainly correct. I was able to reproduce your error, and then avoid it by using dropna():

data11.groupby("Transaction_Type").transform(
    lambda x: x.fillna(x.mode()[0]))["Consumer_disputes"]
# Raises IndexError

data11.dropna().groupby("Transaction_Type").transform(
    lambda x: x.fillna(x.mode()[0]))["Consumer_disputes"]
# Works
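
One caveat with the dropna() route, sketched below on a shortened stand-in for the question's sample (the abbreviated values and the names df, complete, filled and Consumer_disputes_filled are assumptions for this example): dropna() removes every row that has a missing value in any column, so the rows that needed filling are no longer there, and the result is shorter than the original frame.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's replication data (values abbreviated).
df = pd.DataFrame({
    "Transaction_Type": ["Mortgage", "Credit card", "Bank account",
                         "Debt collection", "Credit card", "Mortgage"],
    "Company_response": [np.nan, "No public response", np.nan,
                         "Acted appropriately", "Responded to CFPB",
                         "Disputes the facts"],
    "Consumer_disputes": ["Yes", "No", "No", "No", np.nan, "Yes"],
})

# Keep only rows with no missing value in any column.
complete = df.dropna()

filled = complete.groupby("Transaction_Type")["Consumer_disputes"].transform(
    lambda s: s.fillna(s.mode().iloc[0])
)

# Assigning the shorter result back aligns on the index,
# so the dropped rows end up as NaN in the new column.
df["Consumer_disputes_filled"] = filled

print(len(df), len(filled))
print(df["Consumer_disputes_filled"].isna().sum())
```

So the call no longer raises, but only because nothing is left to fill; the missing values in the original rows are not imputed.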

answered Jan 2 at 7:53

Josh Friedlander


Thanks for the input Josh, but this will further mess up the dataframe. Try it for yourself and see the results...

– Ashu Grover
Jan 2 at 16:24

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53994621%2findexerror-when-replacing-missing-values-with-mode-using-groupby-in-pandas%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

Try:

data11["Company_response"] = data11.groupby("Complaint_reason")['Company_response'].transform(lambda x: x.fillna(x.mode()[0]))



data11["Consumer_disputes"] = data11.groupby("Transaction_Type")['Consumer_disputes'].transform(lambda x: x.fillna(x.mode()[0]))

edited Jan 2 at 16:34

Ashu Grover

151112

answered Jan 1 at 20:09

Scott Boston

56.3k73157

Thanks Scott... :)

– Ashu Grover
Jan 2 at 16:36

@AshuGrover You're welcome. Happy coding. Thanks for editing my solution to match your needs.

– Scott Boston
Jan 2 at 16:37

add a comment |

Try:

data11["Company_response"] = data11.groupby("Complaint_reason")['Company_response'].transform(lambda x: x.fillna(x.mode()[0]))



data11["Consumer_disputes"] = data11.groupby("Transaction_Type")['Consumer_disputes'].transform(lambda x: x.fillna(x.mode()[0]))

edited Jan 2 at 16:34

Ashu Grover

151112

answered Jan 1 at 20:09

Scott Boston

56.3k73157

Thanks Scott... :)

– Ashu Grover
Jan 2 at 16:36

@AshuGrover You're welcome. Happy coding. Thanks for editing my solution to match your needs.

– Scott Boston
Jan 2 at 16:37


So, you may use something like transform(lambda x: x.fillna(x.mode()[0] if not x.mode().empty else "Empty")).
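
A minimal sketch of that guard written as a small helper, under stated assumptions: the function name fill_with_group_mode and its fallback parameter are hypothetical, and the toy frame is made up rather than taken from the question. It computes the mode once per group and only indexes into it when it is non-empty.

```python
import numpy as np
import pandas as pd


def fill_with_group_mode(df, group_col, target_col, fallback=None):
    """Fill NaN in target_col with the mode of its group.

    Groups with no observed value get `fallback`; if `fallback` is None
    they are left untouched.
    """

    def _fill(s):
        modes = s.mode()  # NaN is dropped by default, so this can be empty
        if modes.empty:
            return s if fallback is None else s.fillna(fallback)
        return s.fillna(modes.iloc[0])

    return df.groupby(group_col)[target_col].transform(_fill)


# Toy data (an assumption for illustration): the "Card" group is all NaN.
toy = pd.DataFrame({
    "Transaction_Type": ["Loan", "Loan", "Loan", "Card", "Card"],
    "Consumer_disputes": ["No", "No", np.nan, np.nan, np.nan],
})

filled = fill_with_group_mode(
    toy, "Transaction_Type", "Consumer_disputes", fallback="Unknown"
)
untouched = fill_with_group_mode(toy, "Transaction_Type", "Consumer_disputes")
print(filled.tolist())
```

Whether a placeholder such as "Unknown" or leaving the values missing is the better choice depends on how the column is used downstream.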

edited Jan 1 at 19:10

answered Jan 1 at 16:42 by Mikhail Berlinkov

On running this I get the exact error mentioned in the question. Mikhail, can you please replicate it locally (I have given the code for replication in the question)? I have been stuck on this for a long time now.

– Ashu Grover
Jan 1 at 16:58

Could you provide the full stacktrace you get when you run it, along with self-sufficient code to create the input data?

– Mikhail Berlinkov
Jan 1 at 17:03

Mikhail, I have included both the self-sufficient code with input and the stacktrace in the question itself.

– Ashu Grover
Jan 1 at 17:19

I meant the stacktrace when you run data11.groupby("Transaction_Type").transform(lambda x: x.fillna(x.mode()[0])). I want to be sure it's indeed the same, which is very unlikely. Also, I didn't find a code snippet to create an input dataframe. I can't take it from a screenshot of an Excel file.

– Mikhail Berlinkov
Jan 1 at 17:22

Mikhail, please look carefully: I have included the replication code (in bold below the Excel screenshots) with the input dataframe. The stacktrace I get when I run data11.groupby("Transaction_Type").transform(lambda x: x.fillna(x.mode()[0])) is too long for the comments section, so I will highlight it in bold in the question itself. Note: the stacktrace is exactly the same for the code you asked me to run.

– Ashu Grover
Jan 1 at 17:36


@Mikhail Berlinkov is almost certainly correct. I was able to reproduce your error, and then avoid it by using dropna():

data11.groupby("Transaction-Type").transform(
    lambda x: x.fillna(x.mode()[0]))["Consumer-disputes"]
# Raises IndexError

data11.dropna().groupby("Transaction-Type").transform(
    lambda x: x.fillna(x.mode()[0]))["Consumer-disputes"]
# Works

answered Jan 2 at 7:53 by Josh Friedlander

Thanks for the input, Josh, but this will further mess up the dataframe. Try it for yourself and see the results...

– Ashu Grover
Jan 2 at 16:24
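
A minimal sketch of what the comment above may be pointing at, on a made-up toy frame (not the asker's data): dropna() removes every row that contains a missing value before the groupby runs, so the transform has nothing left to fill and its result no longer covers the original rows.

```python
import numpy as np
import pandas as pd

# Toy data (an assumption for illustration): one missing value in row 2.
toy = pd.DataFrame({
    "Transaction_Type": ["Loan", "Loan", "Loan", "Card"],
    "Consumer_disputes": ["No", "No", np.nan, "Yes"],
})

# dropna() discards row 2 entirely, so no group contains NaN any more.
complete_rows = toy.dropna()
result = complete_rows.groupby("Transaction_Type")["Consumer_disputes"].transform(
    lambda s: s.fillna(s.mode()[0])
)

rows_before = len(toy)
rows_after = len(result)

# Assigning back aligns on the index, so the dropped row stays missing.
reassigned = toy.assign(Consumer_disputes=result)
still_missing = int(reassigned["Consumer_disputes"].isna().sum())
print(rows_before, rows_after, still_missing)
```

In other words, the call runs without error only because the rows that needed filling were removed first.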

