Removing elements from Pandas Series of lists

-1

I've been searching the site for solutions and hints, but couldn't find a question directly related to my case.

I have scraped text data from various sites and split the text using str.split('\n'). The text contains a lot of '\n' characters, so splitting this way gave a result close to what I wanted. (Please let me know if this is a bad approach.)

df['scrape']

0       \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
1       \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
2       \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
3       \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
4       \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
5       \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...

The result was a Pandas Series of lists: every element is a list of strings.

df['split'] = df['scrape'].str.split('\n')

0       [, Website:, , , , Visit, ,  , White paper:, ,...
1       [, Website:, , , , Visit, ,  , White paper:, ,...
2       [, Website:, , , , Visit, ,  , White paper:, ,...
3       [, Website:, , , , Visit, ,  , White paper:, ,...
4       [, Website:, , , , Visit, ,  , White paper:, ,...
5       [, Website:, , , , Visit, ,  , White paper:, ,...
6       [, Website:, , , , Visit, ,  , White paper:, ,...

I want to get rid of the empty elements ('' and ' ') in each list.

I tried looping:

for i in series:
    while '' in i:
        i.remove('')

The above code works with an arbitrary example I made, but with my real data it produces an error.

>>> for i in df['split']:
...     while '' in i:
...         i.remove('')
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: argument of type 'float' is not iterable

I'm not sure why I am getting an error with my data. Could I get some advice on this? Thanks!
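
A float turning up where a list is expected usually points to missing values: Series.str.split leaves NaN in place for rows that were missing in the source column, and NaN is a float. The sketch below reproduces that situation with made-up sample data (the two-row frame is an assumption, not the real scrape) and shows how to check which rows are not lists.

import numpy as np
import pandas as pd

# Hypothetical sample data: the second row is missing, as can happen when a scrape fails.
df = pd.DataFrame({"scrape": ["\nWebsite:\n\nVisit\n \n", np.nan]})
df["split"] = df["scrape"].str.split("\n")

# str.split propagates missing values, so the second row is a float NaN, not a list.
row_types = df["split"].map(lambda v: type(v).__name__).tolist()
print(row_types)

# The membership test is the operation that fails on the float row.
try:
    "" in df["split"].iloc[1]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

If row_types contains anything other than 'list' for the real data, those rows are the ones that break the loop.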

edited Nov 19 '18 at 19:46

asked Nov 19 '18 at 19:28

Matthew Son

1

Don't store lists in a Series
– user3483203
Nov 19 '18 at 19:32

What's the suggestion for this case then, if I don't store lists in a Series?
– Matthew Son
Nov 19 '18 at 19:50

Solution, thanks to Toby's idea:
def remover(items):
    return [s for s in items if s != '' and s != ' ']
df['new'] = df['split'].apply(remover)
With this method you don't need to drop NaN values.
– Matthew Son
Nov 19 '18 at 22:00
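
If the column still holds NaN for rows where the scrape was missing, a helper that iterates its argument would likely hit the same TypeError on those rows. A minimal sketch of a variant that passes non-list values through unchanged is shown here; the sample frame is hypothetical, and using str.strip is an assumption that also drops strings made only of whitespace, not just '' and ' '.

import numpy as np
import pandas as pd

def remover(value):
    """Drop empty and whitespace-only strings; return non-list values as-is."""
    if not isinstance(value, list):
        return value  # e.g. NaN left behind by a missing scrape
    return [s for s in value if s.strip()]

# Hypothetical sample column mirroring df['split'] from the question.
df = pd.DataFrame({"split": [["", "Website:", "", " ", "Visit"], np.nan]})
df["new"] = df["split"].apply(remover)
print(df["new"].tolist())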

python string pandas list series

1 Answer
1

You could use a list comprehension:

new_series = [s for s in series if s != '' and s != ' ' and s is not None]

To apply the list comprehension to each element in a Pandas Series of lists (essentially a list of lists), you need to nest it like this:

new_series = [[s for s in element if s != '' and s != ' ' and s is not None] for element in series]
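
The nested comprehension assumes every element of the Series is a list. A hedged sketch of one way to make it robust, assuming the non-list rows are NaN from missing scrapes, is to drop those rows first and rebuild a Series on the remaining index. The sample series and the name unwanted are illustrative only.

import numpy as np
import pandas as pd

# Hypothetical stand-in for df['split']: lists of strings plus one missing row.
series = pd.Series([["", "Website:", " "], np.nan, ["Visit", ""]])

# Drop missing rows first so every remaining element is a list.
valid = series.dropna()

# Filter each list, then rebuild a Series that keeps the surviving index labels.
unwanted = {"", " "}
new_series = pd.Series(
    [[s for s in element if s not in unwanted] for element in valid],
    index=valid.index,
)
print(new_series)

Keeping the original index labels makes it possible to align the cleaned lists back to the DataFrame; rows that were dropped simply have no entry.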

edited Nov 19 '18 at 20:59

answered Nov 19 '18 at 19:33

Toby Petty

661412

Doesn't work. I tried series = [s for s in df['split'] if s!='' and s!=' '] and it still contains '' and ' ' values.
– Matthew Son
Nov 19 '18 at 19:51

Do you need to add None criteria also? See my updated example
– Toby Petty
Nov 19 '18 at 20:15

It still doesn't work... I tried converting it into a list of lists too. Your suggestion collapses everything into one big list, but I have to keep the lists separate.
– Matthew Son
Nov 19 '18 at 20:40

Ah ok I think I understand, you want to apply the list comprehension to each list in the series (essentially a list of lists). If I understand correctly this should work: [[s for s in x if s!='' and s!=' ' and s!=None] for x in series]
– Toby Petty
Nov 19 '18 at 20:52

Thanks for the updates. This answer looks like what I want, but honestly I don't know why it still raises an error:
>>> [[s for s in x if s!='' and s!=' ' and s!=None] for x in df['split']]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
TypeError: 'float' object is not iterable
– Matthew Son
Nov 19 '18 at 21:18
