Removing elements from Pandas Series of lists


























I've been searching the site for solutions and hints, but couldn't find a question directly related to my case.



I have scraped text data from various sites and split the text using str.split('\n'). The text contains a lot of '\n' characters, so splitting this way seemed reasonable. (Please let me know if this is a bad approach.)

df['scrape']
0 \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
1 \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
2 \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
3 \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
4 \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...
5 \nWebsite:\n\n\n\nVisit\n\n \nWhite paper:\n\n...

The result was a pandas Series of lists; each element is a list of strings.

df['split'] = df['scrape'].str.split('\n')
0 [, Website:, , , , Visit, , , White paper:, ,...
1 [, Website:, , , , Visit, , , White paper:, ,...
2 [, Website:, , , , Visit, , , White paper:, ,...
3 [, Website:, , , , Visit, , , White paper:, ,...
4 [, Website:, , , , Visit, , , White paper:, ,...
5 [, Website:, , , , Visit, , , White paper:, ,...
6 [, Website:, , , , Visit, , , White paper:, ,...


I want to remove the empty elements ('' and ' ') from each list.



I tried looping:



for i in series:
    while '' in i:
        i.remove('')


The code above works on an arbitrary example I made, but with my real data it produces an error.



>>> for i in df['split']:
...     while '' in i:
...         i.remove('')
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: argument of type 'float' is not iterable


I'm not sure why I am getting an error with my data. Could I get some advice on this? Thanks!






























  • Don't store lists in a Series
    – user3483203
    Nov 19 '18 at 19:32










  • What's the suggestion for this case then, if I don't store lists in a Series?
    – Matthew Son
    Nov 19 '18 at 19:50










  • Solution, thanks to Toby's idea: def remover(lst): return [s for s in lst if s != '' and s != ' '] followed by df['new'] = df['split'].apply(remover). With this method you don't need to drop NaN values.
    – Matthew Son
    Nov 19 '18 at 22:00
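On the open question of what to use instead of lists in a Series: one possible reading of the first comment (an assumption, since the thread never spells it out) is to keep one string per row rather than one list per row. The sketch below uses Series.explode, which requires pandas 0.25 or later. The sample text and the NaN row are hypothetical stand-ins for the scraped column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['scrape']; the NaN row mimics a page that returned no text.
scrape = pd.Series(
    ["\nWebsite:\n\nVisit\n \nWhite paper:\n", np.nan],
    dtype=object,
    name="scrape",
)

# Split as in the question, then give every piece its own row.
# explode() repeats the original index, so each piece stays tied to its source row.
pieces = scrape.str.split("\n").explode()

# Missing scrapes survive as NaN rows; drop them, then drop blank pieces.
pieces = pieces.dropna()
pieces = pieces[pieces.str.strip() != ""]

print(pieces)
```

Because the original index is repeated, the pieces can still be grouped back per source row with pieces.groupby(level=0) if separate lists are needed later.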


















python string pandas list series














edited Nov 19 '18 at 19:46







Matthew Son

















asked Nov 19 '18 at 19:28









Matthew Son








1 Answer














You could use a list comprehension:

new_series = [s for s in series if s != '' and s != ' ' and s is not None]

To apply the list comprehension to each element in a pandas Series of lists (essentially a list of lists), you need to nest the list comprehension like this:

new_series = [[s for s in element if s != '' and s != ' ' and s is not None] for element in series]
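A likely reason the asker still sees a TypeError about a float: pandas marks a missing row as NaN, which is a float, and .str.split leaves such rows as NaN instead of producing a list. This is an assumption about the asker's data, which the thread never shows in full. The sketch below reproduces the error and adds a guard. A plain list stands in for df['split'] so the snippet runs without pandas, and drop_blank is a hypothetical helper name. Note that s.strip() drops any whitespace-only string, which is slightly broader than testing for '' and ' '.

```python
# Hypothetical stand-in for df['split']: two split rows plus one missing value.
# pandas represents a missing row as NaN, which is a float rather than a list.
series = [
    ['', 'Website:', '', 'Visit', ' ', 'White paper:'],
    float('nan'),
    ['Website:', '', ' '],
]


def drop_blank(element):
    """Remove empty and whitespace-only strings from a list; return anything else unchanged."""
    if not isinstance(element, list):
        return element  # NaN rows are passed through untouched
    return [s for s in element if s is not None and s.strip()]


# The unguarded nested comprehension fails as soon as it reaches the float row.
try:
    [[s for s in element if s != ''] for element in series]
except TypeError as exc:
    error_message = str(exc)

cleaned = [drop_blank(element) for element in series]
print(error_message)
print(cleaned)
```

The same guarded function can be passed to df['split'].apply(...). Alternatively, the missing rows can be removed first with dropna() if they are not needed.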




























  • Doesn't work. I tried series = [s for s in df['split'] if s!='' and s!=' '] and it still contains '' and ' ' values.
    – Matthew Son
    Nov 19 '18 at 19:51










  • Do you need to add a None criterion as well? See my updated example.
    – Toby Petty
    Nov 19 '18 at 20:15










  • It still doesn't work... I tried converting it into a list of lists too. Your suggestion yields one big collapsed list, but I have to keep the lists separate.
    – Matthew Son
    Nov 19 '18 at 20:40










  • Ah ok I think I understand, you want to apply the list comprehension to each list in the series (essentially a list of lists). If I understand correctly this should work: [[s for s in x if s!='' and s!=' ' and s!=None] for x in series]
    – Toby Petty
    Nov 19 '18 at 20:52












  • Thanks for the continued updates. This answer looks like what I want, but I honestly don't know why it still raises an error:
    >>> [[s for s in x if s!='' and s!=' ' and s!=None] for x in df['split']]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in <listcomp>
    TypeError: 'float' object is not iterable
    – Matthew Son
    Nov 19 '18 at 21:18





















edited Nov 19 '18 at 20:59

























answered Nov 19 '18 at 19:33









Toby Petty































