Pandas Series.tolist Incorrectly Transcribing Empty Space?












I have a Pandas DataFrame called df, with a column called eligible? which can contain either "Yes" or "No".



I want to select all names from rows where the eligible? column contains "Yes", so I do this:



df[df["eligible?"].str.lower().str.startswith("y")]["name"]


I look at the Series output and it looks as expected. One of the names is "John " (note the trailing space).



When I do:



df[df["eligible?"].str.lower().str.startswith("y")]["name"].tolist()


I get back a list, but now the row that had "John " has "John\xe3\x80\x80".



Any idea what's going on here?



If it's helpful, I read my DataFrame in from a CSV that was UTF-8 encoded.



Thanks!










  • Related: Why won't Python display this text correctly? (UTF-8 Decoding Issue)

    – jpp
    Jan 2 at 23:01











  • @jpp Thanks for the help! Took a look, so maybe (since I'm using Python 2), I shouldn't even be encoding as utf-8 when writing to CSV to begin with?

    – bclayman
    Jan 2 at 23:03











  • Possibly, I'm not really sure tbh, someone with more knowledge of these decoding issues may be able to answer comprehensively.

    – jpp
    Jan 2 at 23:04
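
For what it's worth, the bytes \xe3\x80\x80 are the UTF-8 encoding of U+3000 (IDEOGRAPHIC SPACE), so the name most likely ends with a full-width space rather than an ASCII one. The Series printout probably shows the raw text, which a terminal renders as a blank, while the list shows the escaped repr of each element. Below is a minimal sketch of checking this with the standard library only. The variable names are illustrative and not from the question, and the byte string is reconstructed from the output quoted above.

```python
import unicodedata

# The value reported in the question, written as a bytes literal.
# (Assumption: this is what the "name" cell holds before decoding.)
raw_name = b"John\xe3\x80\x80"

# Decoding as UTF-8 turns the three trailing bytes into one character.
name = raw_name.decode("utf-8")
trailing = name[-1]

print(repr(trailing))              # escaped form of the trailing character
print(unicodedata.name(trailing))  # its official Unicode name
print(trailing.isspace())          # whether Python treats it as whitespace

# Once the text is decoded, strip() removes it like any other whitespace.
cleaned = name.strip()
print(repr(cleaned))
```

If this matches your data, the list output is not corrupted. It is the same trailing character shown in escaped form, and decoding the column to unicode and then stripping it should give plain "John".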
















python pandas






asked Jan 2 at 22:54









bclayman












