How do I write code to determine whether the EOL character in a CSV file is `\r` or `\n` without looking at the...



























I'm using Python in Jupyter Notebooks to work with a CSV file. I'm writing the same code in two different versions of Jupyter Notebook: one running directly on my computer and another running in a kind of emulator within an online lesson from Dataquest. When I open the CSV file and read it into a string in my computer's Jupyter Notebook, the EOL character is \r, but when I do the same in Dataquest's emulator, the EOL character is \n. I have two questions:




  1. Why does this happen?


  2. How can I write Python code that tests for the EOL character without opening the file to find out visually?



This code is in a Jupyter notebook on my own Mac.



f = open('US_births_1994-2003_CDC_NCHS.csv', 'r')
data_MyComp = f.read()
data_MyComp


This code is on Dataquest's Jupyter notebook browser emulator.



f = open('US_births_1994-2003_CDC_NCHS.csv', 'r')
data_dataquest = f.read()
data_dataquest


These are a few lines of output from my computer when I run data_MyComp (note the EOL character is \r).



'year,month,date_of_month,day_of_week,births\r1994,1,1,6,8096\r1994,1,2,7,7772\r1994,1,3,1,10142\r1994,1,4,2,11248\r1994,1,5,3,11053\r1994,1,6,4,11406\r1994,1,7,5,11251\r1994,1,8,6,8653\r1994,1,9,7,7910\r1994,1,10,1,10498\r1994,1,11,2,11706\r


These are a few lines of output from the Dataquest emulator when I run data_dataquest (note the EOL character is \n).



'year,month,date_of_month,day_of_week,births\n1994,1,1,6,8096\n1994,1,2,7,7772\n1994,1,3,1,10142\n1994,1,4,2,11248\n1994,1,5,3,11053\n1994,1,6,4,11406\n





























  • 1





    docs.python.org/3/library/functions.html#open the newline flag handles that for you, or am I missing something?

    – aws_apprentice
    Jan 2 at 0:06








  • 1





    I suppose "opening the file" really means "manual inspection" here. In order to process the contents of a file you have to open() it.

    – tripleee
    Jan 2 at 0:12








  • 1





    Is your own computer by any chance running Windows? How exactly are you making the file available to Jupyter?

    – tripleee
    Jan 2 at 0:14








  • 1





    If you just want to read the CSV file, use the csv module from the standard library. It should properly handle the line endings on its own.

    – mkrieger1
    Jan 2 at 0:45













  • @tripleee Yes, I mean "manual inspection". Thanks for clarifying.

    – user10200596
    Jan 2 at 5:20






















python eol end-of-line














edited Jan 2 at 5:30









tripleee











asked Jan 2 at 0:03









user10200596




















1 Answer
































Without any indication of how you downloaded or otherwise made the file available to Python and Jupyter, we can't really tell why this is happening. Line endings are platform-specific but Python 3 should generally neutralize differences between platforms unless you specifically request opening a file as "binary".
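As a rough illustration of that translation, here is a minimal sketch for Python 3. The sample file and its contents are made up for the demonstration; `newline` is a documented parameter of the built-in `open()`, and `newlines` is a documented attribute of text-mode file objects. It writes a small file with bare CR line endings, reads it once in the default text mode and once with translation switched off, and records which line ending Python actually saw.

```python
import os
import tempfile

# Hypothetical sample data with bare CR line endings, written as raw bytes.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "wb") as f:
    f.write(b"year,births\r1994,8096\r1994,7772")

# Default text mode (newline=None): CR, LF and CRLF are all translated to "\n".
with open(path, "r") as f:
    translated = f.read()
    seen = f.newlines  # the line ending(s) encountered while reading

# newline="" turns translation off, so the original characters come through.
with open(path, "r", newline="") as f:
    untranslated = f.read()

print(repr(translated))
print(repr(seen))
print(repr(untranslated))
```

Checking `f.newlines` after reading is one way to learn the line ending without inspecting the file by eye, assuming the file is opened in the default text mode.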



You can discover the line-ending conventions by simply opening the file and reading enough of it. What's "enough" depends on the file type. Perhaps something like this in your case:



with open('US_births_1994-2003_CDC_NCHS.csv', 'rb') as peek:
    # Binary mode returns raw bytes, so no newline translation takes place.
    buf = peek.read(1024)
    if b'\r\n' in buf:
        print("DOS CR/LF line terminator")
    elif b'\r' in buf:
        print("Plain CR seen (legacy Mac or CP/M file)?")
    elif b'\n' in buf:
        print("Plain LF seen (standard Unix text file)")


This doesn't attempt to do any statistical analysis, but might work well enough for your limited case. The file will be closed again after the end of the with block so you can then just open it a second time with the parameters you actually need.
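If a simple count is wanted instead of a first-match test, something along these lines might do. This is only a sketch: `detect_line_ending` is a hypothetical helper name, not part of any library, and it reports whichever terminator occurs most often in the bytes it is given.

```python
def detect_line_ending(data: bytes) -> str:
    """Return a label for the most common line terminator in `data`."""
    crlf = data.count(b"\r\n")
    # Subtract CRLF pairs so their halves are not also counted as bare CR or LF.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    counts = {"CRLF": crlf, "CR": cr, "LF": lf}
    label = max(counts, key=counts.get)
    # If no terminator occurs at all, say so instead of guessing.
    return label if counts[label] > 0 else "none"


print(detect_line_ending(b"a,b\r1,2\r3,4\r"))
```

To apply it to a real file, pass it the bytes read in binary mode, for example the `buf` value from the snippet above.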














        edited Jan 2 at 5:44

























        answered Jan 2 at 5:35









        tripleee































