How do I write code to determine whether the EOL character in a CSV file is `\r` or `\n` without looking at the...

I'm using Python in Jupyter Notebook to work with a CSV file. I'm running the same code in two different Jupyter Notebook environments: one running directly on my computer and another running in a kind of emulator within an online lesson from Dataquest. When I open the CSV file and read it into a string in my computer's Jupyter Notebook, the EOL character is \r, but when I do the same in Dataquest's emulator, the EOL character is \n. I have two questions:

Why does this happen?

How can I write Python code that tests for the EOL character, without opening the file to inspect it visually?

This code is in a Jupyter notebook on my own Mac.

f = open('US_births_1994-2003_CDC_NCHS.csv', 'r')
data_MyComp = f.read()
data_MyComp

This code is on Dataquest's Jupyter notebook browser emulator.

f = open('US_births_1994-2003_CDC_NCHS.csv', 'r')
data_dataquest = f.read()
data_dataquest

These are a few lines of output from my computer when I run data_MyComp (note the EOL character is \r).

'year,month,date_of_month,day_of_week,births\r1994,1,1,6,8096\r1994,1,2,7,7772\r1994,1,3,1,10142\r1994,1,4,2,11248\r1994,1,5,3,11053\r1994,1,6,4,11406\r1994,1,7,5,11251\r1994,1,8,6,8653\r1994,1,9,7,7910\r1994,1,10,1,10498\r1994,1,11,2,11706\r

These are a few lines of output from the Dataquest emulator when I run data_dataquest (note the EOL character is \n).

'year,month,date_of_month,day_of_week,births\n1994,1,1,6,8096\n1994,1,2,7,7772\n1994,1,3,1,10142\n1994,1,4,2,11248\n1994,1,5,3,11053\n1994,1,6,4,11406\n

edited Jan 2 at 5:30

tripleee


asked Jan 2 at 0:03

user10200596

1

docs.python.org/3/library/functions.html#open the newline flag handles that for you, or am I missing something?

– aws_apprentice
Jan 2 at 0:06

1

I suppose "opening the file" really means "manual inspection" here. In order to process the contents of a file you have to open() it.

– tripleee
Jan 2 at 0:12

1

Is your own computer by any chance running Windows? How exactly are you making the file available to Jupyter?

– tripleee
Jan 2 at 0:14

1

If you just want to read the CSV file, use the csv module from the standard library. It should properly handle the line endings on its own.

– mkrieger1
Jan 2 at 0:45

@tripleee Yes, I mean "manual inspection". Thanks for clarifying.

– user10200596
Jan 2 at 5:20


1 Answer

Without any indication of how you downloaded or otherwise made the file available to Python and Jupyter, we can't really tell why this is happening. Line endings are platform-specific but Python 3 should generally neutralize differences between platforms unless you specifically request opening a file as "binary".
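To make that normalization concrete, here is a minimal sketch of how Python 3's text mode treats a CR-terminated file. The helper name `read_both_ways` and the sample bytes are made up for illustration; the only real API relied on is the `newline` parameter of the built-in `open()`.

```python
import os
import tempfile


def read_both_ways(raw: bytes):
    """Write raw bytes to a temp file and read them back in text mode twice:
    once with newline translation (the default) and once without."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Default (newline=None): '\r', '\n' and '\r\n' are all returned as '\n'.
        with open(path, "r", encoding="utf-8") as f:
            translated = f.read()
        # newline='' switches translation off, so the original terminators survive.
        with open(path, "r", encoding="utf-8", newline="") as f:
            untranslated = f.read()
    finally:
        os.remove(path)
    return translated, untranslated


# Hypothetical sample data using bare CR terminators.
translated, untranslated = read_both_ways(b"year,births\r1994,8096\r")
print(repr(translated))
print(repr(untranslated))
```

If a notebook shows bare `\r` characters after a plain `open(..., 'r')`, that suggests it may not be running Python 3 with default settings, which is worth checking before anything else.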

You can discover the line-ending conventions by simply opening the file and reading enough of it. What's "enough" depends on the file type. Perhaps something like this in your case:

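with open('US_births_1994-2003_CDC_NCHS.csv', 'rb') as peek:
    # Read a sample of raw bytes; binary mode avoids any newline translation
    buf = peek.read(1024)
    # Test for CR/LF first, because it would also match the CR and LF checks
    if b'\r\n' in buf:
        print("DOS CR/LF line terminator")
    elif b'\r' in buf:
        print("Plain CR seen (legacy Mac or CP/M file)?")
    elif b'\n' in buf:
        print("Plain LF seen (standard Unix text file)")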

This doesn't attempt to do any statistical analysis, but might work well enough for your limited case. The file will be closed again after the end of the with block so you can then just open it a second time with the parameters you actually need.
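If you do want something slightly more robust than "first match wins", one possible variation is to count each terminator in the sample and report the most common one. This is only a sketch: the function name `detect_eol`, the `sample_size` parameter, and the demo data are hypothetical, and a CR/LF pair split exactly at the end of the sample would be miscounted by one.

```python
import os
import tempfile


def detect_eol(path, sample_size=64 * 1024):
    """Return the most common line terminator in the first sample_size bytes
    of the file, or None if no terminator is found."""
    with open(path, "rb") as f:
        buf = f.read(sample_size)
    crlf = buf.count(b"\r\n")
    counts = {
        "\r\n": crlf,
        # Subtract the CR/LF pairs so they are not counted twice.
        "\r": buf.count(b"\r") - crlf,
        "\n": buf.count(b"\n") - crlf,
    }
    eol, n = max(counts.items(), key=lambda kv: kv[1])
    return eol if n > 0 else None


# Demo on a throwaway file with bare CR terminators.
fd, demo_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(b"year,births\r1994,8096\r1994,7772\r")
detected = detect_eol(demo_path)
os.remove(demo_path)
print(repr(detected))
```

The returned string can then be passed on, for example as the separator in `data.split(detected)`, once the file has been read with `newline=''`.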

edited Jan 2 at 5:44

answered Jan 2 at 5:35

tripleee
