Regex: don't match string ending with newline (\n) with end-of-line anchor ($)

I can't figure out how to match a string only when it has no trailing newline character (\n); the newline seems to be stripped automatically:

import re

print(re.match(r'^foobar$', 'foobar'))
# <_sre.SRE_Match object; span=(0, 6), match='foobar'>

print(re.match(r'^foobar$', 'foobar\n'))
# <_sre.SRE_Match object; span=(0, 6), match='foobar'>

print(re.match(r'^foobar$', 'foobar\n\n'))
# None

I expected the second case to return None as well.

When a pattern ends with $, as in ^foobar$, it should only match a string like foobar, not foobar\n.

What am I missing?

asked Feb 11 '18 at 10:02 by Arthur White, edited Feb 11 '18 at 10:15 by DeepSpace

What exactly is a single line string and how are you reading this? Are you reading a binary file and parsing this yourself?
– Roger Credlin, Feb 11 '18 at 10:06

I removed the "single line" term. It was misleading and added no value to the question. Otherwise, no, I'm not parsing a file. I just need to check the pattern of a simple string, like in my example.
– Arthur White, Feb 11 '18 at 10:11

Tags: python, regex, python-3.x, match, newline


3 Answers

You most likely want \Z rather than $:

>>> print(re.match(r'^foobar\Z', 'foobar\n'))
None

\Z matches only at the end of the string.
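To make the difference concrete, here is a minimal sketch that runs the three strings from the question against $ and \Z. The helper name accepted and the candidates list are made up for illustration. It also tries re.fullmatch, which is not mentioned in this thread but is available in Python 3.4 and later, as one more way to require that the whole string matches.

```python
import re

# The three inputs from the question.
candidates = ["foobar", "foobar\n", "foobar\n\n"]


def accepted(pattern, matcher=re.match):
    """Return the candidate strings that the pattern accepts (illustrative helper)."""
    return [s for s in candidates if matcher(pattern, s)]


# $ also matches just before a single trailing newline.
dollar = accepted(r"^foobar$")

# \Z matches only at the very end of the string.
end_of_string = accepted(r"^foobar\Z")

# re.fullmatch requires the pattern to consume the entire string.
full = accepted(r"foobar", matcher=re.fullmatch)

print(dollar)         # ['foobar', 'foobar\n']
print(end_of_string)  # ['foobar']
print(full)           # ['foobar']
```

Only the $ variant lets the string with one trailing newline through; the other two accept the bare string alone.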

answered Feb 11 '18 at 10:43 by revo

That's the way to go. I suspected the negative lookahead was a dirty hack for this. Thank you!
– Arthur White, Feb 11 '18 at 11:39

This is the defined behavior of $, as can be read in the docs that @zvone linked to, or even on https://regex101.com:

$ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)

You can use an explicit negative lookahead to counter this behavior:

import re

print(re.match(r'^foobar(?!\n)$', 'foobar'))
# <_sre.SRE_Match object; span=(0, 6), match='foobar'>

print(re.match(r'^foobar(?!\n)$', 'foobar\n'))
# None

print(re.match(r'^foobar(?!\n)$', 'foobar\n\n'))
# None
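The same lookahead can be put in front of $ in other patterns. The sketch below is only an illustration under assumed names: DIGITS_STRICT, DIGITS_LOOSE and the two helper functions are hypothetical and not part of the question. It shows a digits-only check that rejects a trailing newline next to one that silently accepts it.

```python
import re

# Hypothetical validators: both require digits only, but only the strict
# one refuses a trailing newline, thanks to the (?!\n) lookahead before $.
DIGITS_STRICT = re.compile(r"^\d+(?!\n)$")
DIGITS_LOOSE = re.compile(r"^\d+$")


def is_strict_number(text):
    """True if text is digits only, with nothing after them."""
    return DIGITS_STRICT.match(text) is not None


def is_loose_number(text):
    """True if text is digits only, optionally followed by one newline."""
    return DIGITS_LOOSE.match(text) is not None


print(is_loose_number("123\n"))   # True
print(is_strict_number("123\n"))  # False
print(is_strict_number("123"))    # True
```

This matters when the checked value is later stored or compared as-is, since the loose check would let the newline slip through.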

answered Feb 11 '18 at 10:09 by DeepSpace, edited Feb 11 '18 at 10:19

The documentation says this about the $ character:

Matches the end of the string or just before the newline at the end of
the string, and in MULTILINE mode also matches before a newline.

So, without the MULTILINE option, it matches exactly the first two strings you tried, 'foobar' and 'foobar\n', but not 'foobar\n\n', because the newline that follows foobar there is not the newline at the end of the string.

On the other hand, if you choose the MULTILINE option, it will match at the end of any line:

>>> re.match(r'^foobar$', 'foobar\n\n', re.MULTILINE)
<_sre.SRE_Match object; span=(0, 6), match='foobar'>

Of course, this will also match in the following case, which may or may not be what you want:

>>> re.match(r'^foobar$', 'foobar\nanother line\n', re.MULTILINE)
<_sre.SRE_Match object; span=(0, 6), match='foobar'>

In order to NOT match the ending newline, use the negative lookahead as DeepSpace wrote.
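One way to see the rule from the documentation in action is to list every position at which the anchor can match. The sketch below does that with re.finditer; the helper name anchor_positions is made up for illustration, and the \Z line is included only for comparison.

```python
import re


def anchor_positions(anchor, text, flags=0):
    """Return every index in text where the zero-width anchor matches (illustrative helper)."""
    return [m.start() for m in re.finditer(anchor, text, flags)]


# Default mode: $ matches at the end and before one final newline.
print(anchor_positions(r"$", "foobar"))        # [6]
print(anchor_positions(r"$", "foobar\n"))      # [6, 7]
print(anchor_positions(r"$", "foobar\n\n"))    # [7, 8]

# MULTILINE: $ also matches before every newline.
print(anchor_positions(r"$", "foobar\n\n", re.MULTILINE))  # [6, 7, 8]

# \Z matches at the very end only.
print(anchor_positions(r"\Z", "foobar\n\n"))   # [8]
```

In 'foobar\n\n' without MULTILINE, position 6 (right after foobar) is not in the list, which is why ^foobar$ fails on that string.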

answered Feb 11 '18 at 10:10 by zvone, edited Feb 11 '18 at 10:20

@DeepSpace You are right, I did not see that point...
– zvone, Feb 11 '18 at 10:15
