Recursive matching for string delimiters with a regular expression
In Verilog, statements are enclosed in begin-end delimiters instead of braces.
    always@ (*) begin
        if (condA) begin
            a = c
        end
        else begin
            b = d
        end
    end
I'd like to extract the outermost begin-end block together with its statements, so that I can check coding rules in Python. Using a regular expression, I want a result like:
    if (condA) begin
        a = c
    end
    else begin
        b = d
    end
I found a similar answer for brace delimiters:
    int funcA() {
        if (condA) {
            b = a
        }
    }
Regular expression:
    /({(?>[^{}]+|(?R))*})/g
However, I don't know how to modify the character class inside the atomic group ([^{}]) for "begin-end":
    /(begin(?>[??????]+|(?R))*end)/g
regex recursion
* is a greedy quantifier. It will find the longest matching sequence. Maybe I misunderstand something, but I think this may work: begin([\s\S]*)end
Check here
– Jónás Balázs
Jan 2 at 11:27
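To illustrate what the greedy suggestion in the comment above does, here is a minimal sketch using Python's standard `re` module. The sample strings are made up for illustration and are not from the question. It suggests that a greedy match runs from the first begin to the last end, so sibling blocks get merged, while a lazy match stops at the first end and so cuts a nested block short.

```python
import re

# Made-up inputs: two sibling blocks, and one block with a nested block.
SIBLINGS = "begin a = c end x = y begin b = d end"
NESTED = "begin if (c) begin a = c end end"

# Greedy: [\s\S]* consumes as much as possible, then backtracks to the last `end`.
greedy = re.search(r"begin([\s\S]*)end", SIBLINGS)

# Lazy: [\s\S]*? consumes as little as possible, stopping at the first `end`.
lazy_siblings = re.search(r"begin([\s\S]*?)end", SIBLINGS)
lazy_nested = re.search(r"begin([\s\S]*?)end", NESTED)

print(greedy.group(0))         # covers both sibling blocks in one match
print(lazy_siblings.group(0))  # only the first sibling block
print(lazy_nested.group(0))    # unbalanced: the outer `end` is missing
```

Neither variant keeps track of nesting, which is why the question asks for recursion.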
asked Jan 2 at 11:15
Jonghun yoo
1 Answer
The point of the [??????]+ part is to match any text that does not contain a char that is equal to, or is the starting point of, one of the delimiters.
So, in your case, you need to match any char that does not start either a begin or an end substring:
    /begin(?>(?!begin|end).|(?R))*end/gs
See the regex demo.
The . here will match any char, including line break chars, due to the s modifier. Note that the actual implementation might need adjustments (e.g. in PHP, the g modifier should not be used as there are specific functions/features for that).
Also, since you recurse the whole pattern, you need no outer parentheses.
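Since the question mentions Python, note that the standard `re` module does not support `(?R)`; the third-party `regex` package is, as far as I know, the usual way to get recursion there. As an alternative that stays within the standard library, the sketch below swaps the recursive pattern for a plain depth counter over begin/end tokens. The helper name `outermost_blocks` is hypothetical, and the sketch assumes the keywords never appear inside comments or string literals.

```python
import re

# Hypothetical helper, not part of the answer above: it replaces the recursive
# pattern with a depth counter, because the stdlib `re` module has no (?R).
# \b keeps keywords such as `endmodule` or `endcase` from counting as `end`.
TOKEN = re.compile(r"\b(begin|end)\b")


def outermost_blocks(source):
    """Return each outermost begin...end block, delimiters included."""
    blocks = []
    depth = 0
    start = None
    for match in TOKEN.finditer(source):
        if match.group(1) == "begin":
            if depth == 0:
                start = match.start()  # remember where the outer block opens
            depth += 1
        elif depth > 0:  # ignore a stray `end` that has no open `begin`
            depth -= 1
            if depth == 0:
                blocks.append(source[start:match.end()])
    return blocks


# The sample from the question.
VERILOG = """always@ (*) begin
if (condA) begin
a = c
end
else begin
b = d
end
end
"""

blocks = outermost_blocks(VERILOG)
for block in blocks:
    print(block)
```

Like the recursive pattern, this returns the outer block with its begin and end keywords; slicing off the first and last keyword would leave only the enclosed statements.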
Awesome! All done, thanks to your help.
– Jonghun yoo
Jan 2 at 11:37
Unrolling the pattern may yield better performance, but it becomes too cryptic and unwieldy, see begin(?:[^be]*(?:(?:b(?!egin)|e(?!nd))[^be]*)*|(?R))*end.
– Wiktor Stribiżew
Jan 2 at 11:44
answered Jan 2 at 11:18
Wiktor Stribiżew