Why is there no permutation in Regexes? (Even if regular languages seem to be able to do this)
up vote
9
down vote
favorite
The Problem
There is no easy way to get a permutation with a regex.
Permutation: Getting a word $$w=x_1…x_n$$ ("aabc") to another order, without changing number or kind of letters.
Regex: Regular expression.
For verification:
"Regex permutations without repetition" The answer creates JavaScript code instead of a regex, assuming this would be more simple.
"How to find all permutations of a given word in a given text" – The answer doesn't use regexes either.
"Regex to match all {1, 2, 3, 4} without repetition" – The answer uses regexes, but it's neither adaptable nor simple.- This answer even claims: "A regular expression cannot do what you're asking for. It cannot generate permutations from a string".
The kind of solution I am searching for
It should have the form:
- »aabc« (or anything else you could use a opening and closing parentheses)
- (aabc)! (similar to (abc)? but with another symbol in the end)
- [aabc]! (similar to [abc]+ but with another symbol in the end)
Advantages of these solutions
They are:
- easy
- adaptable
- reusable
Why this should exist
- Regexes are a way to describe a grammar of a regular language. They have the full power to be any kind of regular language.
- Let's say, regular languages are powerful enough for permutations (proof below) – why is there no easy way to express this?
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
The proof
- Regular expressions are one way to note the grammar of a regular language. They can describe any regular languages grammar.
- Another way to describe any regular languages (that have a finite number of letters within their alphabet) grammar are non-deterministic Automatons (with a finite number of states).
Having a finite number of letters I can create this automaton: (Example. Formal: see below)
Grammar that accepts permutations of "abbc":
(sry for numbers on top, maybe someone knows how to make this part looking better)
s -> ah¹
s -> bh²
s -> ch³
h¹ -> bh¹¹
h¹ -> ch¹²
h² -> ah¹¹ (no typo! equivalence)
h² -> bh²²
h² -> ch²³
h³ -> ah¹²
h³ -> bh²³
h¹¹ -> bc
h¹¹ -> cb
h¹² -> bb
h²² -> ac
h²² -> ca
h²³ -> ab
h²³ -> ba
More formal: (using a finite-state-automaton but this could be made with grammar as well)
- A word q (with finite length) to which any permutation should reach an accepting state.
- X is the finite alphabet.
- Set of states S contains any order of letters up to the length of q. (So the size of S is finite.) Plus one state of "any longer word".
- state transition function d which takes a letter and moves on the state that corresponds to the now read part of the word.
- F is a set of that states that are exact permutations of q.
So it is possible to create a finite-state automaton for accepting permutations of a given word.
Moving on with the proof
So I have proven that regular languages have the power to check for permutations, haven't I?
So why is there no approach to reach this with Regexes? It's a useful functionality.
formal-languages regular-languages regular-expressions
New contributor
add a comment |
up vote
9
down vote
favorite
The Problem
There is no easy way to get a permutation with a regex.
Permutation: Getting a word $$w=x_1…x_n$$ ("aabc") to another order, without changing number or kind of letters.
Regex: Regular expression.
For verification:
"Regex permutations without repetition" The answer creates JavaScript code instead of a regex, assuming this would be more simple.
"How to find all permutations of a given word in a given text" – The answer doesn't use regexes either.
"Regex to match all {1, 2, 3, 4} without repetition" – The answer uses regexes, but it's neither adaptable nor simple.- This answer even claims: "A regular expression cannot do what you're asking for. It cannot generate permutations from a string".
The kind of solution I am searching for
It should have the form:
- »aabc« (or anything else you could use a opening and closing parentheses)
- (aabc)! (similar to (abc)? but with another symbol in the end)
- [aabc]! (similar to [abc]+ but with another symbol in the end)
Advantages of these solutions
They are:
- easy
- adaptable
- reusable
Why this should exist
- Regexes are a way to describe a grammar of a regular language. They have the full power to be any kind of regular language.
- Let's say, regular languages are powerful enough for permutations (proof below) – why is there no easy way to express this?
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
The proof
- Regular expressions are one way to note the grammar of a regular language. They can describe any regular languages grammar.
- Another way to describe any regular languages (that have a finite number of letters within their alphabet) grammar are non-deterministic Automatons (with a finite number of states).
Having a finite number of letters I can create this automaton: (Example. Formal: see below)
Grammar that accepts permutations of "abbc":
(sry for numbers on top, maybe someone knows how to make this part looking better)
s -> ah¹
s -> bh²
s -> ch³
h¹ -> bh¹¹
h¹ -> ch¹²
h² -> ah¹¹ (no typo! equivalence)
h² -> bh²²
h² -> ch²³
h³ -> ah¹²
h³ -> bh²³
h¹¹ -> bc
h¹¹ -> cb
h¹² -> bb
h²² -> ac
h²² -> ca
h²³ -> ab
h²³ -> ba
More formal: (using a finite-state-automaton but this could be made with grammar as well)
- A word q (with finite length) to which any permutation should reach an accepting state.
- X is the finite alphabet.
- Set of states S contains any order of letters up to the length of q. (So the size of S is finite.) Plus one state of "any longer word".
- state transition function d which takes a letter and moves on the state that corresponds to the now read part of the word.
- F is a set of that states that are exact permutations of q.
So it is possible to create a finite-state automaton for accepting permutations of a given word.
Moving on with the proof
So I have proven that regular languages have the power to check for permutations, haven't I?
So why is there no approach to reach this with Regexes? It's a useful functionality.
formal-languages regular-languages regular-expressions
New contributor
9
You can list all permutations of your word with a regular expression. The resulting expression will be quite large, but will definitely be a regular expression.
– Yuval Filmus
2 days ago
6
I suggest ignoring all answers about Theory of Computation on stackoverflow. This isn't that site's specialty.
– Yuval Filmus
2 days ago
The answer on your linked page here - stackoverflow.com/a/3102205/6936386 - seems to be easily adaptable and not too complicated:^(a()|a()|b()|c()){4}2345$
seems to work (see regex101.com/r/9URPpg/4/tests).
– boboquack
yesterday
5
@boboquack That isn't a regular expression in the sense in which the term is used in computer science. (This sort of thing is exactly why Yuval suggests not trusting Stack Overflow answers about theoretical CS.)
– David Richerby
21 hours ago
add a comment |
up vote
9
down vote
favorite
up vote
9
down vote
favorite
The Problem
There is no easy way to get a permutation with a regex.
Permutation: Getting a word $$w=x_1…x_n$$ ("aabc") to another order, without changing number or kind of letters.
Regex: Regular expression.
For verification:
"Regex permutations without repetition" The answer creates JavaScript code instead of a regex, assuming this would be more simple.
"How to find all permutations of a given word in a given text" – The answer doesn't use regexes either.
"Regex to match all {1, 2, 3, 4} without repetition" – The answer uses regexes, but it's neither adaptable nor simple.- This answer even claims: "A regular expression cannot do what you're asking for. It cannot generate permutations from a string".
The kind of solution I am searching for
It should have the form:
- »aabc« (or anything else you could use a opening and closing parentheses)
- (aabc)! (similar to (abc)? but with another symbol in the end)
- [aabc]! (similar to [abc]+ but with another symbol in the end)
Advantages of these solutions
They are:
- easy
- adaptable
- reusable
Why this should exist
- Regexes are a way to describe a grammar of a regular language. They have the full power to be any kind of regular language.
- Let's say, regular languages are powerful enough for permutations (proof below) – why is there no easy way to express this?
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
The proof
- Regular expressions are one way to note the grammar of a regular language. They can describe any regular languages grammar.
- Another way to describe any regular languages (that have a finite number of letters within their alphabet) grammar are non-deterministic Automatons (with a finite number of states).
Having a finite number of letters I can create this automaton: (Example. Formal: see below)
Grammar that accepts permutations of "abbc":
(sry for numbers on top, maybe someone knows how to make this part looking better)
s -> ah¹
s -> bh²
s -> ch³
h¹ -> bh¹¹
h¹ -> ch¹²
h² -> ah¹¹ (no typo! equivalence)
h² -> bh²²
h² -> ch²³
h³ -> ah¹²
h³ -> bh²³
h¹¹ -> bc
h¹¹ -> cb
h¹² -> bb
h²² -> ac
h²² -> ca
h²³ -> ab
h²³ -> ba
More formal: (using a finite-state-automaton but this could be made with grammar as well)
- A word q (with finite length) to which any permutation should reach an accepting state.
- X is the finite alphabet.
- Set of states S contains any order of letters up to the length of q. (So the size of S is finite.) Plus one state of "any longer word".
- state transition function d which takes a letter and moves on the state that corresponds to the now read part of the word.
- F is a set of that states that are exact permutations of q.
So it is possible to create a finite-state automaton for accepting permutations of a given word.
Moving on with the proof
So I have proven that regular languages have the power to check for permutations, haven't I?
So why is there no approach to reach this with Regexes? It's a useful functionality.
formal-languages regular-languages regular-expressions
New contributor
The Problem
There is no easy way to get a permutation with a regex.
Permutation: Getting a word $$w=x_1…x_n$$ ("aabc") to another order, without changing number or kind of letters.
Regex: Regular expression.
For verification:
"Regex permutations without repetition" The answer creates JavaScript code instead of a regex, assuming this would be more simple.
"How to find all permutations of a given word in a given text" – The answer doesn't use regexes either.
"Regex to match all {1, 2, 3, 4} without repetition" – The answer uses regexes, but it's neither adaptable nor simple.- This answer even claims: "A regular expression cannot do what you're asking for. It cannot generate permutations from a string".
The kind of solution I am searching for
It should have the form:
- »aabc« (or anything else you could use a opening and closing parentheses)
- (aabc)! (similar to (abc)? but with another symbol in the end)
- [aabc]! (similar to [abc]+ but with another symbol in the end)
Advantages of these solutions
They are:
- easy
- adaptable
- reusable
Why this should exist
- Regexes are a way to describe a grammar of a regular language. They have the full power to be any kind of regular language.
- Let's say, regular languages are powerful enough for permutations (proof below) – why is there no easy way to express this?
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
The proof
- Regular expressions are one way to note the grammar of a regular language. They can describe any regular languages grammar.
- Another way to describe any regular languages (that have a finite number of letters within their alphabet) grammar are non-deterministic Automatons (with a finite number of states).
Having a finite number of letters I can create this automaton: (Example. Formal: see below)
Grammar that accepts permutations of "abbc":
(sry for numbers on top, maybe someone knows how to make this part looking better)
s -> ah¹
s -> bh²
s -> ch³
h¹ -> bh¹¹
h¹ -> ch¹²
h² -> ah¹¹ (no typo! equivalence)
h² -> bh²²
h² -> ch²³
h³ -> ah¹²
h³ -> bh²³
h¹¹ -> bc
h¹¹ -> cb
h¹² -> bb
h²² -> ac
h²² -> ca
h²³ -> ab
h²³ -> ba
More formal: (using a finite-state-automaton but this could be made with grammar as well)
- A word q (with finite length) to which any permutation should reach an accepting state.
- X is the finite alphabet.
- Set of states S contains any order of letters up to the length of q. (So the size of S is finite.) Plus one state of "any longer word".
- state transition function d which takes a letter and moves on the state that corresponds to the now read part of the word.
- F is a set of that states that are exact permutations of q.
So it is possible to create a finite-state automaton for accepting permutations of a given word.
Moving on with the proof
So I have proven that regular languages have the power to check for permutations, haven't I?
So why is there no approach to reach this with Regexes? It's a useful functionality.
formal-languages regular-languages regular-expressions
formal-languages regular-languages regular-expressions
New contributor
New contributor
edited 2 days ago
Glorfindel
1741110
1741110
New contributor
asked 2 days ago
Asqiir
1518
1518
New contributor
New contributor
9
You can list all permutations of your word with a regular expression. The resulting expression will be quite large, but will definitely be a regular expression.
– Yuval Filmus
2 days ago
6
I suggest ignoring all answers about Theory of Computation on stackoverflow. This isn't that site's specialty.
– Yuval Filmus
2 days ago
The answer on your linked page here - stackoverflow.com/a/3102205/6936386 - seems to be easily adaptable and not too complicated:^(a()|a()|b()|c()){4}2345$
seems to work (see regex101.com/r/9URPpg/4/tests).
– boboquack
yesterday
5
@boboquack That isn't a regular expression in the sense in which the term is used in computer science. (This sort of thing is exactly why Yuval suggests not trusting Stack Overflow answers about theoretical CS.)
– David Richerby
21 hours ago
add a comment |
9
You can list all permutations of your word with a regular expression. The resulting expression will be quite large, but will definitely be a regular expression.
– Yuval Filmus
2 days ago
6
I suggest ignoring all answers about Theory of Computation on stackoverflow. This isn't that site's specialty.
– Yuval Filmus
2 days ago
The answer on your linked page here - stackoverflow.com/a/3102205/6936386 - seems to be easily adaptable and not too complicated:^(a()|a()|b()|c()){4}2345$
seems to work (see regex101.com/r/9URPpg/4/tests).
– boboquack
yesterday
5
@boboquack That isn't a regular expression in the sense in which the term is used in computer science. (This sort of thing is exactly why Yuval suggests not trusting Stack Overflow answers about theoretical CS.)
– David Richerby
21 hours ago
9
9
You can list all permutations of your word with a regular expression. The resulting expression will be quite large, but will definitely be a regular expression.
– Yuval Filmus
2 days ago
You can list all permutations of your word with a regular expression. The resulting expression will be quite large, but will definitely be a regular expression.
– Yuval Filmus
2 days ago
6
6
I suggest ignoring all answers about Theory of Computation on stackoverflow. This isn't that site's specialty.
– Yuval Filmus
2 days ago
I suggest ignoring all answers about Theory of Computation on stackoverflow. This isn't that site's specialty.
– Yuval Filmus
2 days ago
The answer on your linked page here - stackoverflow.com/a/3102205/6936386 - seems to be easily adaptable and not too complicated:
^(a()|a()|b()|c()){4}2345$
seems to work (see regex101.com/r/9URPpg/4/tests).– boboquack
yesterday
The answer on your linked page here - stackoverflow.com/a/3102205/6936386 - seems to be easily adaptable and not too complicated:
^(a()|a()|b()|c()){4}2345$
seems to work (see regex101.com/r/9URPpg/4/tests).– boboquack
yesterday
5
5
@boboquack That isn't a regular expression in the sense in which the term is used in computer science. (This sort of thing is exactly why Yuval suggests not trusting Stack Overflow answers about theoretical CS.)
– David Richerby
21 hours ago
@boboquack That isn't a regular expression in the sense in which the term is used in computer science. (This sort of thing is exactly why Yuval suggests not trusting Stack Overflow answers about theoretical CS.)
– David Richerby
21 hours ago
add a comment |
3 Answers
3
active
oldest
votes
up vote
31
down vote
accepted
The fundamental theorems of formal language theory are that regular expressions, regular grammars, deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs) all describe the same kinds of languages: namely the regular languages. The fact that we can describe these languages in so many completely different ways suggests that there's something natural and important about these languages, in the same way that the equivalence of Turing machines, the lambda calculus and all kinds of other things suggests that the computable languages are natural and important. They're not just an artifact of whatever random decisions the original discoverer made.
Suppose we add a new rule for creating regular expressions: if $R$ is a regular expression, then $pi(R)$ is a regular expression, and it matches every permutation of every string matched by $R$. So, for example, $L(pi(abc)) = {abc, acb, bac, bca, cab, cba}$. The problem is that this breaks the fundamental equivalences described above. $Lbig(pi((ab)^*))big)$ is the language of strings that contain an equal number of $a$s and $b$s and this isn't a regular language. Compare this with, for example, adding a negation or reversal operator to regular expressions, which doesn't change the class of languages that are accepted.
So, to answer the title question, regular expressions can't do permutations and we don't add that ability because then regular expressions wouldn't match regular languages. Having said that, it's possible that "regular expressions with permutations" would also be an interesting class of languages with lots of different characterizations.
But L((ab)*) is no regular language either – so L(perm((ab)*)) can't be one. ( (ab)* is no regular language since there is no kind of memory to remember how many opening "a"s there are, so with a finite number of states you cannot put the same number of "b"s.)
– Asqiir
2 days ago
7
@Asqiir $L((ab)^*)$ is regular because it's the language matched by the given regular expression. You seem to have misunderstood what the language is. It's ${varepsilon, ab, abab, ababab, abababab, dots}$, not ${varepsilon,ab,aabb,aaabbb,aaaabbbb,dots}$. The latter language isn't regular but that's not the language we're talking about.
– David Richerby
2 days ago
3
@Asqiir It is that language because it's what you get by applying any permutation you want to a language in which every string contained the same number of $a$s and $b$s; it's not regular because you can prove that using the pumping lemma.
– David Richerby
2 days ago
1
You are completely right. I missed the point of "putting regular expressions into each other", I only thought about "permutating a fixed word" not "permutating another regex" which of course isn't possible.
– Asqiir
2 days ago
1
Perhaps regular expressions with permutations describe a class of languages with interesting properties, but I've never run into need for the!
operator in practice, and I suppose few people have, as it's easy to implement, and no implementation of extended regular expressions I've seen supports it.
– reinierpost
21 hours ago
|
show 3 more comments
up vote
13
down vote
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
Your "proof" only looked at permutations of single words, which are finite languages.
Every finite language is regular (e.g. just by listing all of the members with a |
inbetween), but there are infinite regular languages (and those are generally the more interesting ones).
As soon as you get a regular expression (or grammar/automaton) which accepts an infinite language (i.e. an expression with the *
operator, or an automaton with a loop), your construction doesn't work anymore (you get an infinite grammar/automaton).
The answer by David Richerby provided an example of a regular language whose permutation language is not regular anymore – all such examples are infinite languages.
add a comment |
up vote
7
down vote
Let $Sigma$ be an alphabet of size $n$. A regular expression describing all permutations of $Sigma$ must have exponential size. This follows from Theorem 9 in Lower bounds for context-free grammars, which gives an exponential lower bound on the much stronger model of context-free grammars (a regular expression of size $m$ can be converted to a context-free grammar of size $O(m)$).
So in some sense, there is no succinct way to specify all permutations of a word.
Here is a simple proof for a $tildeOmega(2^n)$ lower bound on the size of a regular expression for the language of all permutations of an alphabet $Sigma$ of size $n$. Since a regular expression of size $m$ can be converted to an NFA with $O(m)$ states, it suffices to give a lower bound on NFAs.
We will use the fooling set method, which we now describe. Let $L$ be a language, and let $(x_i,y_i)_{1 leq i leq N}$ be a collection of pairs of words such that:
$x_iy_i in L$.- If $i neq j$ then either $x_i y_j notin L$ or $x_j y_i notin L$.
Then every NFA for $L$ contains at least $N$ states. Indeed, consider any NFA for $L$. For each $i$, consider some accepting computation of $x_iy_i$, and let $q_i$ be the state that the NFA is in after reading $x_i$ in this accepting computation. I claim that $q_i neq q_j$ for $i neq j$. Indeed, if $q_i = q_j$ then a "cut-and-paste" argument shows that both $x_iy_j$ and $x_jy_i$ are in $L$, contrary to assumption.
Now we can prove the lower bound on NFAs for the language $L_n$ of all permutations of the symbols $sigma_1,ldots,sigma_n$; we assume for simplicity of exposition that $n$ is even. For each subset $S$ of $sigma_1,ldots,sigma_n$ of size $n/2$, let $x_S$ consist of all symbols in $S$ in some arbitrary order, and let $y_S$ consist of all symbols not in $S$ in some arbitrary order. Clearly $x_Sy_S in L_n$. Conversely, if $S neq T$ then $x_S y_T notin L_n$. This shows that every NFA for $L_n$ contains at least $binom{n}{n/2} = Omega(2^n/sqrt{n})$ many states.
Does this mean 1) in theory it would be possible to let »abc« match all {abc, acb, bac, bca, cab, cba} but it's just not efficient and would make them too slow since »abc« would expand exponentially to (abc|acb|bac|bca|cab|cba)? or 2) The kind of automaton I need is not able to specify all permutations for a given word?
– Asqiir
2 days ago
1
Here is a regular expression that matches all permutations of $abc$: $abc+acd+bac+bca+cab+cba$. You can easily convert this to a DFA having $1+3+6+6+1 = 17$ states. The same works for $abcdefghij$.
– Yuval Filmus
2 days ago
1
What I understood: In theory, regular languages are able to accept permutations (so are regular expressions). There is just no "simple way" to write "permutation of abc" like »abc«. (For whatever reasons.)
– Asqiir
2 days ago
1
Yes, that's a good summary. I'll see if I can come up with a simpler argument for regular expressions.
– Yuval Filmus
2 days ago
I added a simple argument using the fooling set method.
– Yuval Filmus
2 days ago
|
show 1 more comment
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
31
down vote
accepted
The fundamental theorems of formal language theory are that regular expressions, regular grammars, deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs) all describe the same kinds of languages: namely the regular languages. The fact that we can describe these languages in so many completely different ways suggests that there's something natural and important about these languages, in the same way that the equivalence of Turing machines, the lambda calculus and all kinds of other things suggests that the computable languages are natural and important. They're not just an artifact of whatever random decisions the original discoverer made.
Suppose we add a new rule for creating regular expressions: if $R$ is a regular expression, then $pi(R)$ is a regular expression, and it matches every permutation of every string matched by $R$. So, for example, $L(pi(abc)) = {abc, acb, bac, bca, cab, cba}$. The problem is that this breaks the fundamental equivalences described above. $Lbig(pi((ab)^*))big)$ is the language of strings that contain an equal number of $a$s and $b$s and this isn't a regular language. Compare this with, for example, adding a negation or reversal operator to regular expressions, which doesn't change the class of languages that are accepted.
So, to answer the title question, regular expressions can't do permutations and we don't add that ability because then regular expressions wouldn't match regular languages. Having said that, it's possible that "regular expressions with permutations" would also be an interesting class of languages with lots of different characterizations.
But L((ab)*) is no regular language either – so L(perm((ab)*)) can't be one. ( (ab)* is no regular language since there is no kind of memory to remember how many opening "a"s there are, so with a finite number of states you cannot put the same number of "b"s.)
– Asqiir
2 days ago
7
@Asqiir $L((ab)^*)$ is regular because it's the language matched by the given regular expression. You seem to have misunderstood what the language is. It's ${varepsilon, ab, abab, ababab, abababab, dots}$, not ${varepsilon,ab,aabb,aaabbb,aaaabbbb,dots}$. The latter language isn't regular but that's not the language we're talking about.
– David Richerby
2 days ago
3
@Asqiir It is that language because it's what you get by applying any permutation you want to a language in which every string contained the same number of $a$s and $b$s; it's not regular because you can prove that using the pumping lemma.
– David Richerby
2 days ago
1
You are completely right. I missed the point of "putting regular expressions into each other", I only thought about "permutating a fixed word" not "permutating another regex" which of course isn't possible.
– Asqiir
2 days ago
1
Perhaps regular expressions with permutations describe a class of languages with interesting properties, but I've never run into need for the!
operator in practice, and I suppose few people have, as it's easy to implement, and no implementation of extended regular expressions I've seen supports it.
– reinierpost
21 hours ago
|
show 3 more comments
up vote
31
down vote
accepted
The fundamental theorems of formal language theory are that regular expressions, regular grammars, deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs) all describe the same kinds of languages: namely the regular languages. The fact that we can describe these languages in so many completely different ways suggests that there's something natural and important about these languages, in the same way that the equivalence of Turing machines, the lambda calculus and all kinds of other things suggests that the computable languages are natural and important. They're not just an artifact of whatever random decisions the original discoverer made.
Suppose we add a new rule for creating regular expressions: if $R$ is a regular expression, then $pi(R)$ is a regular expression, and it matches every permutation of every string matched by $R$. So, for example, $L(pi(abc)) = {abc, acb, bac, bca, cab, cba}$. The problem is that this breaks the fundamental equivalences described above. $Lbig(pi((ab)^*))big)$ is the language of strings that contain an equal number of $a$s and $b$s and this isn't a regular language. Compare this with, for example, adding a negation or reversal operator to regular expressions, which doesn't change the class of languages that are accepted.
So, to answer the title question, regular expressions can't do permutations and we don't add that ability because then regular expressions wouldn't match regular languages. Having said that, it's possible that "regular expressions with permutations" would also be an interesting class of languages with lots of different characterizations.
But L((ab)*) is no regular language either – so L(perm((ab)*)) can't be one. ( (ab)* is no regular language since there is no kind of memory to remember how many opening "a"s there are, so with a finite number of states you cannot put the same number of "b"s.)
– Asqiir
2 days ago
7
@Asqiir $L((ab)^*)$ is regular because it's the language matched by the given regular expression. You seem to have misunderstood what the language is. It's ${varepsilon, ab, abab, ababab, abababab, dots}$, not ${varepsilon,ab,aabb,aaabbb,aaaabbbb,dots}$. The latter language isn't regular but that's not the language we're talking about.
– David Richerby
2 days ago
3
@Asqiir It is that language because it's what you get by applying any permutation you want to a language in which every string contained the same number of $a$s and $b$s; it's not regular because you can prove that using the pumping lemma.
– David Richerby
2 days ago
1
You are completely right. I missed the point of "putting regular expressions into each other", I only thought about "permutating a fixed word" not "permutating another regex" which of course isn't possible.
– Asqiir
2 days ago
1
Perhaps regular expressions with permutations describe a class of languages with interesting properties, but I've never run into need for the!
operator in practice, and I suppose few people have, as it's easy to implement, and no implementation of extended regular expressions I've seen supports it.
– reinierpost
21 hours ago
|
show 3 more comments
up vote
31
down vote
accepted
up vote
31
down vote
accepted
The fundamental theorems of formal language theory are that regular expressions, regular grammars, deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs) all describe the same kinds of languages: namely the regular languages. The fact that we can describe these languages in so many completely different ways suggests that there's something natural and important about these languages, in the same way that the equivalence of Turing machines, the lambda calculus and all kinds of other things suggests that the computable languages are natural and important. They're not just an artifact of whatever random decisions the original discoverer made.
Suppose we add a new rule for creating regular expressions: if $R$ is a regular expression, then $pi(R)$ is a regular expression, and it matches every permutation of every string matched by $R$. So, for example, $L(pi(abc)) = {abc, acb, bac, bca, cab, cba}$. The problem is that this breaks the fundamental equivalences described above. $Lbig(pi((ab)^*))big)$ is the language of strings that contain an equal number of $a$s and $b$s and this isn't a regular language. Compare this with, for example, adding a negation or reversal operator to regular expressions, which doesn't change the class of languages that are accepted.
So, to answer the title question, regular expressions can't do permutations and we don't add that ability because then regular expressions wouldn't match regular languages. Having said that, it's possible that "regular expressions with permutations" would also be an interesting class of languages with lots of different characterizations.
The fundamental theorems of formal language theory are that regular expressions, regular grammars, deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs) all describe the same kinds of languages: namely the regular languages. The fact that we can describe these languages in so many completely different ways suggests that there's something natural and important about these languages, in the same way that the equivalence of Turing machines, the lambda calculus and all kinds of other things suggests that the computable languages are natural and important. They're not just an artifact of whatever random decisions the original discoverer made.
Suppose we add a new rule for creating regular expressions: if $R$ is a regular expression, then $pi(R)$ is a regular expression, and it matches every permutation of every string matched by $R$. So, for example, $L(pi(abc)) = {abc, acb, bac, bca, cab, cba}$. The problem is that this breaks the fundamental equivalences described above. $Lbig(pi((ab)^*))big)$ is the language of strings that contain an equal number of $a$s and $b$s and this isn't a regular language. Compare this with, for example, adding a negation or reversal operator to regular expressions, which doesn't change the class of languages that are accepted.
So, to answer the title question, regular expressions can't do permutations and we don't add that ability because then regular expressions wouldn't match regular languages. Having said that, it's possible that "regular expressions with permutations" would also be an interesting class of languages with lots of different characterizations.
edited 6 hours ago
Laurel
1034
1034
answered 2 days ago
David Richerby
64.3k1597185
64.3k1597185
But L((ab)*) is no regular language either – so L(perm((ab)*)) can't be one. ( (ab)* is no regular language since there is no kind of memory to remember how many opening "a"s there are, so with a finite number of states you cannot put the same number of "b"s.)
– Asqiir
2 days ago
7
@Asqiir $L((ab)^*)$ is regular because it's the language matched by the given regular expression. You seem to have misunderstood what the language is. It's ${varepsilon, ab, abab, ababab, abababab, dots}$, not ${varepsilon,ab,aabb,aaabbb,aaaabbbb,dots}$. The latter language isn't regular but that's not the language we're talking about.
– David Richerby
2 days ago
3
@Asqiir It is that language because it's what you get by applying any permutation you want to a language in which every string contained the same number of $a$s and $b$s; it's not regular because you can prove that using the pumping lemma.
– David Richerby
2 days ago
1
You are completely right. I missed the point of "putting regular expressions into each other", I only thought about "permutating a fixed word" not "permutating another regex" which of course isn't possible.
– Asqiir
2 days ago
1
Perhaps regular expressions with permutations describe a class of languages with interesting properties, but I've never run into need for the!
operator in practice, and I suppose few people have, as it's easy to implement, and no implementation of extended regular expressions I've seen supports it.
– reinierpost
21 hours ago
|
show 3 more comments
But L((ab)*) is no regular language either – so L(perm((ab)*)) can't be one. ( (ab)* is no regular language since there is no kind of memory to remember how many opening "a"s there are, so with a finite number of states you cannot put the same number of "b"s.)
– Asqiir
2 days ago
7
@Asqiir $L((ab)^*)$ is regular because it's the language matched by the given regular expression. You seem to have misunderstood what the language is. It's ${varepsilon, ab, abab, ababab, abababab, dots}$, not ${varepsilon,ab,aabb,aaabbb,aaaabbbb,dots}$. The latter language isn't regular but that's not the language we're talking about.
– David Richerby
2 days ago
3
@Asqiir It is that language because it's what you get by applying any permutation you want to a language in which every string contained the same number of $a$s and $b$s; it's not regular because you can prove that using the pumping lemma.
– David Richerby
2 days ago
1
You are completely right. I missed the point of "putting regular expressions into each other", I only thought about "permutating a fixed word" not "permutating another regex" which of course isn't possible.
– Asqiir
2 days ago
1
Perhaps regular expressions with permutations describe a class of languages with interesting properties, but I've never run into need for the!
operator in practice, and I suppose few people have, as it's easy to implement, and no implementation of extended regular expressions I've seen supports it.
– reinierpost
21 hours ago
But L((ab)*) is no regular language either – so L(perm((ab)*)) can't be one. ( (ab)* is no regular language since there is no kind of memory to remember how many opening "a"s there are, so with a finite number of states you cannot put the same number of "b"s.)
– Asqiir
2 days ago
But L((ab)*) is no regular language either – so L(perm((ab)*)) can't be one. ( (ab)* is no regular language since there is no kind of memory to remember how many opening "a"s there are, so with a finite number of states you cannot put the same number of "b"s.)
– Asqiir
2 days ago
7
7
@Asqiir $L((ab)^*)$ is regular because it's the language matched by the given regular expression. You seem to have misunderstood what the language is. It's ${varepsilon, ab, abab, ababab, abababab, dots}$, not ${varepsilon,ab,aabb,aaabbb,aaaabbbb,dots}$. The latter language isn't regular but that's not the language we're talking about.
– David Richerby
2 days ago
@Asqiir $L((ab)^*)$ is regular because it's the language matched by the given regular expression. You seem to have misunderstood what the language is. It's ${varepsilon, ab, abab, ababab, abababab, dots}$, not ${varepsilon,ab,aabb,aaabbb,aaaabbbb,dots}$. The latter language isn't regular but that's not the language we're talking about.
– David Richerby
2 days ago
3
3
@Asqiir It is that language because it's what you get by applying any permutation you want to a language in which every string contained the same number of $a$s and $b$s; it's not regular because you can prove that using the pumping lemma.
– David Richerby
2 days ago
@Asqiir It is that language because it's what you get by applying any permutation you want to a language in which every string contained the same number of $a$s and $b$s; it's not regular because you can prove that using the pumping lemma.
– David Richerby
2 days ago
1
1
You are completely right. I missed the point of "putting regular expressions into each other", I only thought about "permutating a fixed word" not "permutating another regex" which of course isn't possible.
– Asqiir
2 days ago
You are completely right. I missed the point of "putting regular expressions into each other", I only thought about "permutating a fixed word" not "permutating another regex" which of course isn't possible.
– Asqiir
2 days ago
1
1
Perhaps regular expressions with permutations describe a class of languages with interesting properties, but I've never run into need for the
!
operator in practice, and I suppose few people have, as it's easy to implement, and no implementation of extended regular expressions I've seen supports it.– reinierpost
21 hours ago
Perhaps regular expressions with permutations describe a class of languages with interesting properties, but I've never run into need for the
!
operator in practice, and I suppose few people have, as it's easy to implement, and no implementation of extended regular expressions I've seen supports it.– reinierpost
21 hours ago
|
show 3 more comments
up vote
13
down vote
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
Your "proof" only looked at permutations of single words, which are finite languages.
Every finite language is regular (e.g. just by listing all of the members with a |
inbetween), but there are infinite regular languages (and those are generally the more interesting ones).
As soon as you get a regular expression (or grammar/automaton) which accepts an infinite language (i.e. an expression with the *
operator, or an automaton with a loop), your construction doesn't work anymore (you get an infinite grammar/automaton).
The answer by David Richerby provided an example of a regular language whose permutation language is not regular anymore – all such examples are infinite languages.
add a comment |
up vote
13
down vote
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
Your "proof" only looked at permutations of single words, which are finite languages.
Every finite language is regular (e.g. just by listing all of the members with a |
inbetween), but there are infinite regular languages (and those are generally the more interesting ones).
As soon as you get a regular expression (or grammar/automaton) which accepts an infinite language (i.e. an expression with the *
operator, or an automaton with a loop), your construction doesn't work anymore (you get an infinite grammar/automaton).
The answer by David Richerby provided an example of a regular language whose permutation language is not regular anymore – all such examples are infinite languages.
add a comment |
up vote
13
down vote
up vote
13
down vote
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
Your "proof" only looked at permutations of single words, which are finite languages.
Every finite language is regular (e.g. just by listing all of the members with a |
inbetween), but there are infinite regular languages (and those are generally the more interesting ones).
As soon as you get a regular expression (or grammar/automaton) which accepts an infinite language (i.e. an expression with the *
operator, or an automaton with a loop), your construction doesn't work anymore (you get an infinite grammar/automaton).
The answer by David Richerby provided an example of a regular language whose permutation language is not regular anymore – all such examples are infinite languages.
So my question is:
- (Why) Is my proof wrong?
- If it is right: Why is there no easy way to express permutations?
Your "proof" only looked at permutations of single words, which are finite languages.
Every finite language is regular (e.g. just by listing all of the members with a |
inbetween), but there are infinite regular languages (and those are generally the more interesting ones).
As soon as you get a regular expression (or grammar/automaton) which accepts an infinite language (i.e. an expression with the *
operator, or an automaton with a loop), your construction doesn't work anymore (you get an infinite grammar/automaton).
The answer by David Richerby provided an example of a regular language whose permutation language is not regular anymore – all such examples are infinite languages.
answered yesterday
Paŭlo Ebermann
47329
47329
add a comment |
add a comment |
up vote
7
down vote
Let $Sigma$ be an alphabet of size $n$. A regular expression describing all permutations of $Sigma$ must have exponential size. This follows from Theorem 9 in Lower bounds for context-free grammars, which gives an exponential lower bound on the much stronger model of context-free grammars (a regular expression of size $m$ can be converted to a context-free grammar of size $O(m)$).
So in some sense, there is no succinct way to specify all permutations of a word.
Here is a simple proof for a $tildeOmega(2^n)$ lower bound on the size of a regular expression for the language of all permutations of an alphabet $Sigma$ of size $n$. Since a regular expression of size $m$ can be converted to an NFA with $O(m)$ states, it suffices to give a lower bound on NFAs.
We will use the fooling set method, which we now describe. Let $L$ be a language, and let $(x_i,y_i)_{1 leq i leq N}$ be a collection of pairs of words such that:
$x_iy_i in L$.- If $i neq j$ then either $x_i y_j notin L$ or $x_j y_i notin L$.
Then every NFA for $L$ contains at least $N$ states. Indeed, consider any NFA for $L$. For each $i$, consider some accepting computation of $x_iy_i$, and let $q_i$ be the state that the NFA is in after reading $x_i$ in this accepting computation. I claim that $q_i neq q_j$ for $i neq j$. Indeed, if $q_i = q_j$ then a "cut-and-paste" argument shows that both $x_iy_j$ and $x_jy_i$ are in $L$, contrary to assumption.
Now we can prove the lower bound on NFAs for the language $L_n$ of all permutations of the symbols $sigma_1,ldots,sigma_n$; we assume for simplicity of exposition that $n$ is even. For each subset $S$ of $sigma_1,ldots,sigma_n$ of size $n/2$, let $x_S$ consist of all symbols in $S$ in some arbitrary order, and let $y_S$ consist of all symbols not in $S$ in some arbitrary order. Clearly $x_Sy_S in L_n$. Conversely, if $S neq T$ then $x_S y_T notin L_n$. This shows that every NFA for $L_n$ contains at least $binom{n}{n/2} = Omega(2^n/sqrt{n})$ many states.
Does this mean 1) in theory it would be possible to let »abc« match all {abc, acb, bac, bca, cab, cba} but it's just not efficient and would make them too slow since »abc« would expand exponentially to (abc|acb|bac|bca|cab|cba)? or 2) The kind of automaton I need is not able to specify all permutations for a given word?
– Asqiir
2 days ago
1
Here is a regular expression that matches all permutations of $abc$: $abc+acd+bac+bca+cab+cba$. You can easily convert this to a DFA having $1+3+6+6+1 = 17$ states. The same works for $abcdefghij$.
– Yuval Filmus
2 days ago
1
What I understood: In theory, regular languages are able to accept permutations (so are regular expressions). There is just no "simple way" to write "permutation of abc" like »abc«. (For whatever reasons.)
– Asqiir
2 days ago
1
Yes, that's a good summary. I'll see if I can come up with a simpler argument for regular expressions.
– Yuval Filmus
2 days ago
I added a simple argument using the fooling set method.
– Yuval Filmus
2 days ago
|
show 1 more comment
up vote
7
down vote
Let $Sigma$ be an alphabet of size $n$. A regular expression describing all permutations of $Sigma$ must have exponential size. This follows from Theorem 9 in Lower bounds for context-free grammars, which gives an exponential lower bound on the much stronger model of context-free grammars (a regular expression of size $m$ can be converted to a context-free grammar of size $O(m)$).
So in some sense, there is no succinct way to specify all permutations of a word.
Here is a simple proof for a $tildeOmega(2^n)$ lower bound on the size of a regular expression for the language of all permutations of an alphabet $Sigma$ of size $n$. Since a regular expression of size $m$ can be converted to an NFA with $O(m)$ states, it suffices to give a lower bound on NFAs.
We will use the fooling set method, which we now describe. Let $L$ be a language, and let $(x_i,y_i)_{1 leq i leq N}$ be a collection of pairs of words such that:
$x_iy_i in L$.- If $i neq j$ then either $x_i y_j notin L$ or $x_j y_i notin L$.
Then every NFA for $L$ contains at least $N$ states. Indeed, consider any NFA for $L$. For each $i$, consider some accepting computation of $x_iy_i$, and let $q_i$ be the state that the NFA is in after reading $x_i$ in this accepting computation. I claim that $q_i neq q_j$ for $i neq j$. Indeed, if $q_i = q_j$ then a "cut-and-paste" argument shows that both $x_iy_j$ and $x_jy_i$ are in $L$, contrary to assumption.
Now we can prove the lower bound on NFAs for the language $L_n$ of all permutations of the symbols $sigma_1,ldots,sigma_n$; we assume for simplicity of exposition that $n$ is even. For each subset $S$ of $sigma_1,ldots,sigma_n$ of size $n/2$, let $x_S$ consist of all symbols in $S$ in some arbitrary order, and let $y_S$ consist of all symbols not in $S$ in some arbitrary order. Clearly $x_Sy_S in L_n$. Conversely, if $S neq T$ then $x_S y_T notin L_n$. This shows that every NFA for $L_n$ contains at least $binom{n}{n/2} = Omega(2^n/sqrt{n})$ many states.
Does this mean 1) in theory it would be possible to let »abc« match all {abc, acb, bac, bca, cab, cba} but it's just not efficient and would make them too slow since »abc« would expand exponentially to (abc|acb|bac|bca|cab|cba)? or 2) The kind of automaton I need is not able to specify all permutations for a given word?
– Asqiir
2 days ago
1
Here is a regular expression that matches all permutations of $abc$: $abc+acd+bac+bca+cab+cba$. You can easily convert this to a DFA having $1+3+6+6+1 = 17$ states. The same works for $abcdefghij$.
– Yuval Filmus
2 days ago
1
What I understood: In theory, regular languages are able to accept permutations (so are regular expressions). There is just no "simple way" to write "permutation of abc" like »abc«. (For whatever reasons.)
– Asqiir
2 days ago
1
Yes, that's a good summary. I'll see if I can come up with a simpler argument for regular expressions.
– Yuval Filmus
2 days ago
I added a simple argument using the fooling set method.
– Yuval Filmus
2 days ago
|
show 1 more comment
up vote
7
down vote
up vote
7
down vote
Let $Sigma$ be an alphabet of size $n$. A regular expression describing all permutations of $Sigma$ must have exponential size. This follows from Theorem 9 in Lower bounds for context-free grammars, which gives an exponential lower bound on the much stronger model of context-free grammars (a regular expression of size $m$ can be converted to a context-free grammar of size $O(m)$).
So in some sense, there is no succinct way to specify all permutations of a word.
Here is a simple proof for a $tildeOmega(2^n)$ lower bound on the size of a regular expression for the language of all permutations of an alphabet $Sigma$ of size $n$. Since a regular expression of size $m$ can be converted to an NFA with $O(m)$ states, it suffices to give a lower bound on NFAs.
We will use the fooling set method, which we now describe. Let $L$ be a language, and let $(x_i,y_i)_{1 leq i leq N}$ be a collection of pairs of words such that:
$x_iy_i in L$.- If $i neq j$ then either $x_i y_j notin L$ or $x_j y_i notin L$.
Then every NFA for $L$ contains at least $N$ states. Indeed, consider any NFA for $L$. For each $i$, consider some accepting computation of $x_iy_i$, and let $q_i$ be the state that the NFA is in after reading $x_i$ in this accepting computation. I claim that $q_i neq q_j$ for $i neq j$. Indeed, if $q_i = q_j$ then a "cut-and-paste" argument shows that both $x_iy_j$ and $x_jy_i$ are in $L$, contrary to assumption.
Now we can prove the lower bound on NFAs for the language $L_n$ of all permutations of the symbols $sigma_1,ldots,sigma_n$; we assume for simplicity of exposition that $n$ is even. For each subset $S$ of $sigma_1,ldots,sigma_n$ of size $n/2$, let $x_S$ consist of all symbols in $S$ in some arbitrary order, and let $y_S$ consist of all symbols not in $S$ in some arbitrary order. Clearly $x_Sy_S in L_n$. Conversely, if $S neq T$ then $x_S y_T notin L_n$. This shows that every NFA for $L_n$ contains at least $binom{n}{n/2} = Omega(2^n/sqrt{n})$ many states.
Let $Sigma$ be an alphabet of size $n$. A regular expression describing all permutations of $Sigma$ must have exponential size. This follows from Theorem 9 in Lower bounds for context-free grammars, which gives an exponential lower bound on the much stronger model of context-free grammars (a regular expression of size $m$ can be converted to a context-free grammar of size $O(m)$).
So in some sense, there is no succinct way to specify all permutations of a word.
Here is a simple proof for a $tildeOmega(2^n)$ lower bound on the size of a regular expression for the language of all permutations of an alphabet $Sigma$ of size $n$. Since a regular expression of size $m$ can be converted to an NFA with $O(m)$ states, it suffices to give a lower bound on NFAs.
We will use the fooling set method, which we now describe. Let $L$ be a language, and let $(x_i,y_i)_{1 leq i leq N}$ be a collection of pairs of words such that:
$x_iy_i in L$.- If $i neq j$ then either $x_i y_j notin L$ or $x_j y_i notin L$.
Then every NFA for $L$ contains at least $N$ states. Indeed, consider any NFA for $L$. For each $i$, consider some accepting computation of $x_iy_i$, and let $q_i$ be the state that the NFA is in after reading $x_i$ in this accepting computation. I claim that $q_i neq q_j$ for $i neq j$. Indeed, if $q_i = q_j$ then a "cut-and-paste" argument shows that both $x_iy_j$ and $x_jy_i$ are in $L$, contrary to assumption.
Now we can prove the lower bound on NFAs for the language $L_n$ of all permutations of the symbols $sigma_1,ldots,sigma_n$; we assume for simplicity of exposition that $n$ is even. For each subset $S$ of $sigma_1,ldots,sigma_n$ of size $n/2$, let $x_S$ consist of all symbols in $S$ in some arbitrary order, and let $y_S$ consist of all symbols not in $S$ in some arbitrary order. Clearly $x_Sy_S in L_n$. Conversely, if $S neq T$ then $x_S y_T notin L_n$. This shows that every NFA for $L_n$ contains at least $binom{n}{n/2} = Omega(2^n/sqrt{n})$ many states.
edited 2 days ago
answered 2 days ago
Yuval Filmus
187k12176337
187k12176337
Does this mean 1) in theory it would be possible to let »abc« match all {abc, acb, bac, bca, cab, cba} but it's just not efficient and would make them too slow since »abc« would expand exponentially to (abc|acb|bac|bca|cab|cba)? or 2) The kind of automaton I need is not able to specify all permutations for a given word?
– Asqiir
2 days ago
1
Here is a regular expression that matches all permutations of $abc$: $abc+acd+bac+bca+cab+cba$. You can easily convert this to a DFA having $1+3+6+6+1 = 17$ states. The same works for $abcdefghij$.
– Yuval Filmus
2 days ago
1
What I understood: In theory, regular languages are able to accept permutations (so are regular expressions). There is just no "simple way" to write "permutation of abc" like »abc«. (For whatever reasons.)
– Asqiir
2 days ago
1
Yes, that's a good summary. I'll see if I can come up with a simpler argument for regular expressions.
– Yuval Filmus
2 days ago
I added a simple argument using the fooling set method.
– Yuval Filmus
2 days ago
|
show 1 more comment
Does this mean 1) in theory it would be possible to let »abc« match all {abc, acb, bac, bca, cab, cba} but it's just not efficient and would make them too slow since »abc« would expand exponentially to (abc|acb|bac|bca|cab|cba)? or 2) The kind of automaton I need is not able to specify all permutations for a given word?
– Asqiir
2 days ago
1
Here is a regular expression that matches all permutations of $abc$: $abc+acd+bac+bca+cab+cba$. You can easily convert this to a DFA having $1+3+6+6+1 = 17$ states. The same works for $abcdefghij$.
– Yuval Filmus
2 days ago
1
What I understood: In theory, regular languages are able to accept permutations (so are regular expressions). There is just no "simple way" to write "permutation of abc" like »abc«. (For whatever reasons.)
– Asqiir
2 days ago
1
Yes, that's a good summary. I'll see if I can come up with a simpler argument for regular expressions.
– Yuval Filmus
2 days ago
I added a simple argument using the fooling set method.
– Yuval Filmus
2 days ago
Does this mean 1) in theory it would be possible to let »abc« match all {abc, acb, bac, bca, cab, cba} but it's just not efficient and would make them too slow since »abc« would expand exponentially to (abc|acb|bac|bca|cab|cba)? or 2) The kind of automaton I need is not able to specify all permutations for a given word?
– Asqiir
2 days ago
Does this mean 1) in theory it would be possible to let »abc« match all {abc, acb, bac, bca, cab, cba} but it's just not efficient and would make them too slow since »abc« would expand exponentially to (abc|acb|bac|bca|cab|cba)? or 2) The kind of automaton I need is not able to specify all permutations for a given word?
– Asqiir
2 days ago
1
1
Here is a regular expression that matches all permutations of $abc$: $abc+acd+bac+bca+cab+cba$. You can easily convert this to a DFA having $1+3+6+6+1 = 17$ states. The same works for $abcdefghij$.
– Yuval Filmus
2 days ago
Here is a regular expression that matches all permutations of $abc$: $abc+acd+bac+bca+cab+cba$. You can easily convert this to a DFA having $1+3+6+6+1 = 17$ states. The same works for $abcdefghij$.
– Yuval Filmus
2 days ago
1
1
What I understood: In theory, regular languages are able to accept permutations (so are regular expressions). There is just no "simple way" to write "permutation of abc" like »abc«. (For whatever reasons.)
– Asqiir
2 days ago
What I understood: In theory, regular languages are able to accept permutations (so are regular expressions). There is just no "simple way" to write "permutation of abc" like »abc«. (For whatever reasons.)
– Asqiir
2 days ago
1
1
Yes, that's a good summary. I'll see if I can come up with a simpler argument for regular expressions.
– Yuval Filmus
2 days ago
Yes, that's a good summary. I'll see if I can come up with a simpler argument for regular expressions.
– Yuval Filmus
2 days ago
I added a simple argument using the fooling set method.
– Yuval Filmus
2 days ago
I added a simple argument using the fooling set method.
– Yuval Filmus
2 days ago
|
show 1 more comment
Asqiir is a new contributor. Be nice, and check out our Code of Conduct.
Asqiir is a new contributor. Be nice, and check out our Code of Conduct.
Asqiir is a new contributor. Be nice, and check out our Code of Conduct.
Asqiir is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f100206%2fwhy-is-there-no-permutation-in-regexes-even-if-regular-languages-seem-to-be-ab%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
9
You can list all permutations of your word with a regular expression. The resulting expression will be quite large, but will definitely be a regular expression.
– Yuval Filmus
2 days ago
6
I suggest ignoring all answers about Theory of Computation on stackoverflow. This isn't that site's specialty.
– Yuval Filmus
2 days ago
The answer on your linked page here - stackoverflow.com/a/3102205/6936386 - seems to be easily adaptable and not too complicated:
^(a()|a()|b()|c()){4}2345$
seems to work (see regex101.com/r/9URPpg/4/tests).– boboquack
yesterday
5
@boboquack That isn't a regular expression in the sense in which the term is used in computer science. (This sort of thing is exactly why Yuval suggests not trusting Stack Overflow answers about theoretical CS.)
– David Richerby
21 hours ago