How to synthesise compiler testing data?












1














I am writing a simple compiler as a school work. I am looking for an automated approach to generate both positive and negative testing data to test my compiler, given the formal grammar and other specification. The language I am dealing with is of mediate size with 38 or so non-terminals. For the sake of illustration, here is a snapshot of the grammar:



program: const_decl* declaration* ENDMARKER

# statement
stmt: flow_stmt | '{' stmt* '}' | NAME [stmt_trailer] ';' | ';'

stmt_trailer: arglist | ['[' expr ']'] '=' expr

flow_stmt: if_stmt | for_stmt | while_stmt | read_stmt ';' | write_stmt ';' | return_stmt ';'

return_stmt: 'return' ['(' expr ')']

if_stmt: 'if' '(' condition ')' stmt ['else' stmt]

condition: expr ('<'|'<='|'>'|'>='|'!='|'==') expr | expr

for_stmt: ('for' '(' NAME '=' expr ';' condition ';'
NAME '=' NAME ('+'|'-') NUMBER ')' stmt)


Is there any tools to generate input file with the help of the grammar? The hand-written tests are too tedious or too weak to discover problems. An example of this language here:



void main() {
int N;
int temp;
int i, j;
int array_size;

reset_heap;
scanf(N);

for (i = 0; i < N; i = i + 1) {
scanf(array_size);
if (array_size > max_heap_size) {
printf("array_size exceeds max_heap_size");
} else {
for (j = 0; j < array_size; j = j + 1) {
scanf(temp);
heap[j] = temp;
}
heap_sort(array_size);
print_heap(array_size);
}
}
}


Generating controllable testing data automatically can save the days. Given the simplicity of the language, there must be some way to effectively do this. Any pointer and insight is greatly appreciated.










share|improve this question






















  • Rule expansion with randomization and thresholds?
    – cgsdfc
    Nov 19 '18 at 13:54
















1














I am writing a simple compiler as a school work. I am looking for an automated approach to generate both positive and negative testing data to test my compiler, given the formal grammar and other specification. The language I am dealing with is of mediate size with 38 or so non-terminals. For the sake of illustration, here is a snapshot of the grammar:



program: const_decl* declaration* ENDMARKER

# statement
stmt: flow_stmt | '{' stmt* '}' | NAME [stmt_trailer] ';' | ';'

stmt_trailer: arglist | ['[' expr ']'] '=' expr

flow_stmt: if_stmt | for_stmt | while_stmt | read_stmt ';' | write_stmt ';' | return_stmt ';'

return_stmt: 'return' ['(' expr ')']

if_stmt: 'if' '(' condition ')' stmt ['else' stmt]

condition: expr ('<'|'<='|'>'|'>='|'!='|'==') expr | expr

for_stmt: ('for' '(' NAME '=' expr ';' condition ';'
NAME '=' NAME ('+'|'-') NUMBER ')' stmt)


Is there any tools to generate input file with the help of the grammar? The hand-written tests are too tedious or too weak to discover problems. An example of this language here:



void main() {
int N;
int temp;
int i, j;
int array_size;

reset_heap;
scanf(N);

for (i = 0; i < N; i = i + 1) {
scanf(array_size);
if (array_size > max_heap_size) {
printf("array_size exceeds max_heap_size");
} else {
for (j = 0; j < array_size; j = j + 1) {
scanf(temp);
heap[j] = temp;
}
heap_sort(array_size);
print_heap(array_size);
}
}
}


Generating controllable testing data automatically can save the days. Given the simplicity of the language, there must be some way to effectively do this. Any pointer and insight is greatly appreciated.










share|improve this question






















  • Rule expansion with randomization and thresholds?
    – cgsdfc
    Nov 19 '18 at 13:54














1












1








1







I am writing a simple compiler as a school work. I am looking for an automated approach to generate both positive and negative testing data to test my compiler, given the formal grammar and other specification. The language I am dealing with is of mediate size with 38 or so non-terminals. For the sake of illustration, here is a snapshot of the grammar:



program: const_decl* declaration* ENDMARKER

# statement
stmt: flow_stmt | '{' stmt* '}' | NAME [stmt_trailer] ';' | ';'

stmt_trailer: arglist | ['[' expr ']'] '=' expr

flow_stmt: if_stmt | for_stmt | while_stmt | read_stmt ';' | write_stmt ';' | return_stmt ';'

return_stmt: 'return' ['(' expr ')']

if_stmt: 'if' '(' condition ')' stmt ['else' stmt]

condition: expr ('<'|'<='|'>'|'>='|'!='|'==') expr | expr

for_stmt: ('for' '(' NAME '=' expr ';' condition ';'
NAME '=' NAME ('+'|'-') NUMBER ')' stmt)


Is there any tools to generate input file with the help of the grammar? The hand-written tests are too tedious or too weak to discover problems. An example of this language here:



void main() {
int N;
int temp;
int i, j;
int array_size;

reset_heap;
scanf(N);

for (i = 0; i < N; i = i + 1) {
scanf(array_size);
if (array_size > max_heap_size) {
printf("array_size exceeds max_heap_size");
} else {
for (j = 0; j < array_size; j = j + 1) {
scanf(temp);
heap[j] = temp;
}
heap_sort(array_size);
print_heap(array_size);
}
}
}


Generating controllable testing data automatically can save the days. Given the simplicity of the language, there must be some way to effectively do this. Any pointer and insight is greatly appreciated.










share|improve this question













I am writing a simple compiler as a school work. I am looking for an automated approach to generate both positive and negative testing data to test my compiler, given the formal grammar and other specification. The language I am dealing with is of mediate size with 38 or so non-terminals. For the sake of illustration, here is a snapshot of the grammar:



program: const_decl* declaration* ENDMARKER

# statement
stmt: flow_stmt | '{' stmt* '}' | NAME [stmt_trailer] ';' | ';'

stmt_trailer: arglist | ['[' expr ']'] '=' expr

flow_stmt: if_stmt | for_stmt | while_stmt | read_stmt ';' | write_stmt ';' | return_stmt ';'

return_stmt: 'return' ['(' expr ')']

if_stmt: 'if' '(' condition ')' stmt ['else' stmt]

condition: expr ('<'|'<='|'>'|'>='|'!='|'==') expr | expr

for_stmt: ('for' '(' NAME '=' expr ';' condition ';'
NAME '=' NAME ('+'|'-') NUMBER ')' stmt)


Is there any tools to generate input file with the help of the grammar? The hand-written tests are too tedious or too weak to discover problems. An example of this language here:



void main() {
int N;
int temp;
int i, j;
int array_size;

reset_heap;
scanf(N);

for (i = 0; i < N; i = i + 1) {
scanf(array_size);
if (array_size > max_heap_size) {
printf("array_size exceeds max_heap_size");
} else {
for (j = 0; j < array_size; j = j + 1) {
scanf(temp);
heap[j] = temp;
}
heap_sort(array_size);
print_heap(array_size);
}
}
}


Generating controllable testing data automatically can save the days. Given the simplicity of the language, there must be some way to effectively do this. Any pointer and insight is greatly appreciated.







testing compiler-construction






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 19 '18 at 13:51









cgsdfc

8611




8611












  • Rule expansion with randomization and thresholds?
    – cgsdfc
    Nov 19 '18 at 13:54


















  • Rule expansion with randomization and thresholds?
    – cgsdfc
    Nov 19 '18 at 13:54
















Rule expansion with randomization and thresholds?
– cgsdfc
Nov 19 '18 at 13:54




Rule expansion with randomization and thresholds?
– cgsdfc
Nov 19 '18 at 13:54












1 Answer
1






active

oldest

votes


















1















Any pointer and insight is greatly appreciated.




This should have the subtopic of How to avoid combinatorial explosion when generating test data.



While I would not be surprised if there are tools to do this having had the same need to generate test data for grammars I have created a few one off applications.



One of the best series of articles I have found on this is by Eric Lippert, Every Binary Tree There Is, think BNF converted to binary operators then converted to AST when you read tree. However he uses Catalan (every branch has two leaves) and when I wrote my app I preferred Motzikin (a branch can have one or two leaves).



Also he did his in C# with LINQ and I did mine in Prolog using DCG.



Generating the data based on the BNF or DCG is not hard, the real trick is to limit the area of expansion and the size of the expansion and to inject bad data.



By area of expansion lets say you want to test nested if statements three levels deep, but have to have valid code that compiles. Obviously you need the boilerplate code to make it compile then you start changing the deeply nested if by adding or removing the else clause. So you need to put in constraints so that the boilerplate code is constant and the testing part is variable.



By size of expansion lets say that you want to test conditional expressions. You can easily calculate that if you have many operators and you want to test them all in combinations you soon run into combinatorial explosion. The trick is to ensure you test deep enough and with enough breadth but not every combination. Again the judicial use of constraints helps.



So the point of all of this is that you start with a tool that takes in the BNF and generates valid code. Then you modify the BNF to add constraints and modify the generator to understand the constraints to generate the code examples.



Then you modify the BNF for invalid data and likewise the generator to understand those rules.



After that is working you can then start layering on levels of automation.



If you do go this route and decide that you will have to learn Prolog, take a look at Mercury first. I have not done this with Mercury, but if I do it again Mercury is high on the list.



While my actual code is not public, this and this is the closest to it that is public.



Along the way I had some fun with it in Code Golf.



When generating terminals such as reserved words or values for types, you can use predefined list with both valid and invalid data, e.g. for if if the language is case sensitive I would include in the list if,If,IF,iF, etc. For value types such as unsigned byte I would include -1,0,255 and 256.



When I was testing basic binary math expressions with +, -, * and ^ I generated all the test for with three basic numbers -2,-1,0,1, and 2. I thought it would be useless since I already had hundreds of test cases, but since it only took a few minutes to generate all of the test cases and several hours to run it, to my surprise it found a pattern I did not cover. The point here is that contrary what most people say about having to many test cases, remember that it is only time on a computer by changing a few constraints so do the large number of test.






share|improve this answer























  • Of interest: How to identify wasteful representations of Prolog terms
    – Guy Coder
    Nov 19 '18 at 22:34










  • Great answer. I need some time to digest.
    – cgsdfc
    Nov 20 '18 at 1:51











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53376098%2fhow-to-synthesise-compiler-testing-data%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









1















Any pointer and insight is greatly appreciated.




This should have the subtopic of How to avoid combinatorial explosion when generating test data.



While I would not be surprised if there are tools to do this having had the same need to generate test data for grammars I have created a few one off applications.



One of the best series of articles I have found on this is by Eric Lippert, Every Binary Tree There Is, think BNF converted to binary operators then converted to AST when you read tree. However he uses Catalan (every branch has two leaves) and when I wrote my app I preferred Motzikin (a branch can have one or two leaves).



Also he did his in C# with LINQ and I did mine in Prolog using DCG.



Generating the data based on the BNF or DCG is not hard, the real trick is to limit the area of expansion and the size of the expansion and to inject bad data.



By area of expansion lets say you want to test nested if statements three levels deep, but have to have valid code that compiles. Obviously you need the boilerplate code to make it compile then you start changing the deeply nested if by adding or removing the else clause. So you need to put in constraints so that the boilerplate code is constant and the testing part is variable.



By size of expansion lets say that you want to test conditional expressions. You can easily calculate that if you have many operators and you want to test them all in combinations you soon run into combinatorial explosion. The trick is to ensure you test deep enough and with enough breadth but not every combination. Again the judicial use of constraints helps.



So the point of all of this is that you start with a tool that takes in the BNF and generates valid code. Then you modify the BNF to add constraints and modify the generator to understand the constraints to generate the code examples.



Then you modify the BNF for invalid data and likewise the generator to understand those rules.



After that is working you can then start layering on levels of automation.



If you do go this route and decide that you will have to learn Prolog, take a look at Mercury first. I have not done this with Mercury, but if I do it again Mercury is high on the list.



While my actual code is not public, this and this is the closest to it that is public.



Along the way I had some fun with it in Code Golf.



When generating terminals such as reserved words or values for types, you can use predefined list with both valid and invalid data, e.g. for if if the language is case sensitive I would include in the list if,If,IF,iF, etc. For value types such as unsigned byte I would include -1,0,255 and 256.



When I was testing basic binary math expressions with +, -, * and ^ I generated all the test for with three basic numbers -2,-1,0,1, and 2. I thought it would be useless since I already had hundreds of test cases, but since it only took a few minutes to generate all of the test cases and several hours to run it, to my surprise it found a pattern I did not cover. The point here is that contrary what most people say about having to many test cases, remember that it is only time on a computer by changing a few constraints so do the large number of test.






share|improve this answer























  • Of interest: How to identify wasteful representations of Prolog terms
    – Guy Coder
    Nov 19 '18 at 22:34










  • Great answer. I need some time to digest.
    – cgsdfc
    Nov 20 '18 at 1:51
















1















Any pointer and insight is greatly appreciated.




This should have the subtopic of How to avoid combinatorial explosion when generating test data.



While I would not be surprised if there are tools to do this having had the same need to generate test data for grammars I have created a few one off applications.



One of the best series of articles I have found on this is by Eric Lippert, Every Binary Tree There Is, think BNF converted to binary operators then converted to AST when you read tree. However he uses Catalan (every branch has two leaves) and when I wrote my app I preferred Motzikin (a branch can have one or two leaves).



Also he did his in C# with LINQ and I did mine in Prolog using DCG.



Generating the data based on the BNF or DCG is not hard, the real trick is to limit the area of expansion and the size of the expansion and to inject bad data.



By area of expansion lets say you want to test nested if statements three levels deep, but have to have valid code that compiles. Obviously you need the boilerplate code to make it compile then you start changing the deeply nested if by adding or removing the else clause. So you need to put in constraints so that the boilerplate code is constant and the testing part is variable.



By size of expansion lets say that you want to test conditional expressions. You can easily calculate that if you have many operators and you want to test them all in combinations you soon run into combinatorial explosion. The trick is to ensure you test deep enough and with enough breadth but not every combination. Again the judicial use of constraints helps.



So the point of all of this is that you start with a tool that takes in the BNF and generates valid code. Then you modify the BNF to add constraints and modify the generator to understand the constraints to generate the code examples.



Then you modify the BNF for invalid data and likewise the generator to understand those rules.



After that is working you can then start layering on levels of automation.



If you do go this route and decide that you will have to learn Prolog, take a look at Mercury first. I have not done this with Mercury, but if I do it again Mercury is high on the list.



While my actual code is not public, this and this is the closest to it that is public.



Along the way I had some fun with it in Code Golf.



When generating terminals such as reserved words or values for types, you can use predefined list with both valid and invalid data, e.g. for if if the language is case sensitive I would include in the list if,If,IF,iF, etc. For value types such as unsigned byte I would include -1,0,255 and 256.



When I was testing basic binary math expressions with +, -, * and ^ I generated all the test for with three basic numbers -2,-1,0,1, and 2. I thought it would be useless since I already had hundreds of test cases, but since it only took a few minutes to generate all of the test cases and several hours to run it, to my surprise it found a pattern I did not cover. The point here is that contrary what most people say about having to many test cases, remember that it is only time on a computer by changing a few constraints so do the large number of test.






share|improve this answer























  • Of interest: How to identify wasteful representations of Prolog terms
    – Guy Coder
    Nov 19 '18 at 22:34










  • Great answer. I need some time to digest.
    – cgsdfc
    Nov 20 '18 at 1:51














1












1








1







Any pointer and insight is greatly appreciated.




This should have the subtopic of How to avoid combinatorial explosion when generating test data.



While I would not be surprised if there are tools to do this having had the same need to generate test data for grammars I have created a few one off applications.



One of the best series of articles I have found on this is by Eric Lippert, Every Binary Tree There Is, think BNF converted to binary operators then converted to AST when you read tree. However he uses Catalan (every branch has two leaves) and when I wrote my app I preferred Motzikin (a branch can have one or two leaves).



Also he did his in C# with LINQ and I did mine in Prolog using DCG.



Generating the data based on the BNF or DCG is not hard, the real trick is to limit the area of expansion and the size of the expansion and to inject bad data.



By area of expansion lets say you want to test nested if statements three levels deep, but have to have valid code that compiles. Obviously you need the boilerplate code to make it compile then you start changing the deeply nested if by adding or removing the else clause. So you need to put in constraints so that the boilerplate code is constant and the testing part is variable.



By size of expansion lets say that you want to test conditional expressions. You can easily calculate that if you have many operators and you want to test them all in combinations you soon run into combinatorial explosion. The trick is to ensure you test deep enough and with enough breadth but not every combination. Again the judicial use of constraints helps.



So the point of all of this is that you start with a tool that takes in the BNF and generates valid code. Then you modify the BNF to add constraints and modify the generator to understand the constraints to generate the code examples.



Then you modify the BNF for invalid data and likewise the generator to understand those rules.



After that is working you can then start layering on levels of automation.



If you do go this route and decide that you will have to learn Prolog, take a look at Mercury first. I have not done this with Mercury, but if I do it again Mercury is high on the list.



While my actual code is not public, this and this is the closest to it that is public.



Along the way I had some fun with it in Code Golf.



When generating terminals such as reserved words or values for types, you can use predefined list with both valid and invalid data, e.g. for if if the language is case sensitive I would include in the list if,If,IF,iF, etc. For value types such as unsigned byte I would include -1,0,255 and 256.



When I was testing basic binary math expressions with +, -, * and ^ I generated all the test for with three basic numbers -2,-1,0,1, and 2. I thought it would be useless since I already had hundreds of test cases, but since it only took a few minutes to generate all of the test cases and several hours to run it, to my surprise it found a pattern I did not cover. The point here is that contrary what most people say about having to many test cases, remember that it is only time on a computer by changing a few constraints so do the large number of test.






share|improve this answer















Any pointer and insight is greatly appreciated.




This should have the subtopic of How to avoid combinatorial explosion when generating test data.



While I would not be surprised if there are tools to do this having had the same need to generate test data for grammars I have created a few one off applications.



One of the best series of articles I have found on this is by Eric Lippert, Every Binary Tree There Is, think BNF converted to binary operators then converted to AST when you read tree. However he uses Catalan (every branch has two leaves) and when I wrote my app I preferred Motzikin (a branch can have one or two leaves).



Also he did his in C# with LINQ and I did mine in Prolog using DCG.



Generating the data based on the BNF or DCG is not hard, the real trick is to limit the area of expansion and the size of the expansion and to inject bad data.



By area of expansion lets say you want to test nested if statements three levels deep, but have to have valid code that compiles. Obviously you need the boilerplate code to make it compile then you start changing the deeply nested if by adding or removing the else clause. So you need to put in constraints so that the boilerplate code is constant and the testing part is variable.



By size of expansion lets say that you want to test conditional expressions. You can easily calculate that if you have many operators and you want to test them all in combinations you soon run into combinatorial explosion. The trick is to ensure you test deep enough and with enough breadth but not every combination. Again the judicial use of constraints helps.



So the point of all of this is that you start with a tool that takes in the BNF and generates valid code. Then you modify the BNF to add constraints and modify the generator to understand the constraints to generate the code examples.



Then you modify the BNF for invalid data and likewise the generator to understand those rules.



After that is working you can then start layering on levels of automation.



If you do go this route and decide that you will have to learn Prolog, take a look at Mercury first. I have not done this with Mercury, but if I do it again Mercury is high on the list.



While my actual code is not public, this and this is the closest to it that is public.



Along the way I had some fun with it in Code Golf.



When generating terminals such as reserved words or values for types, you can use predefined list with both valid and invalid data, e.g. for if if the language is case sensitive I would include in the list if,If,IF,iF, etc. For value types such as unsigned byte I would include -1,0,255 and 256.



When I was testing basic binary math expressions with +, -, * and ^ I generated all the test for with three basic numbers -2,-1,0,1, and 2. I thought it would be useless since I already had hundreds of test cases, but since it only took a few minutes to generate all of the test cases and several hours to run it, to my surprise it found a pattern I did not cover. The point here is that contrary what most people say about having to many test cases, remember that it is only time on a computer by changing a few constraints so do the large number of test.







share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 19 '18 at 19:49

























answered Nov 19 '18 at 14:29









Guy Coder

14.9k43982




14.9k43982












  • Of interest: How to identify wasteful representations of Prolog terms
    – Guy Coder
    Nov 19 '18 at 22:34










  • Great answer. I need some time to digest.
    – cgsdfc
    Nov 20 '18 at 1:51


















  • Of interest: How to identify wasteful representations of Prolog terms
    – Guy Coder
    Nov 19 '18 at 22:34










  • Great answer. I need some time to digest.
    – cgsdfc
    Nov 20 '18 at 1:51
















Of interest: How to identify wasteful representations of Prolog terms
– Guy Coder
Nov 19 '18 at 22:34




Of interest: How to identify wasteful representations of Prolog terms
– Guy Coder
Nov 19 '18 at 22:34












Great answer. I need some time to digest.
– cgsdfc
Nov 20 '18 at 1:51




Great answer. I need some time to digest.
– cgsdfc
Nov 20 '18 at 1:51


















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.





Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


Please pay close attention to the following guidance:


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53376098%2fhow-to-synthesise-compiler-testing-data%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Can a sorcerer learn a 5th-level spell early by creating spell slots using the Font of Magic feature?

Does disintegrating a polymorphed enemy still kill it after the 2018 errata?

A Topological Invariant for $pi_3(U(n))$