Crowbar central

A blog about the ooc language, with source code examples, updates on the state of the llama (our beloved mascot), and random thoughts by nddrylliog, the benevolent dictator for life of ooc.

The opinions expressed in this blog are not necessarily those of the whole ooc community.

Sep 28

Error reporting improvements - parser generator tricks

Several people had noticed that rock's error reporting had been particularly sucky these last few weeks. I’m happy to announce that it has been fixed and even improved, as you can see:

Which, surely you agree, is much much nicer to work with.

(Disclaimer: the code below is greg syntax, not ooc code)

Turns out the breakage was my fault. I had changed rules like

FunctionCall = Name OPEN_PAREN Args CLOS_PAREN
OPEN_PAREN = '('
CLOS_PAREN = ')'

To

FunctionCall = Name '(' Args ')'

It didn’t break parsing (i.e. it produced exactly the same AST), but it broke error reporting. Why?

The original peg/leg had no error reporting facility. Neither did greg, _why’s re-entrant fork of it. So in my fork, I added one.

peg/leg/greg let you put a block of C code inside braces {} after a rule; it runs when the rule is fully matched, like this:

FunctionCall = Name OPEN_PAREN Args CLOS_PAREN { onCall() }

So in the same spirit, I added ~{}, which is called when a rule is *not* matched.

FunctionCall = Name OPEN_PAREN Args CLOS_PAREN ~{ onError() } { onCall() }

In that example, onError() will be called if the closing parenthesis isn’t matched.

But wait, isn’t that illogical? Why choose the closing parenthesis to test for errors? After all, we could do something like:

FunctionCall = Name OPEN_PAREN (Expr (COMMA Expr ~{ onError() })*) CLOS_PAREN

I.e. if Expr is not matched (malformed expression), it would throw an error. But that won’t work. Why? Because it’s alright for Expr not to be matched there: ‘*’ means ‘0 or more’. So if Expr fails, the parser generated by greg just assumes there are no more Expr to match, considers the enclosing ()* group fully matched, and moves on to the next rule, i.e. the ‘)’.
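To make the ‘0 or more’ behaviour concrete, here is a minimal sketch in Python (not greg, and not the code greg generates). The helpers literal, digits, seq and star are made up for this illustration, and digits stands in for Expr.

```python
# Toy PEG-style matchers: each takes (text, pos) and returns the new
# position on success, or None on failure. All names are hypothetical.

def literal(s):
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def digits(text, pos):
    # Stand-in for Expr: one or more decimal digits.
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return end if end > pos else None

def seq(*parts):
    # Every part must match in order, otherwise the whole sequence fails.
    def match(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match

def star(inner):
    # Zero or more: when inner fails, stop and succeed at the last
    # good position. This matcher never returns None.
    def match(text, pos):
        while True:
            nxt = inner(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return match

tail = star(seq(literal(","), digits))   # models (COMMA Expr)*

print(tail(",2,3)", 0))   # 4: both pairs matched
print(tail(",2,+)", 0))   # 2: the malformed second pair is silently dropped
```

Since star can never fail, an error hook placed inside the repetition has nothing reliable to hang on: a failing Expr there looks exactly like the normal end of the argument list.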

Now we see why it makes sense to test for closing delimiters such as ‘)’, ‘}’, ‘]’. If there is indeed a malformed expression, then Expr will no longer match and it’ll stop matching ()*, and when trying to match the closing parenthesis it will fail! And that’s when we want to throw an error.
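The same reasoning, written out as a hand-rolled parser in Python. This is only a sketch under simplifying assumptions: parse_call, scan_digits and ParseError are invented names, Name is a run of letters, and Expr is a run of digits.

```python
# Hand-written sketch of: FunctionCall = Name '(' Args ')' ~{ onError() }
# Everything here is hypothetical and only mirrors the reasoning above.

class ParseError(Exception):
    pass

def scan_digits(text, pos):
    # Stand-in for Expr: returns the position after a run of digits.
    while pos < len(text) and text[pos].isdigit():
        pos += 1
    return pos

def parse_call(text):
    pos = 0
    while pos < len(text) and text[pos].isalpha():
        pos += 1
    if pos == 0 or pos >= len(text) or text[pos] != "(":
        return None                      # not a call at all, no error here
    name = text[:pos]
    pos += 1
    args = []
    end = scan_digits(text, pos)
    if end > pos:
        args.append(text[pos:end])
        pos = end
        while pos < len(text) and text[pos] == ",":
            end = scan_digits(text, pos + 1)
            if end == pos + 1:
                break                    # malformed Expr: back up, stop repeating
            args.append(text[pos + 1:end])
            pos = end
    # The only place where a failure is unambiguous: the closing delimiter.
    if pos >= len(text) or text[pos] != ")":
        raise ParseError(f"expected ')' at offset {pos}")
    return name, args

print(parse_call("f(1,2)"))   # ('f', ['1', '2'])
try:
    parse_call("f(1,+)")
except ParseError as err:
    print(err)                # expected ')' at offset 3
```

Note that the reported offset points at the comma the repetition backed up to, not at the ‘+’ itself. That is the price of detecting the problem at the closing delimiter.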

But as I said, the only thing I did was to change from

FunctionCall = Name OPEN_PAREN Args CLOS_PAREN ~{ /* … */ } { /* … */ }

To

FunctionCall = Name '(' Args ')' ~{ /* … */ } { /* … */ }

And the explanation is very simple: ~{} only works after named rules. ‘(’ is a string rule, and thus ~{} is simply ignored. This is a small limitation of my greg error reporting solution that I had totally forgotten about. It might be considered a bug. If anyone else uses greg and wishes that particular issue to be fixed, you know where GitHub is.
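To picture the limitation, here is a toy model in Python. It is not how greg is implemented; the Literal and Named classes, match_seq and the on_fail hook are all made up, and the only thing the model reproduces is the observable behaviour: the hook fires for a failing named rule and stays silent for a failing string.

```python
# Toy model only: error hooks fire for named rules, never for bare strings.

class Literal:
    def __init__(self, s):
        self.s = s

    def match(self, text, pos):
        return pos + len(self.s) if text.startswith(self.s, pos) else None

class Named:
    def __init__(self, name, inner):
        self.name = name
        self.inner = inner

    def match(self, text, pos):
        return self.inner.match(text, pos)

def match_seq(parts, text, on_fail):
    pos = 0
    for part in parts:
        nxt = part.match(text, pos)
        if nxt is None:
            # Assumption of this model: the hook is tied to named rules,
            # so a failing bare literal is skipped silently.
            if isinstance(part, Named):
                on_fail(part.name, pos)
            return None
        pos = nxt
    return pos

errors = []

def report(name, pos):
    errors.append((name, pos))

with_named = [Literal("f"),
              Named("OPEN_PAREN", Literal("(")),
              Named("CLOS_PAREN", Literal(")"))]
with_literals = [Literal("f"), Literal("("), Literal(")")]

match_seq(with_named, "f(", report)      # records ("CLOS_PAREN", 2)
match_seq(with_literals, "f(", report)   # fails too, but records nothing
print(errors)                            # [('CLOS_PAREN', 2)]
```

Both grammars reject the input, which is why the AST side of things looked fine; only the named version leaves a trace for error reporting.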

I hope you enjoyed this little journey into the world of parser generators. It’s an exciting domain and I’ve only scratched the surface. If you’re interested in developing your own toy language in ooc with greg, decadence is a nice tutorial I was corrupted into writing.