Error Reporting and Recovery
This is a rough document describing the new error recovery features
in Version 0.7.1. This document also describes how features have
changed since Version 0.6.
The first change (from 0.6) is that we have two new exceptions:
- ParseException
- TokenMgrError
Whenever the token manager detects a problem, it throws the exception
TokenMgrError. Previously, it used to print the message:
    Lexical Error ...
following which it used to throw the exception ParseError.
Whenever the parser detects a problem, it throws the exception ParseException.
Previously, it used to print the message:
    Encountered ... Was expecting one of ...
following which it used to throw the exception ParseError.
In Version 0.7.1, error messages are never printed explicitly; instead,
this information is stored inside the exception objects that are
thrown. Please see the classes ParseException.java and TokenMgrError.java
(that get generated by JavaCC during parser generation) for more
details.
If the thrown exceptions are never caught, then a standard action
is taken by the virtual machine which normally includes printing
the stack trace and also the result of the "toString" method
in the exception. So if you do not catch the JavaCC exceptions,
a message quite similar to the ones in Version 0.6 is printed.
But if you catch the exception, you must print the message yourself.
Exceptions in Java are all subclasses of type Throwable. Furthermore,
exceptions are divided into two broad categories - ERRORS and other
exceptions.
Errors are exceptions that one is not expected to recover from -
examples of these are ThreadDeath or OutOfMemoryError. Errors are
indicated by subclassing the exception "Error". Exceptions subclassed
from Error need not be specified in the "throws" clause of method
declarations.
Exceptions other than errors are typically defined by subclassing
the exception "Exception". These exceptions are typically handled
by the user program and must be declared in throws clauses of method
declarations (if it is possible for the method to throw that exception).
The exception TokenMgrError is a subclass of Error, while the exception
ParseException is a subclass of Exception. The reasoning here is
that the token manager is never expected to throw an exception -
you must be careful in defining your token specifications such that
you cover all cases. Hence the suffix "Error" in TokenMgrError.
You do not have to worry about this exception: if you have designed
your tokens well, it should never get thrown. By contrast, it is typical
to attempt recovery from parser errors, hence the name "ParseException".
(If you still want to recover from token manager errors, you can do so;
it's just that you are not forced to catch them.)
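The following is a minimal sketch of what this means for calling code. It does not use a generated parser: ParseException and TokenMgrError here are bare stand-ins for the generated classes, and StubParser, scan, Input, and CatchDemo are hypothetical names invented for illustration. It only shows that the checked exception must be declared and handled, that the Error need not be (but may be), and that the caller is responsible for reporting the message.

```java
// Stand-ins for the classes JavaCC generates; the real ones carry more detail.
class ParseException extends Exception {
    ParseException(String message) { super(message); }
}

class TokenMgrError extends Error {
    TokenMgrError(String message) { super(message); }
}

// Hypothetical stub, not a generated parser: it only imitates how errors surface.
class StubParser {
    private final String text;

    StubParser(String text) { this.text = text; }

    // No throws clause: TokenMgrError is an Error, so it is unchecked.
    private void scan() {
        if (text.indexOf('#') >= 0) {
            throw new TokenMgrError("unexpected character '#'");
        }
    }

    // ParseException is checked, so it must appear in the throws clause.
    void Input() throws ParseException {
        scan();
        if (!text.endsWith(";")) {
            throw new ParseException("Encountered <EOF>. Was expecting: \";\"");
        }
    }
}

class CatchDemo {
    // Returns what a caller would report; the stub parser itself prints nothing.
    static String run(String text) {
        try {
            new StubParser(text).Input();
            return "ok";
        } catch (ParseException e) {   // the compiler forces us to handle this one
            return "parser: " + e.getMessage();
        } catch (TokenMgrError e) {    // optional: catching an Error is allowed
            return "token manager: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run("x = 1;"));
        System.out.println(run("x = 1"));
        System.out.println(run("x = #;"));
    }
}
```

If the two catch clauses were removed, the compiler would reject the call to Input() because of the undeclared ParseException, but it would not complain about TokenMgrError.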
In Version 0.7.1, we have added a syntax to specify additional exceptions
that may be thrown by methods corresponding to non-terminals. This
syntax is identical to the Java "throws ..." syntax. Here's an example
of how you use this:
    void VariableDeclaration() throws SymbolTableException, IOException :
    {...}
    {
      ...
    }
Here, VariableDeclaration is defined to throw exceptions SymbolTableException
and IOException in addition to ParseException.
Error Reporting:
The scheme for error reporting is simpler in Version 0.7.1 (as compared
to Version 0.6) - simply modify the file ParseException.java to
do what you want it to do. Typically, you would modify the getMessage
method to do your own customized error reporting. All information
regarding these methods can be obtained from the comments in the
generated files ParseException.java and TokenMgrError.java. It will
also help to understand the functionality of the class Throwable
(read a Java book for this).
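As a rough illustration of customizing getMessage, here is a sketch that uses a simplified stand-in class. The fields line, column, found, and expected are assumptions made for this example only; the generated ParseException.java stores its information differently, so consult the comments in that file before adapting this.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in; NOT the class that JavaCC generates.
class ParseException extends Exception {
    final int line;               // hypothetical: line of the offending token
    final int column;             // hypothetical: column of the offending token
    final String found;           // hypothetical: text of the offending token
    final List<String> expected;  // hypothetical: tokens that would have been accepted

    ParseException(int line, int column, String found, List<String> expected) {
        this.line = line;
        this.column = column;
        this.found = found;
        this.expected = expected;
    }

    // Customized report: one compact line instead of the default wording.
    @Override
    public String getMessage() {
        return "line " + line + ", column " + column + ": found " + found
                + ", expected one of " + String.join(", ", expected);
    }
}

class MessageDemo {
    public static void main(String[] args) {
        ParseException e =
                new ParseException(3, 14, "foo", Arrays.asList("if", "while"));
        // toString() on a Throwable includes the result of getMessage().
        System.out.println(e.getMessage());
        System.out.println(e);
    }
}
```

Because Throwable's toString uses the message, overriding getMessage also changes what is shown when the exception goes uncaught.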
There is a method in the generated parser called "generateParseException".
You can call this method anytime you wish to generate an object
of type ParseException. This object will contain all the choices
that the parser has attempted since the last successfully consumed
token.
Error Recovery
JavaCC offers two kinds of error recovery - shallow recovery and
deep recovery. Shallow recovery applies when none of the current
choices can be selected, while deep recovery applies when a choice
is selected but an error then occurs somewhere during the parsing
of that choice.
Shallow Error Recovery:
We shall explain shallow error recovery using the following example:
    void Stm() :
    {}
    {
      IfStm()
    |
      WhileStm()
    }
Let's assume that IfStm starts with the reserved word "if" and
WhileStm starts with the reserved word "while". Suppose you want
to recover by skipping all the way to the next semicolon when neither
IfStm nor WhileStm can be matched by the next input token (assuming
a lookahead of 1). That is, the next token is neither "if" nor "while".
What you do is write the following:
    void Stm() :
    {}
    {
      IfStm()
    |
      WhileStm()
    |
      error_skipto(SEMICOLON)
    }
But you have to define "error_skipto" first. So far as JavaCC is
concerned, "error_skipto" is just like any other non-terminal. The
following is one way to define "error_skipto" (here we use the standard
JAVACODE production):
    JAVACODE
    void error_skipto(int kind) {
      ParseException e = generateParseException();  // generate the exception object
      System.out.println(e.toString());  // print the error message
      Token t;
      do {
        t = getNextToken();
      } while (t.kind != kind);
      // The above loop consumes tokens all the way up to a token of
      // "kind". We use a do-while loop rather than a while because the
      // current token is the one immediately before the erroneous token
      // (in our case, the token immediately before what should have been
      // "if"/"while").
    }
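To see what the skipping loop does to the token stream, here is a small self-contained simulation in plain Java. It does not use JavaCC: SkipDemo, its integer token kinds, and its list-backed getNextToken are all hypothetical stand-ins. The sketch also adds one thing the routine above does not have, a check for end of input, on the assumption that a stream with no matching token should not be scanned forever.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream; token kinds are plain ints here.
class SkipDemo {
    static final int EOF = 0, IF = 1, WHILE = 2, ID = 3, SEMICOLON = 4;

    private final Iterator<Integer> tokens;

    SkipDemo(List<Integer> kinds) { this.tokens = kinds.iterator(); }

    // Returns the next token kind, or EOF once the input is exhausted.
    int getNextToken() {
        return tokens.hasNext() ? tokens.next() : EOF;
    }

    // Consumes tokens up to and including the first token of the given kind.
    // Returns how many tokens were consumed (EOF counts as one if it is hit).
    int errorSkipTo(int kind) {
        int consumed = 0;
        int t;
        do {
            t = getNextToken();
            consumed++;
        } while (t != kind && t != EOF);  // the EOF test is this sketch's addition
        return consumed;
    }

    public static void main(String[] args) {
        // Input resembling "foo bar ; while ...": neither "if" nor "while" comes first.
        SkipDemo demo = new SkipDemo(Arrays.asList(ID, ID, SEMICOLON, WHILE));
        System.out.println("skipped " + demo.errorSkipTo(SEMICOLON) + " tokens");
        System.out.println("parsing resumes at kind " + demo.getNextToken());
    }
}
```

The do-while form matters for the same reason given in the comments above: the first call to getNextToken fetches the erroneous token itself, so the loop body must run at least once before the kind is tested.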
That's it for shallow error recovery. In a future version of JavaCC
we will have support for modular composition of grammars. When this
happens, one can place all these error recovery routines into a
separate module that can be "imported" into the main grammar module.
We intend to supply a library of useful routines (for error recovery
and otherwise) when we implement this capability.
Deep Error Recovery
Let's use the same example that we did for shallow recovery:
    void Stm() :
    {}
    {
      IfStm()
    |
      WhileStm()
    }
In this case we wish to recover in the same way. However, we wish
to recover even when there is an error deeper into the parse. For
example, suppose the next token was "while" - therefore the choice
"WhileStm" was taken. But suppose that during the parse of WhileStm
some error is encountered - say one has "while (foo { stm; }" - i.e.,
the closing parenthesis is missing. Shallow recovery will not
work for this situation. You need deep recovery to achieve this.
For this, we offer a new syntactic entity in JavaCC - the try-catch-finally
block.
First, let us rewrite the above example for deep error recovery
and then explain the try-catch-finally block in more detail:
    void Stm() :
    {}
    {
      try {
        (
          IfStm()
        |
          WhileStm()
        )
      }
      catch (ParseException e) {
        error_skipto(SEMICOLON);
      }
    }
That's all you need to do. If there is any unrecovered error during
the parse of IfStm or WhileStm, then the catch block takes over.
You can have any number of catch blocks and also optionally a finally
block (just as in Java). What goes into the catch blocks is *Java
code*, not JavaCC expansions. For example, the above example could
have been rewritten as:
    void Stm() :
    {}
    {
      try {
        (
          IfStm()
        |
          WhileStm()
        )
      }
      catch (ParseException e) {
        System.out.println(e.toString());
        Token t;
        do {
          t = getNextToken();
        } while (t.kind != SEMICOLON);
      }
    }
We believe it's best to avoid placing too much Java code in the
catch and finally blocks, since it overwhelms the grammar reader.
It's best to define methods that you can then call from the
catch blocks.
Note that in the second writing of the example, we essentially copied
the code out of the implementation of error_skipto. But we left
out the first statement - the call to generateParseException. That's
because in this case, the catch block already provides us with the
exception. Even if you did call this method, you would get back
an identical object.
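The effect of deep recovery can be imitated in a hand-written recursive-descent parser, which may help in picturing what the try-catch block buys you. The sketch below is plain Java, not JavaCC output: DeepRecoveryDemo, its string tokens, and the tiny grammar WhileStm ::= "while" "(" "id" ")" ";" are assumptions made for illustration, and ParseException is a bare stand-in for the generated class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Bare stand-in for the generated exception class.
class ParseException extends Exception {
    ParseException(String message) { super(message); }
}

// Hypothetical hand-written parser; tokens are plain strings.
class DeepRecoveryDemo {
    private final List<String> tokens;
    private int pos = 0;
    final List<String> errors = new ArrayList<>();

    DeepRecoveryDemo(String... tokens) { this.tokens = Arrays.asList(tokens); }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }

    private String next() {
        String t = peek();
        if (pos < tokens.size()) pos++;
        return t;
    }

    // Consumes the expected token, or throws without consuming anything.
    private void expect(String want) throws ParseException {
        String got = peek();
        if (!got.equals(want)) {
            throw new ParseException(
                    "Encountered \"" + got + "\", was expecting \"" + want + "\"");
        }
        pos++;
    }

    // WhileStm ::= "while" "(" "id" ")" ";"
    private void whileStm() throws ParseException {
        expect("while"); expect("("); expect("id"); expect(")"); expect(";");
    }

    // Analogue of Stm() with a catch block: an error anywhere inside
    // whileStm() lands here, is recorded, and input is skipped to ";".
    void stm() {
        try {
            whileStm();
        } catch (ParseException e) {
            errors.add(e.getMessage());
            String t;
            do { t = next(); } while (!t.equals(";") && !t.equals("<EOF>"));
        }
    }

    // Parses statements until end of input; returns how many were attempted.
    int parseAll() {
        int statements = 0;
        while (!peek().equals("<EOF>")) { stm(); statements++; }
        return statements;
    }

    public static void main(String[] args) {
        // First statement is missing ")", second one is well formed.
        DeepRecoveryDemo p = new DeepRecoveryDemo(
                "while", "(", "id", ";", "while", "(", "id", ")", ";");
        System.out.println(p.parseAll() + " statements, errors: " + p.errors);
    }
}
```

The error in the first statement occurs after the "while" choice has already been taken, which is exactly the case that shallow recovery cannot handle; the catch block lets parsing continue with the second statement.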