XD blog


antlr4, programmation, python


2015-07-13 Using Antlr4 to write a grammar

Here are a few tricks I discovered when I implemented my first grammar using Antlr 4. First, it really helps to be able to test parts of your grammar. One option is the IntelliJ Idea Plugin for ANTLR v4, which runs inside IntelliJ IDEA (the plugin works with the community edition, see also Installing, Updating and Uninstalling Repository Plugins). You will find many grammar examples at antlr/grammars-v4. The tool tells you whether the grammar compiles and shows where it cannot parse an example, since it displays a graph of the recognized pieces.

source: https://github.com/antlr/intellij-plugin-v4

There are three kinds of objects: rules, tokens, and fragments.

About the syntax, [Name] means the name must begin with an uppercase letter, and [name] means it must begin with a lowercase letter. All objects are defined with a syntax very similar to regular expressions. The grammar DOT (graphviz language) is a simple example to begin with. It tells us more about how Antlr tries to match the rules with the text (not sure I'm right about this): it tries rules in the order they are defined in the grammar, and it stops searching whenever it finds a token or a fragment (ambiguity is not possible for fragments and tokens).

I made many mistakes when building my first grammar. One of them looked like this:

line 1:0 mismatched input 'aa' expecting {'something', 'aa'}

As you noticed, 'aa' was expected but not matched. This was usually due to some ambiguity between rules. To detect the conflict with the plugin mentioned above, I first checked the rule the string was supposed to match on its own (it failed), then I removed all the rules defined above it, after which it usually worked. By adding them back one by one, it became easier to understand where the conflict was.
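To see how rule order can produce this kind of message, here is a toy model in Python. It is only a sketch and not Antlr's implementation. It assumes the usual lexer behaviour: at each position the longest match wins, and when two rules match the same length, the one defined first wins. The rule names ID, AA and WS are hypothetical and do not come from a real grammar.

```python
import re

def tokenize(text, rules):
    """Toy lexer: longest match wins; on a tie, the rule listed first wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly greater: an earlier rule keeps priority on equal length.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Hypothetical rules: the same patterns, listed in two different orders.
id_first = [("ID", r"[a-z]+"), ("AA", r"aa"), ("WS", r" +")]
aa_first = [("AA", r"aa"), ("ID", r"[a-z]+"), ("WS", r" +")]

print(tokenize("aa", id_first))   # [('ID', 'aa')]
print(tokenize("aa", aa_first))   # [('AA', 'aa')]
print(tokenize("aab", aa_first))  # [('ID', 'aab')]
```

With ID listed first, the text 'aa' comes out as an ID token, so a rule waiting for the token AA never sees it, even though the text is exactly 'aa'. That would explain why removing the rules above and adding them back one by one reveals the conflict.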

In Python, I added functions to the module pyensae to build and use Antlr4 grammars. See antlr_grammar_build.py, antlr_grammar_use.py.
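Building a grammar for Python usually comes down to calling the Antlr tool with the Python 3 target. The sketch below is not pyensae's code. It only assembles the command line, assuming Java is installed and a complete Antlr jar is available locally. The function name antlr_command, the jar name antlr-4-complete.jar and the output folder generated are placeholders.

```python
from pathlib import Path

def antlr_command(grammar, jar="antlr-4-complete.jar", out_dir="generated"):
    """Build the command line asking the Antlr tool for a Python 3 lexer and parser."""
    grammar = Path(grammar)
    if grammar.suffix != ".g4":
        raise ValueError("an Antlr 4 grammar file is expected to end with .g4")
    # -Dlanguage selects the target language, -o the output directory.
    return ["java", "-jar", str(jar), "-Dlanguage=Python3",
            "-o", str(out_dir), str(grammar)]

cmd = antlr_command("DOT.g4")
print(" ".join(cmd))
# To actually run it (requires Java and the jar):
#   import subprocess; subprocess.run(cmd, check=True)
```

The generated lexer and parser modules can then be imported and used with the antlr4 Python runtime.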



Xavier Dupré