LR Parser Generator 2014-10-11

A few weeks ago I decided that I wanted to write a parser generator. This is one of those projects I’ve wanted to do for a long time but never got started on. When I finally started I decided that it would be fun to learn Scala at the same time, so I wrote the parser generator in Scala.

One of the things that motivated me to get started writing a parser generator is some ideas I have about how parser generator diagnostic messages can be improved. Currently, LR parser generators provide only the most limited information about a shift-reduce or reduce-reduce conflict. For example, the Beaver parser generator can give you something like this:

Warning: Resolved Shift-Reduce conflict by selecting (PLUS: SHIFT; goto 3) over (PLUS: REDUCE E = E PLUS E) using precedence.

It can take a lot of time, ever for an expert, to figure out what the problem is and how to fix the grammar to remove the conflict. I think this can be significantly improved.

Currently my parser generator only generates an LR(0) parser, though my goal is to make it generate LR(1) parsers. The Scala code became surprisingly compact. For example, here is the main parse loop of the parser after the action and goto tables have been generated:

def parse(in: LookaheadBuffer[Symbol], goal: ItemSet) {
  val state = new scala.collection.mutable.Stack[ItemSet]
  val stack = new scala.collection.mutable.Stack[ParseTree]
  state push goal
  while (true) {
    while (reduceRules contains {
      val rule = reduceRules(
      val lhs = rule.lhs
      var children = List[ParseTree]()
      for (i <- 1 to rule.rhs.size) {
        children = children :+ stack.pop
      stack push new ParseTree(lhs, children)
      state push map( goto
    val sym = if (!in.eof) in.pop else EOFToken
    if (sym != WhitespaceToken) {// skip whitespace
      val action = actions(
      action match {
        case Accept =>
        case shift: Shift =>
          stack push new ParseTree(sym)
          state push

I wouldn’t be surprised if there was some way to shave ten more lines off that loop.

No Comments on LR Parser Generator
Categories: Programming Scala