Agile on Ice: The Visitor Pattern and Extensibility

I've been dealing with Dil lately. It's a compiler project for the D programming language, written in D. It makes use of the visitor pattern to provide semantic analysis. (The visitor pattern is a way of achieving dynamic dispatch via strongly typed overloads. It was a clever hack when it was first created.)

One issue with the visitor pattern is that it requires a lot of boilerplate code. If you want to have multiple semantic passes, you need to use an interface, and that interface requires a method for each type. If your visitor only cares about ten types and the interface supports fifty, you have a problem.

This doesn't much matter for dil -- very few types will not matter for any given visitor.

But let's look at a different problem. Let's say we want to log all visitor actions. Where do we do this? There are two choices: every single visitor method, or every single visited class. Similarly if we want to filter visited items, or set a breakpoint, or anything interesting like that.

I'm working on semantic analysis in Dil right now. For cleanness, I would like to split semantic analysis into possibly many phases. However, I would also like to combine semantic passes when possible for efficiency. To do this, I need to create a visitor that will coordinate between several visitors. This is an unreasonably large task with the current Visitor pattern implementation.

What else could I use, though?

In the past, I've used this concept:


void delegate(Node)[ClassInfo] handlers;
void visit(Node node)
{
   if (auto ptr = node.classinfo in handlers)
   {
      auto dg = *ptr;
      dg(node);
   }
}

This design is sufficiently minimal that it's easy to do a fair bit with it:

Log each visited node

Use multiple handlers per node type

Use the same handler for multiple node types

Filter or preprocess nodes based on some criterion available in the base class

Ignore various node types by not writing any code for them

There is one problem with it, though: you have to do a lot of casting. In C#, you can use reflection to invoke the methods without caring about their types. In D, you can write a little wrapper:


class Invoker(T : Node)
{
   void delegate(T) dg;
   void invoke(Node node) 
   {
      debug
      {
         // safety: cast and check
         auto theNode = cast(T)node;
         assert (theNode !is null, "expected: " ~ T.stringof ~
                 "but was " ~ node.classinfo.name);
      }
      else
      {
         // efficiency: force cast and assume
         auto theNode = *cast(T*)&node;
      }
      return dg(theNode);
   }
}
handlers[type] = &(new Invoker(dg)).invoke;

A slightly simpler version is possible with D2's closures.

There is one problem with our solution: it uses associative arrays, which could be slow.

If you know what types you support in advance, you can map each to an index to get a very fast lookup. This requires a fast means of getting this index, however. I'm generally inclined to just use the associative array; it's going to be small in any case, so it will not incur a significant penalty in most cases.

Agile on Ice

Thursday, May 14, 2009

The Visitor Pattern and Extensibility

No comments:

Blog Archive