11 January 2018

Abusing multiple-dispatch creatively

As part of creating a new Pod tree for the Perl 6 utilities, in hopes of letting others create their own subclasses, I came across this interesting use of multiple dispatch. One of the Pod types that the Perl 6 compiler generates covers inline formatting, such as making text bold or italic with formatting codes like 'B<text bold>'. Bold, italic, and underline formatting codes all generate the same Pod::FormattingCode object with a different .type value, so they look something like this:

class Pod::FormattingCode {
  has Str $.type; # This is 'B', 'I', etcetera as the need arises.
}
class Pod::FormattingCode::Bold { }

I have a bunch of .to-node() methods that are specialized on turning raw Pod objects into something a little more useful. One of these, to convert a Pod::FormattingCode into my internal Node::FormattingCode::Bold object, looks like this:

multi method to-node( Pod::FormattingCode $pod ) {
  given $pod.type {
    when 'B' {
      Node::FormattingCode::Bold.new( :type( $pod.type ) )
    }
  }
}

my $bold = Pod::FormattingCode.new( :type( 'B' ) );
self.to-node( $bold ); # Calls the multi method above.

All of the methods that convert $something to a node are named .to-node(), and I can rely on Perl 6's multiple dispatch to keep them separate for me. This matters to me mainly because it keeps related code together: if you're debugging my code and want to know where something gets converted to a Node:: object, just look through the .to-node() methods. The given-when block, though, is going to grow, and by quite a bit: at least three lines for every formatting code that I find in the documentation.

And it gets a bit worse. Say I want to do the right thing and factor the '.new(...)' lines out into their own methods, because I'm pretty sure they'll grow as I find neat little corner cases for each of the Pod types. I'd have to name them something ugly, which breaks the idea of keeping every conversion under the .to-node() name.

Since the new method still converts a Pod object to a Node object, it'd be nice to reuse the .to-node() name. But to fit it into the existing scheme of things I'd have to create a new class like Pod::FormattingCode::Bold, create an instance of it, and then do multiple dispatch on that type. That means creating not one but two new classes for every Pod::FormattingCode type: one for the "shim" that I dispatch on, and another for the actual object I return to the user. And it's even worse than that, because it's possible, though very unlikely, that the Perl 6 team will one day create a Pod::FormattingCode::Bold class and trample on my namespace, undoing my hard work.
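To make that cost concrete, here is a rough sketch of what the shim approach might look like. The names are made up for illustration: Demo::FormattingCode stands in for the compiler's Pod::FormattingCode so the snippet runs on its own, and Shim::Bold and Converter are hypothetical.

```perl6
# Stand-in for the compiler's Pod::FormattingCode (hypothetical name,
# used so this sketch does not clash with the built-in class).
class Demo::FormattingCode {
    has Str $.type;
}

# First extra class: exists only so there is a type to dispatch on.
class Shim::Bold { }

# Second class: the object actually handed back to the user.
class Node::FormattingCode::Bold {
    has Str $.type;
}

class Converter {
    # The given-when block is still here; it just builds shims now.
    multi method to-node( Demo::FormattingCode $pod ) {
        given $pod.type {
            when 'B' { self.to-node( Shim::Bold.new ) }
        }
    }

    # Dispatches on the shim's type.
    multi method to-node( Shim::Bold $shim ) {
        Node::FormattingCode::Bold.new( :type( 'B' ) );
    }
}

my $bold = Demo::FormattingCode.new( :type( 'B' ) );
say Converter.new.to-node( $bold ).^name;
```

Every additional formatting code would need its own Shim:: class, its own Node:: class, and another when branch.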

Well, as you might have guessed, there is a solution, and it doesn't involve trampling on anyone's namespace. It still lets us stick to the principle of reusing .to-node() for all of our node-conversion needs. Here's how you write it:

multi method to-node( Pod::FormattingCode $pod ) {
  self.to-node( $pod, $pod.type );
}
multi method to-node( Pod::FormattingCode $pod, 'B' ) {
  Node::FormattingCode::Bold.new( :type( 'B' ) );
}

my $bold = Pod::FormattingCode.new( :type( 'B' ) );
self.to-node( $bold ); # Returns a Node::FormattingCode::Bold object.

What this does may not be obvious at first glance, so I'll walk through it. We create a Pod::FormattingCode object of type 'B' for Bold, and magic happens. The first magical bit is that multiple-dispatch kicks in, so that when .to-node() gets called, Perl 6 does its best to find the closest signature. And in this case it's a perfect match. .to-node( Pod::FormattingCode ) is perfectly matched by the first method. Inside that method, we do something a little sneaky. We break out the type, and rely again on multiple dispatch. This time we're calling .to-node( Pod::FormattingCode $pod, 'B' ), and again we have a perfect match, but this time it's the second method, down below. That creates the new Node::FormattingCode::Bold object and returns it.
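If you want to try the walk-through yourself, a self-contained sketch might look like the following. It is not the real Pod::To::Tree code: Demo::FormattingCode is a stand-in for the compiler's Pod::FormattingCode, and Converter and the Italic node class are assumptions added to show a second formatting code.

```perl6
# Stand-in for the compiler's Pod::FormattingCode (hypothetical name).
class Demo::FormattingCode {
    has Str $.type;
}

class Node::FormattingCode::Bold   { has Str $.type }
class Node::FormattingCode::Italic { has Str $.type }

class Converter {
    # Step 1: one argument. Break out the type and dispatch again.
    multi method to-node( Demo::FormattingCode $pod ) {
        self.to-node( $pod, $pod.type );
    }

    # Step 2: two arguments, the second matched by its literal value.
    multi method to-node( Demo::FormattingCode $pod, 'B' ) {
        Node::FormattingCode::Bold.new( :type( 'B' ) );
    }
    multi method to-node( Demo::FormattingCode $pod, 'I' ) {
        Node::FormattingCode::Italic.new( :type( 'I' ) );
    }
}

my $converter = Converter.new;
my $bold      = Demo::FormattingCode.new( :type( 'B' ) );
my $italic    = Demo::FormattingCode.new( :type( 'I' ) );

say $converter.to-node( $bold ).^name;   # Node::FormattingCode::Bold
say $converter.to-node( $italic ).^name; # Node::FormattingCode::Italic
```

Each new formatting code costs one more multi method and no shim class. In this sketch a type with no matching candidate, say 'Z', makes the second call fail at dispatch time, so you would likely want a catch-all candidate taking a plain Str as well.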

How did that work, you might ask? Well, overloading in languages like C++ or Java works based on the types of the arguments. You can sort of simulate what we just did in C++ with templates, and Java's generics sort of do what we did, but not quite, and with much more work. TL;DR: Perl 6's multiple dispatch works on argument values as well as types, so you can dispatch on a generic Str class as well as on a specific Str value such as "foo". This means you can write Haskell-like code in Perl 6.
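A tiny sketch of the Str-versus-"foo" case, with describe as a made-up sub name:

```perl6
# Candidate chosen by value: only the exact string 'foo' matches.
multi sub describe( 'foo' )  { 'the literal value "foo"' }

# Candidates chosen by type: any other Str, or any Int.
multi sub describe( Str $s ) { "some other Str: $s" }
multi sub describe( Int $n ) { "an Int: $n" }

say describe( 'foo' ); # the literal value "foo"
say describe( 'bar' ); # some other Str: bar
say describe( 42 );    # an Int: 42
```

A literal in the signature behaves like a Str parameter with an extra constraint, so it is tried before the plain Str candidate. The Haskell-style definition of fib leans on the same thing: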

multi sub fib( $n where * < 0 ) { 0 }
multi sub fib( 0 ) { 0 }
multi sub fib( 1 ) { 1 }
multi sub fib( $n ) { fib( $n - 1 ) + fib( $n - 2 ) }

The where clause there lets us cleanly handle negative values as well, as a bonus. No matter what (integer) value the client passes to the fib subroutine, Perl 6 will dispatch it cleanly, so that fib(1) will call the fib( 1 ) candidate and return 1, rather than calling fib( $n ) and going into an infinite regress.

I was just working along on Pod::To::Tree, did that particular bit of refactoring, and thought you might like to hear about it. This is your friendly neighborhood Perl Fisher, signing off.
