Indirect but still fatal

new Foo bad, 'no indirect' good

Wed Jul 29 21:25:00 2009

Basic thesis of this article: indirect object syntax method calls are evil, and we should all use no indirect to avoid accidentally using them. If you're already convinced of that, go read something else, we're done here :)

If you're not, then first I guess I should explain what indirect object syntax is.

  new Foo @args;

That's indirect object syntax. Basically doing

  method_name ClassName @args;
  method_name $obj @args;

instead of

  ClassName->method_name(@args);
  $obj->method_name(@args);

Using arrow (->) syntax is good. You should do it. For everything.

Of course, if you're still reading, presumably you don't believe me. So first, I suppose I should provide some background on how perl5 parses things. Basically, when perl sees

  bareword <something>

It follows the following procedure to decide what it is:

(1) Is <something> a valid class name? If so, this is a method call.

  use Foo::Bar;
  new Foo::Bar @args; # calls Foo::Bar->new

(2) Is bareword a known subroutine name? If so, this is a sub call.

  sub wotsit { ... }
  wotsit { foo => 'bar', baz => 'quux' }; # calls wotsit({ ... })

(3) Stuff it, I'm guessing it's a method call.

  wotsit { foo => 'bar', baz => 'quux' }; # tries to call a method on a hashref

BOOM!

Better still, you can run into case (3) very, very easily. Consider the following:

  # One.pm
  package One;

  use strict;
  use Two;

  sub one { Two::two { (foo => 'bar', baz => 'quux') } }

  1;
  # Two.pm
  package Two;

  use strict;
  use One;

  sub two (&) {
    join(', ', $_[0]->())."\n";
  }
  1;

The circular dependency is what's going to kill us. So long as we load One.pm first, we're fine. But if we load Two.pm first, it starts loading One, the 'use Two' doesn't start loading Two because it's already in process, and so when 'sub one' is parsed there isn't a Two::two yet, and we get a completely unexpected case (3) -

  $ perl -MOne -e 'print One::one'
  baz, quux, foo, bar
  $ perl -MTwo -e 'print One::one'
  Can't use string ("foo") as a subroutine ref while "strict refs" in use at Two.pm line 7.

String "foo"? What the hell? ... well, what's happening here is that the block is, amazingly, still being parsed as a block. So it gets called and then the commas become the comma operator rather than a list constructor, and so the return value is actually 'foo', and then $_[0]->() tries to call it, and then strict screams because that isn't a subroutine reference.

If the above confused you, that's fine. It confused me horribly the first time I encountered it (especially when it was two plugins in a Module::Pluggable style system so I got different load orders depending on the machine the code was being run on ...)

However, with the aid only of indirect.pm we can make this far less confusing -

  # One.pm
  package One;

  use strict;
  use Two;
  no indirect;

  sub one { Two::two { (foo => 'bar', baz => 'quux') } }

  1;
  # Two.pm
  package Two;

  use strict;
  use One;
  no indirect;

  sub two (&) {
    join(', ', $_[0]->())."\n";
  }
  1;

And now we get:

  $ perl -MTwo -e 'print One::one'
  Indirect call of method "Two::two" on a block at One.pm line 8.
  Can't use string ("foo") as a subroutine ref while "strict refs" in use at Two.pm line 7.

And no indirect has kindly told us what's actually happened (and can be asked to throw an error in such cases if you want to bomb out in advance). Now we've found the mistake, we can either resolve the circular dependency, or if that's not possible predeclare the sub with a prototype -

  # Two.pm
  package Two;

  use strict;

  sub two (&);

  use One;
  no indirect;

  sub two (&) {
    join(', ', $_[0]->())."\n";
  }
  1;

At which point everything works fine again.

In summary: indirect object syntax evil, 'no indirect' good. And while you're there you might also like to poke at no autovivification; to help avoid accidental hash/array creation problems too ...

-- mst, out