snapsvg

2014-01-23

Declaring your intent

Under the strict pragma, Perl requires you to declare a variable with my (or our) before using it. Since Perl 5.12, asking for that version or later (use v5.12;) enables strict implicitly.
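As a minimal sketch of the check in action, the snippet below compiles the same assignment twice, once without a declaration and once with one. The variable names are illustrative, and string eval is used only so that the compile error can be caught and inspected instead of aborting the script.

```perl
use strict;
use warnings;

# String eval compiles its argument at run time and inherits the
# strict setting from this file, so the compile error lands in $@.
my $ok = eval q{
    $undeclared = 'cats';    # never declared with my or our
    1;
};
my $strict_error = $@;

# The same assignment compiles once the variable has been declared.
my $declared_ok = eval q{
    my $declared = 'cats';
    1;
};

print "undeclared: ", ( $ok          ? "compiled" : "rejected" ), "\n";
print "declared:   ", ( $declared_ok ? "compiled" : "rejected" ), "\n";
```

The first line printed reports "rejected" and the second "compiled"; $strict_error holds the Global symbol message that strict produced.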

Why?

Today's theme explores the idea that, when writing code, there is meaning in every statement. A good portion of code will comprise statements that actually implement the logic that causes the program to do what it does; but often overlooked are the statements such as these my and our declarations, which explain your intention for the variable before it's ever even used.

We'll look at some of the simpler reasons behind it, and later on we shall look at the less apparent ones.

Requesting

In these cases the intention you are declaring is simple: "I want to use this symbol."

The humble typo is the most obvious reason espoused for requesting new variables: it stops you using something else. But in Perl this actually covers at least three separate types of typo, all of which are solved by declaring things before you use them.

Misspelling it later

Misspelling the variable later on is the most common failure.

use strict;
my $hard_to_spell_name;
$hard_tp_spell_name = 'cats';

Global symbol "$hard_tp_spell_name" requires explicit package name at script.pl line 3.
Execution of script.pl aborted due to compilation errors.

Saying you want to use symbol A and then using symbol B is an error that is trivial to pick up on.

Misspelling it now

This is less common, because you usually spell the variable name right when you create it: you've just spent ages trying to come up with the name in the first place. It's the same mistake in reverse. This time the declaration carries the typo and the later use is spelled correctly.

use strict;
my $hard_tp_spell_name;
$hard_to_spell_name = 'cats';

Global symbol "$hard_to_spell_name" requires explicit package name at script.pl line 3.
Execution of script.pl aborted due to compilation errors.

Forgetting

This one requires a module, but declaring your intent lets Perl warn you when you never used a variable you asked for.

Install warnings::unused from CPAN in the usual way.

use warnings::unused;
use strict;
use warnings;

my $foo;
my $bar = 'cats';

print "$bar\n";

Unused variable my $foo at script.pl line 5.

Typing

By this I mean the type of the variable, not the typing you're doing when you make a typo.

In this case, you've declared an array and then accidentally used a scalar, or forgotten it's not an arrayref, or something along those lines. This is also the sort of protection you get from languages with a more C-style typing system, where you have to declare a variable by defining its symbol name and its type (int i;). Basically even though you spelled the symbol name right, you're using it wrongly.

use strict;
my @array_of_cats;
push @$array_of_cats, 'cat';

Global symbol "$array_of_cats" requires explicit package name at script.pl line 3.
Execution of script.pl aborted due to compilation errors.

"You're using it wrongly" is a perfectly reasonable statement here. That's because you declared what "right" is: "wrongly" is directly determined by your own my statement.
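For contrast, here is a small sketch of the two "right" ways to use that array once it is declared: directly, or through a reference held in a separately declared scalar. The name $cats_ref and the cat names are made up for illustration.

```perl
use strict;
use warnings;

my @array_of_cats;                 # the array itself
my $cats_ref = \@array_of_cats;    # a scalar holding a reference to it

push @array_of_cats, 'tabby';      # push onto the array directly
push @$cats_ref, 'siamese';        # push through the reference

# Both pushes landed in the same array.
print scalar(@array_of_cats), " cats: @array_of_cats\n";
```

This prints "2 cats: tabby siamese". Both symbols were requested with my, so strict has nothing to complain about.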

Overwriting

Reuse

If you are required to declare your variables the first time you use them then you will always do so. This means that the keyword my is not only used to declare that a variable is supposed to be available, but also to declare that the variable is supposed to be new.

Hence, if you try to introduce a variable that already exists in the same scope, Perl tells you off, and thus you avoid clobbering an existing variable.

This behaviour is actually only a warning, so it comes from use warnings; rather than use strict;. However, it is still a result of declaring your intent.

use strict;
use warnings;
my $cats = 'cat';
my $cats = 'horse';
"my" variable $cats masks earlier declaration in same scope at script.pl line 4.
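Because this is a warning rather than an error, the program still runs. The following sketch makes that visible by collecting the warning in an array instead of letting it go to the terminal. The @caught array and the use of string eval (so the redeclaration is compiled after the handler is installed) are choices made for this illustration only.

```perl
use strict;
use warnings;

my @caught;
{
    # Collect warnings instead of printing them, for this block only.
    local $SIG{__WARN__} = sub { push @caught, $_[0] };

    # String eval compiles the snippet at run time, once the handler
    # above is in place. The redeclaration still compiles and runs.
    eval q{
        my $cats = 'cat';
        my $cats = 'horse';
        1;
    } or die $@;
}

print "warnings caught: ", scalar(@caught), "\n";
print $caught[0];
```

The caught message is the same "masks earlier declaration" text shown above, with the eval's location in place of the script's line number.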

Clobbering

It is easy to forget that my produces lexical variables: variables that are only visible within the block in which they are declared (treating a file as a block for this definition). our is scoped in the same way, but the name it introduces refers to a package variable; more on that below.

With my you simply cannot clobber this variable from anywhere else. Any attempt is either a compile error or touches a different variable.

# This sub is useless and does nothing
sub one {
  my @cats;
  push @cats, @_;
  return @cats;
}

# This sub can't see @cats from the other sub!
sub two {
  push @cats, @_; # line 10
  return @cats;
}
Global symbol "@cats" requires explicit package name at script.pl line 10.
Execution of script.pl aborted due to compilation errors.

Or:

# This compiles, but is a new, separate array of cats.
# It is fractionally more useful than sub one.
sub two {
  my @cats = ('default_cat');
  push @cats, @_;
  return @cats;
}

A bonus of my is that when the block has executed, the variable is tidied up. That is, it falls out of scope. This also works in loop bodies, allowing you to trash and recreate data in every iteration by putting a my line inside the loop.
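A short sketch of the loop case, assuming some made-up data in @litters: because the my line sits inside the loop body, each pass starts with a fresh, empty @cats.

```perl
use strict;
use warnings;

my @litters = ( [ 'tom', 'felix' ], ['garfield'] );    # illustrative data
my @sizes;

for my $litter (@litters) {
    my @cats;                     # brand new, empty array on every pass
    push @cats, @$litter;
    push @sizes, scalar @cats;    # record how many this pass collected
}                                 # @cats falls out of scope here

print "@sizes\n";
```

This prints "2 1": the second pass does not see the two cats collected by the first.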

package Cat {

  my @cats;

  # Both of these use the same @cats - the one above!
  sub one {
    push @cats, @_;
    return @cats;
  }

  sub two {
    @cats = ('default_cat'); # whups, overwrote the whole set!
    push @cats, @_;
    return @cats;
  }
}

@Cat::cats = ('cat_one', 'cat_two'); 

Here, @cats is available to be clobbered anywhere in the Cat package [1]. However, because it is lexical, it is only available within that block [2]. Line 18 appears to be altering the same variable (@cats within the package Cat), but in fact this is creating a new package variable in Cat [3].

The intent of using my to declare @cats, therefore, is to have a variable that is available throughout the package but not outside it.

There is a subtler declaration of intent. The position of this my statement declares that this variable is intended to be used throughout the entire package; therefore it should be applicable to the majority of the behaviour in the package. Were this not the intention, the my statement could be put in a block that encapsulates the variable and any places it is supposed to be used.
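One way that narrower intent might look is sketched below. The sub names add_cats, list_cats and count_cats are hypothetical; the point is only that the inner block limits which subs can see @cats.

```perl
use strict;
use warnings;

package Cat {

    # Only the subs inside this inner block can see @cats.
    {
        my @cats;

        sub add_cats  { push @cats, @_; return scalar @cats }
        sub list_cats { return @cats }
    }

    # @cats is not in scope here, so this sub has to go through
    # list_cats rather than touching the array itself.
    sub count_cats {
        my @all = list_cats();
        return scalar @all;
    }
}

Cat::add_cats( 'tom', 'felix' );
print join( ', ', Cat::list_cats() ), "\n";
print Cat::count_cats(), "\n";
```

This prints "tom, felix" and then "2". Mentioning @cats directly inside count_cats would be a compile error under strict, which is exactly the intent the inner block declares.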

our is a similar beast, but it adds the ability for outsiders to also alter the variable, so long as they do so explicitly. The following code differs only in the use of our:

package Cat {

  our @cats;

  sub one {
    push @cats, @_;
    return @cats;
  }

  sub two {
    @cats = ('default_cat');
    push @cats, @_;
    return @cats;
  }
}

@Cat::cats = ('cat_one', 'cat_two'); 

Now, the variable @cats inside the package's block can also be accessed as @Cat::cats from outside of it. This is the intent you declare when using our.
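To see the two declarations side by side, here is a sketch with one variable of each kind in the same package. The names @hidden, hidden and cats are invented for this illustration.

```perl
use strict;
use warnings;

package Cat {
    my  @hidden = ('lexical_cat');    # lexical: invisible outside this block
    our @cats   = ('package_cat');    # a name for the package variable @Cat::cats

    sub hidden { return @hidden }
    sub cats   { return @cats }
}

# Fully qualified names always refer to package variables.
push @Cat::cats, 'outsider_cat';      # alters the array the subs see
@Cat::hidden = ('impostor');          # a separate, brand-new package variable

print join( ', ', Cat::cats() ),   "\n";
print join( ', ', Cat::hidden() ), "\n";
print "package \@Cat::hidden: @Cat::hidden\n";
```

The first line printed is "package_cat, outsider_cat", because the outsider reached the our variable. The second is still "lexical_cat": the assignment to @Cat::hidden created an unrelated package variable, which the third line prints as "impostor".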

[1] Normally, the package would be defined in its own file, but this format is common for single-use packages, especially in tests.

[2] When the package is defined in its own file, the file itself is the scope for such variables.

[3] The reader should be aware that this is the reasoning behind the message Global symbol "$foo" requires explicit package name when strict tells you off for an undeclared variable. Any variable name can be used, so long as it is explicitly qualified with a package name, as in this example. The difference between a lexical variable and a package variable is beyond the scope of this blog post.
