In Perl it is necessary to declare a variable with my (or our) before using it. This behaviour is enabled with the strict pragma, and since Perl 5.12 a use v5.12 (or later) line switches it on implicitly.
Why?
Today's theme explores the idea that, when writing code, there is meaning in every statement. A good portion of code will comprise statements that actually implement the logic that causes the program to do what it does; but often overlooked are statements such as these my and our declarations, which explain your intention for the variable before it is ever used.
We'll look at some of the simpler reasons behind it, and later on we shall look at the less apparent ones.
Requesting
In these cases the intention you are declaring is simple: "I want to use this symbol."
The humble typo is the most obvious reason espoused for requesting new variables: it stops you using something else. But in Perl this actually covers at least three separate types of typo, all of which are solved by declaring things before you use them.
Misspelling it later
Misspelling the variable later on is the most common failure.
use strict;
my $hard_to_spell_name;
$hard_tp_spell_name = 'cats';

Global symbol "$hard_tp_spell_name" requires explicit package name at script.pl line 3.
Execution of script.pl aborted due to compilation errors.

Saying you want to use symbol A and then using symbol B is an error that is trivial to pick up on.
Misspelling it now
This is less common: you usually spell the variable name right when you create it, because you've just spent ages trying to come up with the name in the first place. It's the same mistake in reverse: you meant B and B, but declared A.

use strict;
my $hard_tp_spell_name;
$hard_to_spell_name = 'cats';

Global symbol "$hard_to_spell_name" requires explicit package name at script.pl line 3.
Execution of script.pl aborted due to compilation errors.
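Both typos can be watched being caught without aborting the whole script by compiling the offending lines with a string eval, which inherits use strict from the surrounding scope. This is only a sketch; the helper name compile_error is made up for illustration and is not part of Perl or any module.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet with string eval, returning
# the error message, or an empty string if the snippet was fine.
sub compile_error {
    my ($code) = @_;
    my $ok = eval "$code; 1";
    return $ok ? '' : $@;
}

# Declared one name, used another: strict refuses to compile it.
my $typo = compile_error(q{my $hard_to_spell_name; $hard_tp_spell_name = 'cats'});

# Declared and used the same name: nothing to complain about.
my $clean = compile_error(q{my $hard_to_spell_name; $hard_to_spell_name = 'cats'});

print "typo:  $typo";
print "clean: ", ($clean eq '' ? "no error\n" : $clean);
```

The message in the first case names the eval rather than script.pl, but it is the same "Global symbol" complaint.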
Forgetting
This requires a module, but declaring your intent allows warnings to tell you when you didn't use a variable you asked for. Install warnings::unused from CPAN in the usual way.

use warnings::unused;
use strict;
use warnings;
use feature 'say';
my $foo;
my $bar = 'cats';
say $bar;

Unused variable my $foo at script.pl line 5.
Typing
By this I mean the type of the variable, not the typing you're doing when you make a typo.
In this case, you've declared an array and then accidentally used a scalar, or forgotten it's not an arrayref, or something along those lines. This is also the sort of protection you get from languages with a more C-style typing system, where you have to declare a variable by defining its symbol name and its type (int i;). Basically, even though you spelled the symbol name right, you're using it wrongly.
use strict;
my @array_of_cats;
push @$array_of_cats, 'cat';

Global symbol "$array_of_cats" requires explicit package name at script.pl line 3.
Execution of script.pl aborted due to compilation errors.

"You're using it wrongly" is a perfectly reasonable statement here. That's because you declared what "right" is: "wrongly" is directly determined by your own my statement.
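As a rough illustration of the two "right" usages next to the mix-up, the sketch below declares a real array and an array reference, then compiles the mistaken line in a string eval so the error can be printed instead of stopping the script. The name $arrayref_of_cats is invented for this example.

```perl
use strict;
use warnings;

# A real array: declared with @, pushed to with @.
my @array_of_cats;
push @array_of_cats, 'cat';

# An array reference: declared as a scalar, dereferenced with @$ when pushing.
my $arrayref_of_cats = [];
push @$arrayref_of_cats, 'cat', 'kitten';

# Treating the array as if it were a reference asks for $array_of_cats,
# which is a symbol nobody declared.
my $ok    = eval q{ push @$array_of_cats, 'cat'; 1 };
my $error = $ok ? '' : $@;

print scalar(@array_of_cats), " cat in the array\n";
print scalar(@$arrayref_of_cats), " cats behind the reference\n";
print "mix-up: $error";
```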
Overwriting
Reuse
If you are required to declare your variables the first time you use them then you will always do so. This means that the keyword my is not only used to declare that a variable is supposed to be available, but also to declare that the variable is supposed to be new.
Hence, if you try to introduce a variable that already exists, it tells you off, and thus you avoid clobbering an existing variable.
This behaviour is actually only a warning, so it comes from use warnings rather than use strict. However, it is still a result of declaring your intent.
use strict;
use warnings;
my $cats = 'cat';
my $cats = 'horse';

"my" variable $cats masks earlier declaration in same scope at script.pl line 4.
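If you want to see that warning as data rather than on STDERR, one possible approach is to install a temporary __WARN__ handler and compile the double declaration inside a string eval. This is a sketch under that assumption; @caught is just a name picked for the example.

```perl
use strict;
use warnings;

my @caught;
{
    # Collect warnings instead of letting them go to STDERR.
    local $SIG{__WARN__} = sub { push @caught, @_ };

    # The second my is in the same scope as the first, so it masks it.
    eval q{
        my $cats = 'cat';
        my $cats = 'horse';
        1;
    };
}

print @caught;
```

The script carries on running after the eval, which is the difference between this warning and the strict errors earlier.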
Clobbering
It is easy to forget that my and our both produce lexically scoped names. These are names that are only visible within the block in which they are declared (treating a file as a block for this definition).
With my you simply cannot clobber this variable from anywhere else. Any attempt is either a compiler error or a different variable.
# This sub is useless and does nothing
sub one {
    my @cats;
    push @cats, @_;
    return @cats;
}

# This sub can't see @cats from the other sub!
sub two {
    push @cats, @_; # line 10
    return @cats;
}

Global symbol "@cats" requires explicit package name at script.pl line 10.
Execution of script.pl aborted due to compilation errors.

Or:
# This compiles, but is a new, separate array of cats.
# It is fractionally more useful than sub one.
sub two {
    my @cats = ('default_cat');
    push @cats, @_; # line 11
    return @cats;
}

A bonus of my is that when the block has finished executing, the variable is tidied up. That is, it falls out of scope. This also works in loop bodies, allowing you to trash and recreate data in every iteration by putting a my line inside the loop.
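A minimal sketch of the loop case, with made-up data (@litters and @sizes are names invented for the example):

```perl
use strict;
use warnings;

my @litters = (['tom', 'tabby'], ['ginger'], ['smokey', 'sooty', 'socks']);
my @sizes;

for my $litter (@litters) {
    my @cats;                    # a brand new, empty array on every pass
    push @cats, @$litter;
    push @sizes, scalar @cats;   # never includes cats from earlier litters
}

print "@sizes\n";   # prints "2 1 3"
```

Had @cats been declared above the loop instead, the sizes would have accumulated from one litter to the next.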
Now consider a my declared at the top of a package block:

package Cat {
    my @cats;

    # Both of these use the same @cats - the one above!
    sub one {
        push @cats, @_;
        return @cats;
    }

    sub two {
        @cats = ('default_cat'); # whups, overwrote the whole set!
        push @cats, @_;
        return @cats;
    }
}

@Cat::cats = ('cat_one', 'cat_two');

Here, @cats is available to be clobbered anywhere in the Cat package [1]. However, because it is lexical, it is only available within that block [2]. The last line of the example appears to be altering the same variable (@cats within the package Cat), but in fact it is creating a new package variable in Cat [3].
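To convince yourself that the two arrays really are separate, something like the following sketch can be run. It is a cut-down version of the package above with only sub one, and the variables @from_sub and $package_size exist only for this illustration.

```perl
use strict;
use warnings;

package Cat {
    my @cats;   # lexical: only code inside this block can see it

    sub one {
        push @cats, @_;
        return @cats;
    }
}

# This fills the package variable @Cat::cats, which is a different array.
@Cat::cats = ('cat_one', 'cat_two');

my @from_sub     = Cat::one('cat_three');
my $package_size = scalar @Cat::cats;

print "sub sees: @from_sub\n";                          # prints "sub sees: cat_three"
print "package variable holds $package_size cats\n";    # prints "package variable holds 2 cats"
```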
The intent of using my to declare @cats is therefore to have a variable available throughout the package, but not available outside the package.
There is a subtler declaration of intent. The position of this my statement declares that this variable is intended to be used throughout the entire package; therefore it should be applicable to the majority of the behaviour in the package. Were this not the intention, the my statement could be put in a block that encapsulates the variable and any places it is supposed to be used.
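That narrower arrangement might look like the sketch below, where an inner bare block holds the variable and the only two subs allowed to touch it. The sub names adopt, roster and headcount are hypothetical and chosen just for this example.

```perl
use strict;
use warnings;

package Cat {
    {
        my @cats;   # shared by the two subs in this inner block only

        sub adopt {
            push @cats, @_;
            return scalar @cats;
        }

        sub roster {
            return @cats;
        }
    }

    # Still in package Cat, but @cats is out of scope here,
    # so this sub has to go through roster() instead.
    sub headcount {
        my @all = roster();
        return scalar @all;
    }
}

Cat::adopt('cat_one', 'cat_two');
my $count = Cat::headcount();
print "$count cats\n";   # prints "2 cats"
```

Referring to @cats directly inside headcount would be a compile error under strict, which is exactly the intent the inner block declares.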
our is a similar beast, but it adds the ability for outsiders to also alter the variable, so long as they do so explicitly. The following code differs only in the use of our:
package Cat {
    our @cats;

    sub one {
        push @cats, @_;
        return @cats;
    }

    sub two {
        @cats = ('default_cat');
        push @cats, @_;
        return @cats;
    }
}

@Cat::cats = ('cat_one', 'cat_two');

Now, the variable @cats inside the package's block can also be accessed as @Cat::cats from outside of it. This is the intent you declare when using our.
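A short sketch of that shared access, again trimmed to a single sub (the variable @all is only there for the demonstration):

```perl
use strict;
use warnings;

package Cat {
    our @cats;   # a lexically scoped name for the package variable @Cat::cats

    sub one {
        push @cats, @_;
        return @cats;
    }
}

# Outsiders can reach the same array, as long as they name the package.
@Cat::cats = ('cat_one', 'cat_two');

my @all = Cat::one('cat_three');
print "@all\n";   # prints "cat_one cat_two cat_three"
```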
[1] Normally, the package would be defined in its own file, but this format is common for single-use packages, especially in tests.
[2] When the package is defined in its own file, the file itself is the scope for such variables.
[3] The reader should be aware that this is the reasoning behind the message Global symbol "$foo" requires explicit package name when strictures tells you off for an undeclared variable. Any variable name can be used, so long as it explicitly declares a package name, as in this example. The difference between a lexical variable and a package variable is not in scope of this blog post.