The Linux and Unix Menagerie: Understanding Perl Variable References On Linux And Unix.

Hey there,

Today we're going to take a look at a part of Perl that a lot of folks shy away from, mostly because (from my experience) they feel it's too abstract a notion or too complicated to understand. For today, I'm referring to Perl references ;) And here's the thing: nothing could be further from the truth. It's just about as simple as the sentence preceding the last. When I referred to Perl references I was, for the most part, laying the foundation for easily understanding the entire concept. If you attack the problem semantically, and try not to think of it as a bunch of backslashes and arrows and symbols, it makes perfect sense :) If I'm wrong, and this post leaves you reeling in confusion and pain, please let me know so I'll stop being so cavalier with my prose :)

So, let's take a look at Perl references and how they can be used, most basically, in a step-by-step fashion; from the simplest of beginnings to the not-so-complex middle (We'll leave references to hashes of arrays of references to other hashes for some other day ;)

For all of these examples, we'll use command line examples, so you can cut and paste them to try them out, rather than pretend that we're inside a Perl script.

1. A simple way to look at Perl References: The basis of any Perl reference is the variable or value that you're referring to. At the most basic level, you can loosely think of any variable assignment as a kind of reference. For instance, look at these basic statements:

host # perl -e '$a = "bob";'
host # perl -e '@a = qw(bob joe);'
host # perl -e '%a = (bob => joe);'

These are all just simple variable assignments (with $ indicating a scalar variable, @ representing an array and % representing a hash). However, you can think of them as references (which will make the transition to understanding textbook references much smoother). The variable $a, for example, has an assigned value of the string "bob". So, if you look at that in a different way, the $a variable refers to the string "bob", or (another way) the variable $a is NOT the string "bob", but a name that refers to that string (or scalar) value. This is only an analogy, of course: in Perl's own terminology, $a here is a plain scalar, not a reference.

BTW, if this part of the post is beyond where you're at with Perl, take a look back at some of our older posts on simple arithmetic and simple variables in Perl that deal with these more basic principles. There should be enough links on those two pages to connect you to all the other ones on this site. If not, the blog search feature (although it's very generous in its interpretations -- search for the letter "a" to see what I mean ;) should help you find what you need.

2. Looking at actual Perl References: A textbook Perl reference is the same thing as we discussed in point 1, except taken up (or out) one meta-level. So instead of the relationship between reference ($a) and referent ("bob") that we had before, we're going to give a scalar variable a reference to one of the three variables from before, rather than going from the variables directly to the values. So, to reference any of these three we could do the following (note that for this basic lesson, the Perl reference will always be a scalar since, at its core, it always is, even if that scalar value is a part of a larger array or hash). The symbol that denotes that you're setting your variable's value to a reference is the backslash (\) character:

host # perl -e '$a_ref1 = \$a;'
host # perl -e '$a_ref2 = \@a;'
host # perl -e '$a_ref3 = \%a;'

So now we have three very simple Perl references. $a_ref1 has the value of a reference to the $a scalar variable, $a_ref2 has the value of a reference to the @a array and $a_ref3 has the value of a reference to the %a hash. (Note that you can have a Perl variable refer to itself, although the uses for this are somewhat limited and generally not necessary for basic Perl scripting. Ex: $a_ref4 = \$a_ref4; <-- $a_ref4 has the value of a reference to itself.)

3. Extracting values from Perl References: This is just as easy as extracting values from regular variables, except, as before, you have to think one more hop. Whereas, with a regular variable, you would extract the value of that variable directly, with a Perl reference, you need to extract the value of the variable that is being referenced by your reference. It sounds worse than it is ;) For instance, if we accept that the scalar variable $a is equal to "bob", we know that we can extract the value of $a by doing the following (as before):

host # perl -e '$a = "bob";print "$a\n";'

Whereas, if we create a reference (another scalar variable) to the variable $a, and call that $a_ref1, we need to extract the value from the variable that we are referencing. A simple and comfortable approach to extracting this value would be the following:

host # perl -e '$a = "bob"; $a_ref1 = \$a; print "${$a_ref1}\n";'

In this instance we've simply peeled the onion, so to speak (insert your favorite peelable vegetable or fruit here ;). In order to extract the value of $a from the Perl reference variable $a_ref1, we just stripped it layer by layer. To deconstruct the print statement above, we'll go backward from the statement we used to print the value of the $a_ref1 Perl reference:

a. ${$a_ref1} is what we call to print the value of the variable $a.

b. ${$a_ref1} is actually equal to ${\$a} since $a_ref1's value is a reference to $a (as denoted by "$a_ref1 = \$a;")

c. ${\$a} is equal to ${a} since we're dealing directly with the referent. $a_ref1 (the variable holding the reference) doesn't contain "bob" itself; printed as a string, it shows the referent's data type followed by a hexadecimal address in memory. You can see the difference in the output of the two commands below:

host # perl -e '$a = "bob"; $a_ref1 = \$a; print "$a_ref1\n";'
SCALAR(0x2e250) <-- This is the type of the referent plus the hexadecimal memory address that the $a_ref1 reference refers to. Your results may vary :)

host # perl -e '$a = "bob"; $a_ref1 = \$a; print "${\$a}\n";'
bob <-- This is the value of the referenced variable $a, which we know (from before) is equal to "bob" ($a = "bob" from what seems like so far up the page ;)

d. And, even though we don't need to tell you this, just for completeness' sake: ${a} (or $a - same thing) equals "bob".

4. Extracting values from Perl References that aren't scalar: Finally, some good news :) The principles above apply to all sorts of variable dereferencing. So, for instance, if you wanted to extract the value of the array reference $a_ref2, you could get it by doing:

host # perl -e '@a = qw(bob joe); $a_ref2 = \@a; print "@{$a_ref2}\n";'
bob joe <-- The whole thing
host # perl -e '@a = qw(bob joe); $a_ref2 = \@a; print "${$a_ref2}[0]\n";'
bob <-- array index 0 (a single element is a scalar, so it gets a $)
host # perl -e '@a = qw(bob joe); $a_ref2 = \@a; print "${$a_ref2}[1]\n";'
joe <-- array index 1

and the same basic principle applies to hashes (%{$a_ref3} would get you all of the hash's keys and values). Basically, all you need to do to extract the value of a one-level-deep Perl Reference is to wrap the reference variable in curly brackets and preface that with the appropriate symbol ($ for scalar, @ for array, % for hash, etc).

5. What to do if you have no idea what kind of Perl Reference you're dealing with: Fortunately, there exists - in the very heart of Perl - a function to deal with just this sort of predicament. It's called, for some strange reason, "ref" ;) Doing something like this:

host # perl -e '@a = qw(bob joe); $ref_type = ref(\@a); print "$ref_type\n";'
ARRAY

is all you need to do to get back the type of reference you're dealing with. (Obviously, we knew it was an array since we're doing these self-contained command line scripts, but you could use the ref function against any Perl Reference and get its type back.) One thing to note about the ref function is that it doesn't always seem to work as expected. For instance, if you call the function ref on a straight-up scalar, array or hash variable, it returns an empty string (not "undefined"). This is normal, since those straight-up variables are "not" references. However, sometimes, even when you believe you are dealing with a reference, you won't get any feedback on your command line. When that happens, the likeliest cause is that what got handed to ref wasn't really a reference (@a instead of \@a, say), rather than ref just not being in the mood to tell you ;)

You can get around this little hassle pretty simply by just writing a simple type-check. So if you run the following:

host # perl -e '%a = (bob => joe); $ref_type = ref(\%a); print "$ref_type\n";'

and you don't get the return of

HASH

as you would expect to, you can figure out what the return from ref was anyway. The two most basic ways to do this range from cowboy to academic ;)

a. Cowboy: Just print the variable that points to the reference, like we did above, to get the hexadecimal address (instead of the value of the referenced variable), since this is accompanied by the reference type:

host # perl -e '%a = (bob => joe); $ref_type = \%a; print "$ref_type\n";'
HASH(0x2e26c)

b. Academic: Use a simple if-condition to test and see what kind of output ref returns

host # perl -e '%a = (bob => joe); $ref_type = ref(\%a); if ($ref_type eq "HASH") {print "HaSH FOUND!\n"};'
HaSH FOUND!

Of course, you could check just to see if the value is empty, since an empty string back from ref means you're not dealing with a reference. 99% of the time, Perl will do the right thing and tell you what the ref function returns. For all I know, the 1% of the time it doesn't work for me is because I completely screwed up ;)

Perl also deals with references to a lot of different file and object types to which these same basic principles apply. So, if you're dealing with a pipe or another type of file or variable, you can still use the principles above to help you out. And, for simplicity's sake (until you get used to being utterly confused while in a state of mostly-understanding ;), for every level of referencing that gets added on, you just need to dereference that many times backward (as shown above) to make your way back to the original value of the original variable(s)!

And that wraps up that :)

Hope that helps shed some light on basic Perl References and, again, I'd love to hear what you think about this post; especially with regards to how you felt about it (Was it too simplistic? Too Complicated? Hard to understand? Easy Peasy? ;)

Cheers,

, Mike