“at least two or three hours per week”

Randal Schwartz, in his interview with Leo Laporte and Chris DiBona on FLOSS #9 (way before Randal was ever the host of the same show), says around the 9:20 mark that “Perl is meant for people who use the language at least two or three hours per week”.

This remark was highlighted by John D. Cook in Three-hour-a-week language. I found an even better thought in the comments. rdm says it’s more about knowing what to look for:

I have enough of a perl vocabulary that I know how to perform relevant searches when I am reaching for a concept. Python? Not so much…

That doesn’t have much to do with the language, really. If you spend a couple of hours each week using a language, reading the docs, and looking for answers, you gain experience and knowledge about the process, which makes it slightly easier the next time. I’m not a great programmer, but I’m a pretty good answer finder. That can make up for a lack of talent.

In my Learning Perl classes, I tell people they aren’t going to learn Perl in a week. I can make them aware of things, but they need to practice. Even though we do exercises in the class, thinking about Perl all day for four days can melt anyone’s brain. Take that three (or more) hours a week for half a year and you’ll probably get passably good.

I got used to Perl by doing it almost every day all day for two years, but then I had to relearn it when Randal trained me to be a Perl trainer. I actually learned more by answering the random questions that people had, either from students in classes or in conversations on Usenet. Now that could be Stack Overflow. You create some common set of problems for yourself, but by reading the problems from many people, you get to learn things from problems you wouldn’t make yourself. That’s where the gold is.


Advice to a new Perl user

A Learning Perl reader asked me for some advice in private email. After I typed it out I felt like posting it for everyone. He graciously let me use his questions and my answers.


1. I wish to use Perl on Windows, is it a good combination (from a career perspective)?

Although Windows can be a pain, and not just because of Perl, there are plenty of people who need to get things done on Windows. With the Win32:: modules, you can hook into the same APIs. I think you can even use Perl from PowerShell.

2. Would the knowledge I’ll gain after self-training for Perl be useful for a long time?

You always have to keep learning. Perl is just a language and you can get almost anything done with it, but the more valuable thing is knowledge about the problem you’re trying to solve.

“Useful” is a harder thing to judge because it mostly depends on what you are doing. Perl can do quite a bit, but I find that “useful” is related to what non-Perl libraries I can access through Perl.

I don’t think you can go wrong at least learning Perl and practicing it for a couple of years. Some of the experience you gain there you can transfer to another language. It’s the same the other way around, too, since the more experience you have as a programmer the easier new languages should be for you.

3. How does one figure out a niche for himself in this hugely spread-out world of Perl?

I find something that’s not getting done and take ownership of it. Something out there is being neglected. Just keep plugging away at something boring and unexciting. Running a local user group is good for that.


“The stat preceding -l _ wasn’t an lstat”

I ran into a fatal error that I haven’t previously encountered and I couldn’t find a good explanation where I expected it. The -l file test operator can only use the virtual _ filehandle if the preceding lookup was an lstat.

The file test operators, all documented under the -X entry in perlfunc, can use the virtual filehandle _, the single underscore, to reuse the results of the previous file lookup. A file test doesn’t look up only the single attribute you ask about. It fetches everything about the file (through stat) and filters that to answer your question. The _ reuses that information to answer the next question instead of looking it up again.
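
As a small illustration of that reuse, here is a sketch that does one lookup with -e and answers the rest of its questions from _. The describe_file name is made up for this example, and I’m assuming a file system where these tests are meaningful:

```perl
use v5.14;
use strict;
use warnings;

# describe_file is a hypothetical helper for this example.
# The first file test does the stat. The tests that use _
# read the saved results instead of asking the system again.
sub describe_file {
  my( $file ) = @_;
  my @facts;

  return @facts unless -e $file;   # this test performs the stat
  push @facts, 'exists';
  push @facts, 'is a plain file' if -f _;
  push @facts, 'is a directory'  if -d _;
  push @facts, 'is readable'     if -r _;

  my $size = -s _;                 # still the same stat buffer
  push @facts, "has size $size" if $size;

  return @facts;
}

say "$0:";
say "\t$_" for describe_file( $0 );
```

None of the tests here is -l, so an ordinary stat is all that _ needs to hold.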

I had a program similar to the following one, where I used some file test operators, including -l to test whether a file is a symbolic link.

use v5.14;

my $filename = join ".", $0, $$, time, 'txt';
my $symname  = $filename =~ s/\.txt/-link.txt/r;

open my $fh, '>', $filename
  or die "Could not open [$filename]: $!";
say $fh 'Just another Perl hacker,';
close $fh;

symlink $filename, $symname 
  or die "Could not symlink [$symname]: $!";

# http://perldoc.perl.org/functions/-X.html
foreach( $filename, $symname ) {
  say;
  say "\texists"           if -e;
  say "\thas size " . -s _ unless -z _;
  say "\tis a link"        if -l _;
  }

I get this fatal error:

The stat preceding -l _ wasn't an lstat at test_link_test.pl line 19

The entry in perlfunc doesn’t say anything about this, but it hints that -l is a bit special:

If any of the file tests (or either the stat or lstat operator) is given the special filehandle consisting of a solitary underline, then the stat structure of the previous file test (or stat operator) is used, saving a system call. (This doesn’t work with -t , and you need to remember that lstat() and -l leave values in the stat structure for the symbolic link, not the real file.) (Also, if the stat buffer was filled by an lstat call, -T and -B will reset it with the results of stat _ ).

Adding the diagnostics pragma has the answer that isn’t in perlfunc:

The stat preceding -l _ wasn't an lstat at test_link_test.pl line 19 (#1)
    (F) It makes no sense to test the current stat buffer for symbolic
    linkhood if the last stat that wrote to the stat buffer already went
    past the symlink to get to the real file.  Use an actual filename
    instead.

The other file test operators will perform a stat. If the file is a symlink, the stat follows the symlink to get the information from its target. A symlink to a symlink will even keep going until it ultimately gets to a non-symlink. With a stat, the -l _ will never be true because it always ends up at the target, even if it doesn’t exist.

The lstat doesn’t follow the link, so it can answer the -l _ question: it returns the information for the link itself when there is one, and for anything that isn’t a link it works just like stat.

As the long version of the error message says, it’s probably better to never use the _ filehandle and use the full filename instead. Sure, it has to redo the work, but you won’t be surprised by a fatal error if you did the wrong type of lookup before.
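
If you do want to keep the _ filehandle, one way around the error is to do the lstat yourself before any of the tests. This is only a sketch, assuming a system that supports symlinks, and the link_facts name is invented for this example:

```perl
use v5.14;
use strict;
use warnings;

# link_facts is a hypothetical helper for this example.
# The explicit lstat fills the stat buffer without following
# a symlink, so -l _ is allowed afterward.
sub link_facts {
  my( $file ) = @_;
  my @facts;

  return @facts unless lstat $file;   # false if the lookup fails
  push @facts, 'exists';              # the entry itself, not its target
  push @facts, 'is a link' if -l _;

  # _ describes the link here, so this is the link's own size
  my $size = -s _;
  push @facts, "has size $size" if $size;

  return @facts;
}

foreach my $file ( @ARGV ) {
  say $file;
  say "\t$_" for link_facts( $file );
}
```

The tradeoff is that every answer from _ is now about the link rather than the file it points to, which may not be what the other tests want.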


Learning Perl Challenge: Remove intermediate directories

I often run into situations where I have directories that contain only one file, a subdirectory, which contains only one file, a subdirectory, and so on for a long chain, until I get to the interesting files. These situations come up when I have only part of a data set so the files that would be in other directories aren’t there, and I find it annoying to deal with these long directory specifications. So, this challenge is to fix that by collapsing those one-entry directories into a single one.

For example, you should take this structure, where you have A/B/C/D/E in a direct line with no other branches:

and turn it into this one, with a single directory with the files that were at the end:

However, you should only move files up if the directory above them has only one entry (which must be a subdirectory!). In this example, A/B/C has two subdirectories in it:

so the files in E should only move up into D. Otherwise, the files from the two branches in C would get mixed up with each other.
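
To make that rule concrete without giving away the whole challenge, here is a sketch of only the test for "exactly one entry, and it is a subdirectory". It doesn’t move anything. The sole_subdirectory name is made up for this example, and skipping symlinks to directories is my own assumption about what "subdirectory" should mean:

```perl
use v5.14;
use strict;
use warnings;
use File::Spec::Functions qw(catdir no_upwards);

# sole_subdirectory is a hypothetical helper for this example.
# It returns the path of the only entry in $dir when that entry
# is a real subdirectory, and nothing otherwise.
sub sole_subdirectory {
  my( $dir ) = @_;

  opendir my $dh, $dir or return;
  my @entries = no_upwards( readdir $dh );   # drop . and ..
  closedir $dh;

  return unless @entries == 1;

  my $path = catdir( $dir, $entries[0] );
  return if -l $path;        # assumption: a symlink doesn't count
  return unless -d $path;

  return $path;
}

# Report the chain of one-entry directories under a starting point
my $dir = @ARGV ? $ARGV[0] : '.';
while( defined( my $next = sole_subdirectory( $dir ) ) ) {
  say "$dir contains only $next";
  $dir = $next;
}
```

A full answer still has to decide where the chain stops and how to move the files, which is the interesting part.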


Why Perl’s conditional operator is right associative

What happens if you change the associativity of the conditional operator? PHP implemented it incorrectly and now it’s part of the language. In What does this PHP print?, Ovid posted a bit of PHP code that gives him unexpected results. The code comes from a much longer rant by Alex Munroe titled PHP: a fractal of bad design:


<?php // test.php
  $arg = 'T';
  $vehicle = ( ( $arg == 'B' ) ? 'bus' :
               ( $arg == 'A' ) ? 'airplane' :
               ( $arg == 'T' ) ? 'train' :
               ( $arg == 'C' ) ? 'car' :
               ( $arg == 'H' ) ? 'horse' :
               'feet' );
  echo $vehicle;
?>

The result is 'horse', and it will be for any value of $arg that matches one of the five tests.

% php test.php
horse

I don’t care so much about the rant, but it told me the answer to this problem. The conditional operator is left associative in PHP, as documented in Operator Precedence. That almost made sense to me, and I know that putting parentheses around these things makes it more clear. I’m almost embarrassed to say that I couldn’t do it right off in this case. Where do I put them? With other operators it’s easy because the operator characters are next to each other. I started writing this to figure out the grouping when the operator characters are separated by other things.

Let’s simplify that a bit so we don’t have a big mess. Now there are only two:

<?php // simple.php
  $arg = 'C';
  $vehicle = (
               ( $arg == 'C' ) ? 'car' :
               ( $arg == 'H' ) ? 'horse' : 'feet'
             );
  echo "$vehicle\n";
?>

The result is still 'horse' because we haven’t really changed anything:

% php simple.php
horse

Joel Berger gave a hint when he said that changing 'car' to '' yields 'feet':

<?php // null.php
  $arg = 'C';
  $vehicle = (
               ( $arg == 'C' ) ? '' :
               ( $arg == 'H' ) ? 'horse' : 'feet'
             );
  echo "$vehicle\n";
?>

And it does yield 'feet':

% php null.php
feet

In Perl, the language I do know, the same operator is right associative (Why is the conditional operator right associative? on Stack Overflow explains why). Associativity, documented in perlop, comes into play when the compiler has to figure out which operator to do first when the same operator appears more than once in a row. In Learning Perl, we show this with the exponentiation operator since many other operators, such as multiplication and addition, don’t really care. Exponentiation is right associative because that’s what Larry decided it was (C doesn’t have this operator). That means it does the operation on the right before it does the operation on the left. You can see this when you use parentheses, the highest precedence operator, to denote the order you want and compare it to the version without the explicit grouping:

my $implicit = 4**3**2;    # 262144
my $right    = 4**(3**2);  # 262144
my $left     = (4**3)**2;  # 4096

We can do the same for the conditional operator in Perl. First, we translate the PHP code to Perl, which is mostly changing == to eq:

# perl.pl
use v5.10;

my $arg = 'C';
my $vehicle = (
               ( $arg eq 'C' ) ? 'car' :
               ( $arg eq 'H' ) ? 'horse' : 'feet'
             );
say $vehicle;

% perl perl.pl
car

In Perl, we get the same behavior if we put parentheses around the second conditional:

# right.pl
use v5.10;

my $arg = 'C';
my $vehicle = (
               ( $arg eq 'C' ) ? 'car' :
               ( ( $arg eq 'H' ) ? 'horse' : 'feet' )
             );
say $vehicle;

We get the same result as perl.pl because we haven’t changed the order of anything:

% perl right.pl
car

To get the PHP behaviour, we have to change the parentheses like this, to surround everything up to the next ?. It took quite a mental leap for me to get this far because it’s so unnatural:

# left.pl
use v5.10;

my $arg = 'C';
my $vehicle = (
               ( ( $arg eq 'C' ) ? 'car' : ( $arg eq 'H' ) )
                 ? 'horse' : 'feet'
             );
say $vehicle;

Now we get different behaviour:

% perl left.pl
horse

That’s really odd, but it’s also a small gotcha we mention in the Learning Perl class. You can have things such as ( $arg eq 'H' ) as a branch. That probably isn’t useful, but it’s a consequence of the syntax. We can do assignments, for instance:

my $result = $value ? ( $n = 5 ) : ( $m = 6 );

It’s easier to see this as a picture for the path through the conditionals. The right associative version branches either to an endpoint or another decision and there’s only one way to get to that endpoint.

Right associative, as in Perl

The left associative version has multiple ways to get to the same endpoint because either branch in the previous conditional can be the value for the next test. This also shows how 'car' isn’t the endpoint that you think it should be:

Left associative, as in PHP

Going back to do the same thing with the original chain of conditionals, we get this diagram that looks more like a corset lacing instruction than something we meant to program.

The full monty

However, we already know the answers in this particular case because some values are literals, so we can remove several paths. Now it’s much more clear that many paths are feeding into a path that must end up at 'horse'.

The full monty

In fact, the only way to get to 'feet' is to be any letter that is not B, A, T, C, or H. Joel figured this out by changing 'car' to the empty string, which has this diagram:

Joel’s change

The only way to get to 'horse' is to be exactly H. The other letters must end up at 'feet' because they all end up at the empty string. Every other string ends up at 'feet' because they are not exactly H.
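
To check the picture for the original chain of five tests, we can force PHP’s grouping in Perl with explicit parentheses and compare it to the same chain as Perl parses it. This is only a sketch of the grouping, not a PHP emulator, and the php_style and perl_style names are made up for this example:

```perl
use v5.10;
use strict;
use warnings;

# php_style is a hypothetical name for this example. The
# parentheses force the left-associative grouping, so each
# finished conditional becomes the test for the next one.
sub php_style {
  my( $arg ) = @_;

  my $vehicle = (
    ( ( ( ( ( $arg eq 'B' ) ? 'bus'
          : ( $arg eq 'A' ) ) ? 'airplane'
          : ( $arg eq 'T' ) ) ? 'train'
          : ( $arg eq 'C' ) ) ? 'car'
          : ( $arg eq 'H' ) ) ? 'horse'
          : 'feet'
  );

  return $vehicle;
}

# The same chain with no extra grouping, as Perl parses it
sub perl_style {
  my( $arg ) = @_;

  my $vehicle = ( $arg eq 'B' ) ? 'bus'
              : ( $arg eq 'A' ) ? 'airplane'
              : ( $arg eq 'T' ) ? 'train'
              : ( $arg eq 'C' ) ? 'car'
              : ( $arg eq 'H' ) ? 'horse'
              :                   'feet';

  return $vehicle;
}

foreach my $arg ( qw(B A T C H X) ) {
  printf "%s: left-grouped %-5s  right-grouped %s\n",
    $arg, php_style( $arg ), perl_style( $arg );
}
```

The left-grouped version gives 'horse' for B, A, T, C, and H, and 'feet' for X, while the right-grouped version gives the vehicle that matches each letter.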

Maybe the complicated stuff makes sense to PHP programmers. I don’t know. It’s more likely that they don’t do these sorts of things, at least if they’ve read the advice in the PHP manual. Some people blame Perl since PHP inherited from Perl, but it seems like a yacc error that they can’t fix for backward compatibility. It’s not like that’s never happened to Perl.
