Updates to Chapter 13, “Directory Operations”

[This post notes differences between the fifth and sixth editions.]

Our updates to Chapter 13 aren’t that exciting. There’s not much that has changed in the world of Perl and directories. It’s almost dull, even.

  • Use variables as directory handles: opendir my $dir, $directory.
  • Mention a couple more modules incidental to some of the examples, including File::Spec::Functions, Path::Class, and File::Temp.
  • Show a find2perl example. We mentioned File::Find only to say that we weren’t going to say anything about it. Also mention the improved interfaces of File::Finder and File::Find::Rule.
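
Here is a minimal sketch of the first two items, not the book's own example. It opens a directory with a lexical handle and builds paths with File::Spec::Functions. The scratch directory from File::Temp and the file names alpha.txt and beta.txt are assumptions made up for illustration.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(catfile);
use File::Temp qw(tempdir);

# Hypothetical setup: a scratch directory holding two empty files.
# File::Temp removes it when the program exits.
my $directory = tempdir( CLEANUP => 1 );
for my $name (qw(alpha.txt beta.txt)) {
    open my $fh, '>', catfile( $directory, $name )
        or die "Cannot create $name: $!";
    close $fh;
}

# A lexical variable as the directory handle, instead of a bareword like DIR.
opendir my $dir, $directory or die "Cannot open $directory: $!";

# readdir also returns the . and .. entries, so filter them out.
my @names = sort grep { !/\A\.\.?\z/ } readdir $dir;
closedir $dir;

print "$_\n" for @names;
```

The lexical handle goes out of scope like any other variable, so it can't collide with a handle of the same name elsewhere in the program.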

Updates to Chapter 7, “In the World of Regular Expressions”

[This post notes differences between the fifth and sixth editions.]

I just committed the new Chapter 7, “In the World of Regular Expressions”. It was quite an education, even for me, because the character class stuff has changed so much since Perl 5.6, and, since Learning Perl had been ignoring Unicode, we didn’t face the hard problems.

  • The \w character class is almost dangerous now. By default, it represents over 100,000 characters that can match at that position. The \d and \s character classes have the same problem on a smaller scale. It’s unlikely that anyone actually wants these shortcuts anymore, but they are still in older programs. I did cover this over at The Effective Perler, too.
  • Since we’re covering Unicode, this is the right chapter to cover the Unicode properties, such as \p{Space}. Those don’t completely solve the character class shortcut problem because they still match many characters. The perluniprops documentation lists how many characters match each property, which is kinda cool.
  • Perl 5.13.9 includes Karl Williamson’s work to add the /a adverb to enforce ASCII semantics, so we use that too even though we don’t really get into options until the next chapter.
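
To make the points above concrete, here is a small sketch, assuming Perl 5.14 or later so that /a is available. The two test characters, ARABIC-INDIC DIGIT THREE and EM SPACE, are my own picks for illustration and do not come from the book.

```perl
use v5.14;      # /a needs 5.14; this also turns on strict and say
use warnings;

my $arabic_three = "\x{0663}";    # ARABIC-INDIC DIGIT THREE
my $em_space     = "\x{2003}";    # EM SPACE

# With Unicode semantics, \d matches any decimal digit, not just 0-9.
my $unicode_digit = $arabic_three =~ /\A\d\z/ ? 1 : 0;

# With /a, \d is restricted to the ASCII digits 0-9.
my $ascii_digit = $arabic_three =~ /\A\d\z/a ? 1 : 0;
my $ascii_three = '3' =~ /\A\d\z/a ? 1 : 0;

# The Space property matches Unicode whitespace; \s under /a does not.
my $space_prop  = $em_space =~ /\A\p{Space}\z/ ? 1 : 0;
my $ascii_space = $em_space =~ /\A\s\z/a ? 1 : 0;

say "\\d without /a matches U+0663: $unicode_digit";    # 1
say "\\d with /a matches U+0663:    $ascii_digit";      # 0
say "\\d with /a matches '3':       $ascii_three";      # 1
say "\\p{Space} matches U+2003:     $space_prop";       # 1
say "\\s with /a matches U+2003:    $ascii_space";      # 0
```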

This is all rather painful to update because I didn’t want to go through everything assuming ASCII semantics (so, very few changes) then tack on an “if you are using Unicode” section that then invalidates everything. We just have to bite the bullet and make the switch to thinking of Unicode as the default and ASCII as the backward-compatibility special case.

Updates to Chapter 12, “File Test Operators”

[This post notes differences between the fifth and sixth editions.]

This chapter probably doesn’t deserve an update here because almost nothing changed. Most of the updates just make all the code examples consistent. When I added the Perl 5.10 updates for the stacked file test operators, I used a style that wasn’t quite my own, but not quite the one Tom and Randal had already used in the book. It’s more jarring in this chapter than in Chapter 15 (“Smart matching”), a completely new chapter in the fifth edition, because you can see two different styles on the same page. I’ve updated Chapter 15 too.
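
For readers who haven't met the stacked file test operators, here is a minimal sketch, assuming Perl 5.10 or later. It is not taken from the book; the temporary file is a stand-in created only so the tests have something to examine.

```perl
use v5.10;
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical setup: a temporary file, unlinked when the program exits.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
print {$fh} "some data\n";
close $fh;

# Stacked form (Perl 5.10+). The test nearest the filename runs first:
# the file exists, is a plain file, and is readable.
my $stacked = ( -r -f -e $filename ) ? 1 : 0;

# The older spelling reuses the previous stat through the special
# _ filehandle.
my $chained = ( -e $filename && -f _ && -r _ ) ? 1 : 0;

say "stacked: $stacked, chained: $chained";
```

Both forms ask the same question of the same file; the stacked version just saves the repeated `&&` and `_`.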

There is one area where I can use some feedback though. We say:

Don’t worry if you don’t know what some of the other file tests mean—if you’ve never heard of them, you won’t be needing them. But if you’re curious, get a good book about programming for Unix.

However, we don’t give any suggestions for what a good book might be. What would you choose?

“captures” versus “memories”, “group” versus “buffer”

The term “memories” to label the side effects of parentheses has fallen out of favor. The new hotness is “capture group”, although that has sometimes shown up as “capture buffer” in the documentation. Karl Williamson, however, purged the docs of “capture buffer”, so you shouldn’t see that anywhere in Perl 5.14’s docs. This mostly affects Chapter 8, where we introduce the match variables, even though we have grouping and backreferences in Chapter 7.

I’m not so sure I like “groups” everywhere though. I think that’s the right term to apply to the particular parentheses that triggered the capture, but not necessarily the thing actually captured. It’s the difference between asking which team is in the Super Bowl and who’s on the Super Bowl team.

I don’t really care that much, though, because there’s one overriding concern: we need to use the same terms that are in the documentation so people have the right search terms.
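
The distinction is easier to see in code. This is only an illustrative sketch, assuming Perl 5.10 or later for the named group; the sample string and the name `who` are my own choices. The parentheses in the pattern are the capture groups, and the match variables hold what each group captured.

```perl
use v5.10;
use strict;
use warnings;

my $text = 'Hello there, neighbor';

my ( $first, $second, $who );

# Three capture groups: two numbered, one named "who" (hypothetical name).
if ( $text =~ /(\S+) (\S+), (?<who>\S+)/ ) {
    # $1 and $2 hold what groups 1 and 2 captured.
    # %+ holds what the named groups captured.
    ( $first, $second, $who ) = ( $1, $2, $+{who} );
}

say "group 1 captured: $first";      # Hello
say "group 2 captured: $second";     # there
say "group 'who' captured: $who";    # neighbor
```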

Perl 6 has a thing called captures, but that’s a completely different beast.