Updates to Chapter 5, “Input/Output”

[This post notes differences between the fifth and sixth editions.]

With the added emphasis on Unicode, we had to update the chapter on input and output a bit. If we are going to talk about Unicode, we need to talk about encodings, which expands the material on the three-argument open and brings in binmode too.
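As a minimal sketch of what that looks like (not the book's own example), here is a three-argument open with an encoding layer, plus binmode on a handle that is already open. The scratch file name and the sample string are my own assumptions for illustration.

```perl
use strict;
use warnings;

my $file = 'unicode_demo.txt';    # hypothetical scratch file name

# Three-argument open: the mode and encoding layer are kept apart from the filename.
open OUT, '>:encoding(UTF-8)', $file
    or die "Can't write to $file: $!";
print OUT "caf\x{E9}\n";          # four characters plus a newline
close OUT;

# Read the line back, decoding the UTF-8 bytes into characters.
open IN, '<:encoding(UTF-8)', $file
    or die "Can't read $file: $!";
chomp( my $line = <IN> );
close IN;
unlink $file;

# binmode sets a layer on a filehandle that is already open, such as STDOUT.
binmode STDOUT, ':encoding(UTF-8)';
print 'Read ', length($line), " characters: $line\n";
```

The character with code point E9 takes two bytes in UTF-8, but after decoding, length counts it as one character.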

For this update, we also introduced filehandle references, although we did not really call them references. We still present most of the chapter using bareword filehandles, and once we have covered everything we show how you can do the same things with filehandles in variables. However, we still save the meat of filehandle references for Intermediate Perl.
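To make the comparison concrete, here is a small sketch that does the same kind of work both ways. The file name and the handle names (LOG, $log_fh, $in_fh) are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

my $file = 'filehandle_demo.txt';    # hypothetical scratch file name

# Bareword filehandle: the style used through most of the chapter.
open LOG, '>', $file or die "Can't write to $file: $!";
print LOG "written through a bareword filehandle\n";
close LOG;

# Filehandle in a scalar variable: same operations, but the handle lives in $log_fh.
open my $log_fh, '>>', $file or die "Can't append to $file: $!";
print {$log_fh} "written through a filehandle in a variable\n";
close $log_fh;

# Reading works the same way with the line-input operator.
open my $in_fh, '<', $file or die "Can't read $file: $!";
my $lines = 0;
while ( my $line = <$in_fh> ) {
    $lines++;
}
close $in_fh;
unlink $file;

print "The file had $lines lines\n";
```

The braces around $log_fh in the print statement are optional here, but they make it obvious which part is the filehandle and which part is the list to print.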

Some people think everyone should be using lexical filehandles all of the time, but even if you want to do that for new code, you still have to understand what people did in old code, so we have to cover the legacy syntax (and some people would even object to calling it “legacy”).

A new Unicode appendix

I’ve added a new appendix to Learning Perl to handle all of the Unicode stuff I was having difficulty integrating into the other chapters.

Our goal has always been to present just the information you need without getting into distracting details. The problem with Unicode is that there are a lot of distracting details. Not only that, you have to learn some things in tandem. We can’t talk about Unicode strings without introducing strings, but at the same time, we want to start with Unicode as the basis for strings.

I wanted to have a lot of that stuff in the Strings chapter, but a lot of the Perl Unicode stuff lives in modules. We do talk about modules later in the book, but I want to use some of them earlier.

Any beginning book is going to have this problem. You need to ignore some stuff to at least get started. As such, I gave up on trying to cram all the Unicode stuff into the chapters and put most of it into a new appendix. This also means that if people want to ignore some of the Unicode stuff, which I don’t recommend, they can. But, they shouldn’t. So, read the whole book, even the appendices!

Updates to Chapter 9, “Processing Text with Regular Expressions”

[This post notes differences between the fifth and sixth editions.]

I didn’t have to make many changes to this chapter. I wanted to put in at least one Perl 5.14 feature, but the only new thing that the substitution operator gets is the /r modifier.
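For readers who have not met /r yet, here is a short sketch of the difference it makes. The sample string and variable names are mine, not the book's.

```perl
use 5.014;       # the /r modifier needs Perl 5.14 or later
use warnings;

my $original = 'Hello, world';

# Without /r, s/// changes its target in place, so you copy first and then modify.
( my $copy = $original ) =~ s/world/Perl/;

# With /r, s/// leaves its target alone and returns the modified string.
my $changed = $original =~ s/world/Perl/r;

say $original;   # Hello, world
say $copy;       # Hello, Perl
say $changed;    # Hello, Perl
```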

While working through this chapter, though, I started to wonder if the terms we used in the previous editions were the same as those in the Perl documentation. We called the modifiers “option modifiers”, and sometimes “flags”, but perlre just says “modifier”. Personally, I’m used to saying “flag” all the time and I like that term just fine, but for regular people, there’s nothing to connect the everyday use of “flag” to the thing after the match operators. So, “modifier” it is. I’d much rather use “adverb”, which is popular in Perl 6 land, but it’s a bit late for Perl 5 to change terms. When I made the switch in this chapter, I had to go back to Chapters 7 and 8 and do the same thing.

This chapter is also curious in that it ends with a long example that builds up to a perl one-liner. One of the things the reviewers suggested for a new edition was a new chapter devoted to one-liners. That’s still possible, I guess.

Updates to Chapter 8, “Matching with Regular Expressions”

[This post notes differences between the fifth and sixth editions.]

There are a couple of interesting updates for Chapter 8. The small change is the slight modification of a footnote. We mentioned that the performance problem of the match variables $& and friends wouldn’t be solved before Perl 6. However, with Perl 5.10’s introduction of the /p match operator flag, problem solved!
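Here is a minimal sketch of the /p approach, using the ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} variables that go with it. The sample text and pattern are my own.

```perl
use 5.010;       # /p and its variables arrived in Perl 5.10
use warnings;

my $text = 'Hello there, neighbor';
my ( $before, $matched, $after );

# With /p, Perl saves the match pieces for this one match only,
# instead of paying for $`, $&, and $' across the whole program.
if ( $text =~ /\s\w+,/p ) {
    ( $before, $matched, $after ) = ( ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} );
}

say "Before:  [$before]";    # [Hello]
say "Matched: [$matched]";   # [ there,]
say "After:   [$after]";     # [ neighbor]
```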

Chapter 8 also has a subtle shift in thinking about anchors. Perl 5 introduced the \A, \Z, and \z regular expression anchors. Somehow, we never made the shift from the Perl 4 anchors ^ and $. Even after Perl Best Practices pointed out the problem, we failed to update the Llama.
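To show why the distinction matters, here is a small sketch comparing the anchors on strings that contain newlines. The strings and variable names are assumptions made up for this illustration.

```perl
use strict;
use warnings;

my $input = "fred\n";    # a trailing newline, as if read from a file and not chomped

# $ matches at the end of the string or just before a newline at the end.
my $dollar  = $input =~ /^fred$/   ? 'matches' : 'does not match';

# \Z behaves like $ does without /m: a trailing newline is allowed.
my $big_z   = $input =~ /\Afred\Z/ ? 'matches' : 'does not match';

# \z matches only at the absolute end of the string.
my $small_z = $input =~ /\Afred\z/ ? 'matches' : 'does not match';

# Under /m, ^ and $ move to line boundaries, but \A still means start of string.
my $multi   = "barney\nfred\n";
my $caret_m = $multi =~ /^fred$/m   ? 'matches' : 'does not match';
my $big_a_m = $multi =~ /\Afred$/m  ? 'matches' : 'does not match';

print "^fred\$ on fred\\n:       $dollar\n";     # matches
print "\\Afred\\Z on fred\\n:    $big_z\n";      # matches
print "\\Afred\\z on fred\\n:    $small_z\n";    # does not match
print "^fred\$ with /m:          $caret_m\n";    # matches
print "\\Afred\$ with /m:        $big_a_m\n";    # does not match
```

The point is that \A and \z keep the same meaning no matter which modifiers are in effect, while ^ and $ change with /m.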

I’d never really bothered to check when Perl introduced \A until today. That’s a task I do quite frequently: when did some feature show up in Perl? I could just go through all the tarballs, unpack them, and look at the documentation, but there’s an easier way. Since I have a clone of the perl repository, I have access to the entire perl development history. Each release has a tag, and I can list all the tags:

$ git tag
perl-1.0
perl-1.0.15
perl-1.0.16
perl-2.0
perl-2.001
perl-3.000
perl-3.044
perl-4.0.00
perl-4.0.36
perl-5.000
perl-5.000o
perl-5.001
perl-5.001n
perl-5.002
perl-5.002_01
perl-5.003
...

If I want to see what was going on in a particular release, I check out the appropriate tag:

$ git checkout perl-5.000

Now I can see the state of the repo at the point of that release. Sure enough, \A, \Z, and \z are in the documentation back then.

Updates to Chapter 3, “Lists and Arrays”

[This post notes differences between the fifth and sixth editions.]

I went into this chapter thinking that it would be fairly easy: just fix up any possible typos or grammar problems, then move on. However, I was reading through Appendix B and noticed that in previous editions we had ignored splice. We mention it all the way at the end of the book, but it takes almost as much space to say that we aren’t going to cover it as it would to actually cover it. So, I moved it out of Appendix B and into Chapter 3.
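For anyone who has not used splice, here is a short sketch of its main forms. The array contents are my own, not the chapter's example.

```perl
use strict;
use warnings;

my @array = qw( a b c d e );

# Two arguments: remove everything from index 3 to the end.
my @tail = splice @array, 3;           # @tail is (d e), @array is (a b c)

# Three arguments: remove one element starting at index 1.
my @middle = splice @array, 1, 1;      # @middle is (b), @array is (a c)

# Length of zero plus a list: insert without removing anything.
splice @array, 1, 0, qw( x y );        # @array is (a x y c)

# Length plus a list: replace the removed elements with new ones.
splice @array, 0, 1, 'z';              # @array is (z x y c)

print "@array\n";                      # z x y c
```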

You would think that this chapter would be a natural place to pull in things like List::Util, but we actually save that for later. We make some pure-Perl versions of max in the “Subroutines” chapter, then later in the “Perl Modules” chapter we can abandon the examples we used to illustrate the Perl concepts so the reader can use List::Util.
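As a rough sketch of that progression (not the book's exact code), here is a hand-written max next to the one from List::Util. The name my_max is hypothetical, picked so it does not clash with the imported max.

```perl
use strict;
use warnings;
use List::Util qw( max );

# A pure-Perl version, in the spirit of a first subroutine exercise.
sub my_max {
    my ( $max_so_far, @rest ) = @_;
    for my $number (@rest) {
        $max_so_far = $number if $number > $max_so_far;
    }
    return $max_so_far;
}

my @numbers = ( 3, 5, 10, 4, 6 );

my $by_hand   = my_max(@numbers);   # 10
my $by_module = max(@numbers);      # 10

print "By hand: $by_hand, with List::Util: $by_module\n";
```

Writing it by hand teaches argument lists and loops. Once modules are on the table, the List::Util version is the one to reach for.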