Sound out complex statements

Code reading is more important than code writing, and people’s lack of that skill is what often gets them in trouble. Six months after writing some code, you might not know why you coded something but you should know what you coded.

Consider this tidbit from Dan Lyke that we posted on Twitter:

$var ^= 0x2 if ($var+0 & 2);

Taken as a unit, that’s a bit much to digest. However, like sounding out words as a kindergartener, you should be able to figure this out if you know the language.

Instead of reading left to right, pick out the important structural bits of the statement.

First, there’s the postfix if. You know that the left side has an operation and the right side has a condition. Replacing the particular messiness, I bet most Perlers instantly grasp this structure:

OPERATION if CONDITION

Perl will do OPERATION when CONDITION evaluates to true. Easy peasy.
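If the postfix form is new to you, this small sketch writes the same guarded operation both ways. The variable names and the value of $ready are made up for illustration:

```perl
use strict;
use warnings;

my $count_postfix = 0;
my $count_block   = 0;
my $ready         = 1;   # hypothetical condition

# Postfix form: OPERATION if CONDITION
$count_postfix++ if $ready;

# Equivalent block form: if (CONDITION) { OPERATION }
if ($ready) {
    $count_block++;
}

print "postfix: $count_postfix, block: $count_block\n";
```

Both counters end up with the same value because the two forms do the same thing.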

From there, you descend into the statement to understand its parts. The condition is a little tricky because it has an uncommon (for Perl) operator &, the bitwise AND.

($var+0 & 2);

The structure is simple:

VALUE OPERATOR VALUE

The two values are:

$var + 0
2

I can understand how people get hung up on this one because the $var + 0 is also uncommon. But, reading the documentation for &, you’ll see that it can work on either strings or numbers depending on its operands. That’s a bit unusual for Perl, where the operator normally sets the context. But, it is what it is. To get around that, the author ensured that & chose the numeric operation by adding 0 to $var so that the left-hand value was a number. There’s still the chance of non-numeric data in $var, but the & will still be numeric.
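Here is a minimal sketch of that difference, assuming the bitwise feature is not enabled. The values are arbitrary; they were picked only because the string and numeric answers differ:

```perl
use strict;
use warnings;

# Both operands are strings, so & works character by character:
# '1' (0x31) & '3' (0x33) is '1', and the result is as long as
# the shorter operand.
my $string_and = "12" & "3";

# Adding 0 makes the left operand a number, so & works on integers:
# 12 is 0b1100 and 3 is 0b0011, which share no bits.
my $numeric_and = ("12" + 0) & 3;

print "string:  '$string_and'\n";   # string:  '1'
print "numeric: $numeric_and\n";    # numeric: 0
```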

I write about the bitwise operators in Mastering Perl, but we cover them very briefly in the chapter on “Scalars” in Learning Perl too. In short, ($var+0 & 2) is true if $var has the bit representing 2 set. So, numbers such as 2, 6, 7, 10, and so on yield true because their sums in powers of 2 need a 2 (2, 2 + 4, 1 + 2 + 4, and 2 + 8).
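You can check that claim yourself. This sketch, using a range I chose arbitrarily, collects the numbers from 0 to 10 that pass the condition and prints each one next to its binary form:

```perl
use strict;
use warnings;

# Keep only the numbers whose 2 bit is set
my @has_two_bit = grep { ($_ + 0) & 2 } 0 .. 10;

# Show each survivor in decimal and in four binary digits
printf "%2d = %04b\n", $_, $_ for @has_two_bit;
```

Every number it prints has a 1 in the second position from the right.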

The left side looks a bit tricky as well because it also uses a bitwise operator, ^, the exclusive OR. It’s part of a binary assignment:

$var ^= 0x2

That’s the same as the longer form that uses the variable name twice:

$var = $var ^ 0x2

The exclusive OR returns a value with the bits set in the positions where only one of its operands had bits set. On its own, XORing with 0x2 flips the bit representing 2 and leaves every other bit of $var alone. Since the condition lets the operation run only when that bit is already set, the net effect is to unset it.
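A short sketch with made-up values shows both halves of that: what ^ does without the guard, and what the whole statement does to a value that doesn’t have the bit set:

```perl
use strict;
use warnings;

my $set   = 6;   # 0b110, the 2 bit is set
my $unset = 4;   # 0b100, the 2 bit is not set

# Without a guard, ^ flips the bit either way
my $set_after   = $set   ^ 0x2;   # 0b100, the bit is turned off
my $unset_after = $unset ^ 0x2;   # 0b110, the bit is turned on

# With the guard, a value without the bit is left alone
my $guarded = 4;
$guarded ^= 0x2 if ($guarded + 0 & 2);

print "$set_after $unset_after $guarded\n";   # 4 6 4
```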

And that’s it. Although you may not understand why the author wrote the code they did, by breaking it down you know what it does. The better you get at this process, the quicker it happens. Soon you’ll read it as complete statements, just like how you read these words here by recognizing them as whole words instead of individual letters.

But, that you can read the statement doesn’t mean it can’t be simpler. For some reason a particular bit has to be unset if it is already set. That’s the task. Now let’s see if I can improve the statement.

Perl v5.22 adds an experimental feature to make bitwise operations force numeric context. This is almost always what people want from those operators:

use feature qw(bitwise);
no warnings qw(experimental::bitwise);

$var ^= 0x2 if $var & 2;

The hexadecimal notation is a bit much since it’s the same value with the same digits in decimal:

use feature qw(bitwise);
no warnings qw(experimental::bitwise);

$var ^= 2 if $var & 2;

Unsetting the bit representing 2 is the same as subtracting 2 if the bit is already set. Instead of ^, you can subtract 2:

use feature qw(bitwise);
no warnings qw(experimental::bitwise);

$var -= 2 if $var & 2;

That works for this situation, but if you wanted to set a more complicated bit pattern, you might be back to the bit operators.
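For example, clearing several bits at once is easier with a mask than with subtraction. This is only a sketch: the constant name and the bits in it are hypothetical, not something from the original code:

```perl
use strict;
use warnings;

use constant LEGACY_FLAGS => 0b1010;   # hypothetical mask: the 2 and 8 bits

my $var = 0b1111;                       # 15, all four low bits set

# Subtraction would need a separate check for each bit.
# AND with the complemented mask clears them all at once
# and needs no condition.
$var &= ~LEGACY_FLAGS;

printf "%04b\n", $var;                  # 0101
```

The same statement is safe to run on a value that has none of those bits set, since clearing a bit that is already clear changes nothing.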

Finally, a comment is in order. Maybe there was one in the original code but it couldn’t fit in Twitter:

use feature qw(bitwise);
no warnings qw(experimental::bitwise);

# ensure the 2 bit is unset because that tells the foobar that
# quux is okay. It's stored oddly in the database because of 
# that stupid legacy thing we all hate but no one has the guts
# to fix. Entered as issue #137. And #449. And #1023.
$var -= 2 if $var & 2;

If you’re going to change the code, you need to test the change to ensure the new code does the same thing as the old code (even if the old code was wrong):

use v5.22;
use feature qw(bitwise signatures);
no warnings qw(experimental::bitwise experimental::signatures);

use Test::More;

my @values = ( 0 .. 100 );

foreach my $value ( @values ) {
	is( subtract($value), bitwise($value), "Values match for $value" );
	}

done_testing();

sub subtract ($var) { $var -=   2 if  $var & 2;  $var }
sub bitwise ($var)  { $var ^= 0x2 if ($var & 2); $var }

(See “Use v5.20 subroutine signatures” for more on that new feature.)

The syntax for each of the subroutines is the same even if I used a different operator. The skill you need to read either of them is the same. If you don’t see it all at once, break it down.

4 Comments.

  1. Why not $var &= ~2, ignoring the issue of context?

  2. Ignoring the larger structure of the code, my reason for Tweeting it as awful is exactly Peter Roberts’ complaint:

    1. It should have been $var &= ~2;

    2. It really should have been $var &= ~SOME_CONSTANT_NAME;

    The weird part is that the previous author(s) were setting the bit with $var |= 2; so they clearly knew something about bit operations, but apparently didn’t know about the negation operator.

    Kids with CS degrees, man: Whatcha gonna do with ’em?

    • Twitter is a bad place to make points like that because you get their message across but not yours, especially since you’re making your very short comment in the middle of a sea of people complaining that Perl just looks ugly. I know my students have problems with statements like that for the reasons I noted. Aside from other context, I assumed you didn’t like the way it looked rather than wanting something else. Provide your answer when you find something you don’t like. 🙂

      I don’t get upset with people getting something done inefficiently, just like I wouldn’t want people to be upset with me for not knowing something (or not remembering something). That someone made a statement that wasn’t the best it could be is orders of magnitude less significant than poor architecture.

      As for CS degrees, who knows. I think my bit fiddling kung fu isn’t what it used to be.

      • Yeah, I guess some of this is about idioms. As an old school bit slinger, I find “&= ~” a very familiar idiom.

        Your points about testing and undef behavior are well taken, and are helping me to clarify some of my concerns about how we use Perl.

        My day job involves slinging Perl, but lately I’ve been moving my personal projects back to C++, publicly. One of the responses to my “compile time checking” whines is “with sufficient test coverage…”

        I think the purpose of a language should be lower cognitive load. I mean, heck, that’s why Perl’s got the awesome whipupitude. Your testing examples are fantastic, but the “what happens if it’s undef” and other edge cases start to make me think about where that cognitive load becomes more than with a language that has explicit typing.

        Thanks!
