Learning Perl Challenge: Be better than Quorum (Answer)

Did you come up with something better than Quorum in the previous Learning Perl challenge? There’s been some spirited conversations since then and some surprising new information.

The challenge was to come up with a program to return the maximum of either the multiples or not-multiples of n between a and b. In the original Quorum-is-better-than-Perl paper, the researchers had shown some very unPerly—and frankly generally bad—code to absolute newbies and asked them if they understood it. They then asserted that the Quorum version was easier for non-programmers to understand. But, they went further than that and asserted that Perl is worse than line-noise, a common bad joke for people who wonder why they can’t learn a complex skill before lunch.

I added a twist though. After your first try, I asked you to wait a day without thinking about it then try again the next day. There is difference between something you do right away and something you have time to think about. I don’t mind bad or sub-optimal code as long as it’s compartmentalized. I can hide in a subroutine something quick and inefficient today, then change the subroutine later as I have time to improve it.

Sinan Ünür, who brought the paper to my attention, submitted some solutions to my challenge (first day, second day). While he was thinking about that, he tried to run the original Quorum code. The Quorum code didn’t work—it didn’t even compile. Apparently the Perl code was the only code in the paper that actually worked. So, who cares is newbies understand a solution that doesn’t work? Paging, Mr. Babbit!

First solution

But, let’s put Quorum aside and get to the challenge. I’ll start with my first solution, which is a bit embarrassing now.

The challenge is to find the maximum of two numbers. One number is the count of the multiples and the other is the count of non-multiples. In the five minutes I gave myself, I create the simplest-to-understand code I could imagine. I went with the v5.20 option to use subroutine signatures:

use v5.20;
sub count ( $start, $end, $divisor ) {
    my $multiples     = grep { !( $_ % $divisor ) } $start .. $end;
    my $not_multiples = grep {  ( $_ % $divisor ) } $start .. $end;
   
	$multiples >= $not_multiples ? $multiples : $not_multiples;
	}

This works. It passes all the tests I listed in the challenge (using the table-driven testing I wrote about for PerlTicks). It scans the list twice, but it was the first thing I thought so I went with it.

Sinan did something a bit more tricky. He cheated a bit by going outside the bounds of Learning Perl by using references. He created an array reference that has code references as elements. Each code reference increments the value in an exterior scalar variable. He chooses which sub to run by using a boolean value 0 or 1 to select the code reference to run:

sub count ( $start, $end, $divisor ) {
    my ($p, $q);
    for my $x ($start .. $end) {
        [sub { $p += 1}, sub { $q += 1}]->[ $x % $divisor != 0 ]->();
    }
    ($p > $q) ? $p : $q;
}

If that’s his first go, I think we’re in trouble (in truth, his production code that I’ve seen isn’t so abstruse).

Steve Wilson’s solution is closer the the original Perl in the Quorum article. He goes through the range of value and increments scalars:

sub count {
    my ($start, $end, $divisor) = @_;
    my $is = 0;
    my $not = 0;

    for ($start .. $end){
        if ($_ % $divisor == 0){
            $is++;
        }
        else {
            $not++
        }
    }
    if ($not > $is){
        return $not;
    }
    else {
        return $is;
    }
}

E. Choroba had a delightful solution. He used the conditional operator to chose the variable to increment:

sub count {
    my ($start, $end, $div) = @_;
    my ($divisible, $nondivisible);
    ($_ % $div ? $divisible : $nondivisible)++ for $start .. $end;
    return $divisible > $nondivisible ? $divisible : $nondivisible
}

Most of the first solutions were similar. They passed through each number in the range and examined it. That’s fine. That’s the first way we think about it. In a pure mathematical framework, that makes sense. Count both sets and choose the max. Pure math doesn’t care about real world computers though.

Curiously, Sinan wrote about this in /r/programming too and someone submitted this Ruby solution claiming it didn’t loop:

def z(a, b, c)
  (a..b).partition {|i| i%c==0}.map(&:count).max
end

“Looping” is a odd thing in programming speak. Some people think about it only in terms of particular operators or language constructs even if other operators effectively do the same thing. The poster claims that partition might not loop because it could be implemented as recursion. Well, I’d count that as looping as well. But, I do like partition and wish it was part of core Perl (it’s in List::MoreUtils on CPAN).

Second solution

I waited a day to work on my second solution, according to the rules I set out. I’ll show my solution last this time.

Sinan went a bit crazy, again way out of the bounds of Learning Perl. He creates a factory subroutine (something we write about in both Intermediate Perl and Effective Perl Programming) to create subroutines that act differently depending on their argument list. If he passes an argument, he possibly increments a counter then returns the count so far. With no argument, he returns the count. Again, he loops over the entire range. His solution is basically the same with another level of abstraction. His first solution continually remade an anonymous array of new code references. Now he makes that once:

sub count ( $start, $end, $divisor ) {
    my @counters = map make_condition_counter($_),
        sub { !($_[0] % $divisor) }, # divisible
        sub {  ($_[0] % $divisor) }, # not divisible
    ;
 
    for my $v ($start .. $end) {
        $_->($v) for @counters;
    }
 
    max map $_->(), @counters;
}
 
sub make_condition_counter {
    my $condition = shift;
    my $n = 0;
    sub {
        $n += (!!$condition->(@_)) if @_;
        $n;
    };
}

Steve Wilson polishes his code and adds a smart-match operator, but his code is the same idea:

sub count {
    my ($start, $end, $divisor) = @_;
    my @range = ($start .. $end);
    my $is = 0;

    for (@range){
        my $num = $_;
        if ($num % $divisor == 0){
            while ( $num ~~ @range) {
                $num += $divisor;
                $is++;
            }
            last;
        }
    }
    my $not = scalar @range - $is;
    if ( $not > $is ) { $not } else { $is };
}

It’s Gustavo Chaves that goes in a completely direction, and is similar to the direction I went in. He didn’t loop at all. He recognized that given a range of values, we already know how many of them are divisible by a number. For instances, how many multiples of 3 are between 1 and 100? You don’t figure that out by counting and neither does Gustavo:

sub count ( $start, $end, $divisor ) {
    use integer;
    my $sm   = ($start % $divisor) ? $start - ($start % $divisor) + $divisor : $start;
    my $em   = $end   - $end   % $divisor + $divisor - 1;
    my $is   = ($em - $sm + 1) / $divisor;
    my $isnt = ($end - $start + 1) - $is;
    return $is > $isnt ? $is : $isnt;
}

This means that Gustavo’s solution takes the same amount of time no matter how long the range.

When I was thinking about it the night before, I figured that there had to be a way to know the multiples of n, and there is. I did some back-of-the-envelope tries and by induction came up with the same thing Gustavo was doing, but I grabbed the floor and ceil functions from POSIX which allows me to use the language of number theory. When I have the number of multiples, I also no the number of non-multiples because that’s the other part of the range:

use POSIX qw(floor ceil);

sub count ( $start, $end, $divisor ) {
    my $multiples     = floor($end/$divisor) - ceil($start/$divisor) + 1;
    my $not_multiples = ($end - $start + 1) - $multiples;
   
	$multiples >= $not_multiples ? $multiples : $not_multiples;
	}

My solution is within the bounds of Learning Perl since I cheat a little with POSIX. We do have a chapter on CPAN, so it counts. I could reimplement floor and ceil myself, but all I’d really do is steal them from POSIX.

Sinan’s /r/programming thread contains a Python 3 solution that does the same thing. The // does integer division. The b // c does a floor, and by making the second one (-a) // c does a floor on a negative number which is like a ceil on a positive number:

def cnt(a, b, c):
    tmp = b // c + (-a) // c + 1
    return max(tmp, b - a + 1 - tmp)

In Perl, that’s two floors:

use POSIX qw(floor);

sub count ( $start, $end, $divisor ) {
    my $multiples     = floor($end/$divisor) + floor(-$start/$divisor) + 1;
    my $not_multiples = $end - $start + 1 - $multiples;
   
	$multiples >= $not_multiples ? $multiples : $not_multiples;
	}

Not all languages or implementations may round negative integer division in the same way. Perl’s integer module rounds the opposite direction for negative numbers. See Why Python’s Integer Division Floors, but also How do the floor and ceiling functions work on negative numbers?. If you depend on this in your code, be ready for heated arguments and losing.

I thought there must be a solution like this because I follow Mark Jason Dominus’s World of Discourse blog where he talks about these sorts of things. Since those math ideas are in my head, it’s a solution I went toward. A quick googling I found How many multiples of X lie in the arbitrary range [Y,Z]? on Math StackExchange and Multiples of N Between A and B. My next thought was to find what Knuth had to say, but Sorting and Searching didn’t talk about this sort of problem. I don’t talk about that sort of stuff until Mastering Perl: the things you read day-to-day and the people you allow into your head affect the solutions you come up with.