In the “Benchmarking” chapter of Mastering Perl, I emphasize better algorithms over different syntax. Many of the problems we think we have better solutions if we change how we do things instead of worrying about the efficiency of a particular keyword. In this item, I’ll go through my actual path through a problem rather than hiding all my failed experiments. The negative results are just as valuable.
Mark Jason Dominus’s Universe of Discourse often demonstrates the advantages of better thinking over better programming. He often computationally solves numeric analysis and number theory problems with much less brute force than most people would apply.
For instance, in “An ounce of theory is worth a pound of search”, he recalls his solution to find “excellent numbers”, as well another math problem. These are numbers where you chop the sequence of digits in half, subtract the square of the lower part from the square of the higher part, and in doing so get the original number back. For instance, 48 is excellent:
48 -> 4 and 8 8² - 4² = 64 - 16 = 48
The problem Mark presents is to find all 10-digit excellent numbers. It’s a common computational exercise that’s a bit more interesting than some other contrived examples. If we went through all of the numbers one-by-one in an exhaustive search, Mark estimates that it would take about 100 days to go through each number to check it. I don’t know what he was using to run the program, but even in 2006 computers were much better than the 1,000 tries a second he uses as his estimate. I think, though, that Mark was just tossing out a number to show that even if you cut it in half it still wasn’t a practical improvement.
I wrote a brute-force Perl program that prints timestamps as it reaches an excellent number or a new order of magnitude. Mark decided to try numbers with an odd number of digits by prepending a 0
(as in 01
) so I’ve done that too:
#!/usr/bin/perl use v5.10; use strict; use warnings; use integer; use Time::HiRes qw(gettimeofday tv_interval); my $time0 = [gettimeofday]; my $old_length = 0; my $num = $ARGV[0] // 0; # starting point $SIG{INT} = sub { say "Stopping at $num. Average ", eval { $num / tv_interval( $time0, [gettimeofday] ) } , " / second"; exit; }; while( ++$num ) { my $length = length $num; if( $length > $old_length ) { say "[@{[time]}] Working on length $length => ", "average ", eval { $num / tv_interval( $time0, [gettimeofday] ) } , " / second"; $old_length = $length; } if( $length % 2 ) { $num = '0' . $num; $length++; } my $half = $length / 2; my $front = substr $num, 0, $half; my $back = substr $num, $half, $half; my $computed = $back**2 - $front**2; say "[@{[time]}] $num" if $computed == $num; }
Almost instantly this program finds all the excellent numbers through six digits, and it’s not until it starts on the 9-digit numbers that it starts to slow down. After an hour and a half it found the first 10-digit excellent number and about three hours after that it finished the 10-digit numbers. Several hours later it found the first 11-digit excellent number. That’s not bad for a puny 2010 MacBook Air:
[1432016632] Working on length 1 => average / second [1432016632] 01 [1432016632] Working on length 2 => average / second [1432016632] 48 [1432016632] Working on length 3 => average / second [1432016632] Working on length 4 => average / second [1432016632] 3468 [1432016632] Working on length 5 => average / second [1432016632] 010101 [1432016632] 016128 [1432016632] 034188 [1432016632] Working on length 6 => average / second [1432016632] 140400 [1432016632] 190476 [1432016632] 216513 [1432016632] 300625 [1432016632] 334668 [1432016632] 416768 [1432016632] 484848 [1432016632] 530901 [1432016633] Working on length 7 => average 1000000 / second [1432016635] 02341548 [1432016636] 02661653 [1432016648] Working on length 8 => average 625000 / second [1432016656] 16604400 [1432016676] 33346668 [1432016707] 59809776 [1432016753] Working on length 9 => average 826446 / second [1432016755] 0101010101 [1432018439] Working on length 10 => average 553403 / second [1432021243] 3333466668 [1432022946] 4848484848 [1432023104] 4989086476 [1432029289] Working on length 11 => average 790076 / second [1432042948] 018516137328
This program is a bit slow, though. Mark included numbers of an odd length by prepending a 0, so I had to search through the 9- and 11-digit numbers. My rate is a couple orders of magnitude faster than Mark’s estimate, but again, I don’t think he really meant that as a serious estimate.
There are many things I could do to improve the run time performance of this code. I could skip all numbers where the front half is larger than the back half where the difference in the squares would be negative. I see about a 10% boost by inserting this next
check:
... my $front = substr $num, 0, $half; my $back = substr $num, $half, $half; next if $front > $back; my $computed = $back**2 - $front**2; say "[@{[time]}] $num" if $computed == $num; }
A small improvement like that isn’t very interesting though. Rather than a 10% improvement, I want it to run in 10% of the time. To do that I need a completely different approach.
In Mark’s case, he flips the problem from brute-force iteration through every number to check that it meets the conditions to using the conditions to generate the candidate numbers. If 2k is the number of digits, I know b² – a² = 10k·a + b. Rearranging that, I have a·(10k + a) = b² – b.
Now the problem turns into computing two values for all 5-digit numbers: n·(10k + n) and n² – n. Now I’ll take a bit of a wrong turn, but I want to show that anyway. Part of my goal in Mastering Perl is to show how people actually work, mistakes and all. That’s part of the learning process. If I never show that part, I deprive you of what I’ve learned by experiencing it. It also lets you know I’m as big a bonehead as anyone else.
I can use the computed value as the key in a hash and put the numbers that lead to that computed value in the array reference value. The a and b will be those values that have exactly two elements in the array reference. The highest number of that pair is the b. I get to use v5.20’s postfix dereference feature:
#!/usr/bin/perl use v5.20; use feature qw(postderef); no warnings qw(experimental::postderef); my %pairs; foreach my $n ( 10_000 .. 99_999 ) { my $k = length $n; my $front = $n*(10**$k + $n); my $back = $n**2 - $n; push @{ $pairs{ $front } }, $n; push @{ $pairs{ $back } }, $n; } foreach my $key ( keys %pairs ) { # skip everything that doesn't have a partner next unless @{ $pairs{$key} } == 2; my( $front, $back ) = sort { $a <=> $b } $pairs{$key}->@*; say "$front$back"; }
I get back all 10-digit excellent numbers in a couple of seconds, which pleases me:
4989086476 3333466668 4848484848
Now that I can do this quickly, I can make my program more useful by taking the number of digits from a command-line argument:
#!/usr/bin/perl use v5.20; use feature qw(postderef); no warnings qw(experimental::postderef); my %pairs; my $digits = $ARGV[0] // 10; my $k = ( $digits / 2 ) - 1; foreach my $n ( 10**$k .. 10**($k+1) - 1 ) { my $k = length $n; my $front = $n*(10**$k + $n); my $back = $n**2 - $n; push @{ $pairs{ $front } }, $n; push @{ $pairs{ $back } }, $n; } foreach my $key ( keys %pairs ) { # skip everything that doesn't have a partner next unless @{ $pairs{$key} } == 2; my( $front, $back ) = sort { $a <=> $b } $pairs{$key}->@*; say "$front$back"; }
When I run this under time to watch to runtime performance, I can easily get through the 12-digit numbers. As I emphasized in Mastering Perl, it’s not enough for me to solve the problem to the limits given me. I want to go a little beyond that and see where my solution breaks. As I do this, I’m lead to a much better solution I never would have explored if I was satisfied with a couple of seconds here. Think about what your commute to work might be like if highway designers had tested their assumptions with two orders of magnitude greater capacity than they targeted! Or, 1,000 times the website traffic, or a lot more of anything.
The memory I used in %pairs
causes a problem for my puny MacBook Air for 14-digit numbers. I get stuck in paging hell after that hash takes up about 6 GB of memory (there is only 4 GB of physical memory). That’s the trick with these things; I traded memory for speed. In this case, I traded a bunch of memory for worse performance!
I thought I could fix this by storing the %pairs
hash on disk with DBM::Deep, but that was so slow that I almost preferred the pure brute force. The tied object and file operations behind-the-scenes made everything much slower:
#!/usr/bin/perl use v5.20; use feature qw(postderef); no warnings qw(experimental::postderef); use DBM::Deep; my $file = 'excellent.db'; unlink $file; # could do File::Temp here, I guess my $pairs = DBM::Deep->new( $file ); my $digits = $ARGV[0] // 10; die "Number of digits must be even and non-zero! You said [$digits]\n" unless( $digits > 0 and $digits % 2 == 0 and int($digits) eq $digits ); my $k = ( $digits / 2 ) - 1; foreach my $n ( 10**$k .. 10**($k+1) - 1 ) { my $k = length $n; my $front = $n*(10**$k + $n); my $back = $n**2 - $n; push @{ $pairs->{ $front } }, $n; push @{ $pairs->{ $back } }, $n; } while( my( $key, $value ) = each %$pairs ) { # skip everything that doesn't have a partner next unless @$value == 2; my( $front, $back ) = sort { $a <=> $b } $value->@*; say "$front$back"; }
That’s not going to work, but I don’t call it a failure. Now I know that’s not a solution. I had an idea and I tried it. Honestly, I only thought of it since Mark went onto another problem in “An ounce of theory is worth a pound of search” that used a hash to store intermediate values so I used that approach here too. That turned out to be a wrong path. A red herring even.
While waiting to generate those 14-digit excellent numbers, I thought of another approach that wouldn’t use any sort of cache. I would go through the numbers like I already had, but when I computed n·(10k + n) I’d see if I could find an integer roots for n·(n – 1) for that same number. That’s almost the square root, so I just check a window around that square root:
#!/usr/bin/perl use v5.20; my $digits = $ARGV[0] // 10; die "Number of digits must be even and non-zero! You said [$digits]\n" unless( $digits > 0 and $digits % 2 == 0 and int($digits) eq $digits ); my $k = ( $digits / 2 ) - 1; foreach my $n ( 10**$k .. 10**($k+1) - 1 ) { my $k = length $n; my $front = $n*(10**$k + $n); my $root = int( sqrt( $front ) ); foreach my $try ( $root - 2 .. $root + 2 ) { my $back = $try * ($try - 1); last if length($try) > $k; last if $back > $front; #say "\tn: $n back: $back try: $try front: $front"; if( $back == $front ) { say "$n$try"; last; } } }
Now I was spitting out numbers very quickly. It only took a couple of minutes to find all the 16-digit excellent numbers and I didn’t overwhelm my puny laptop. In a half hour I computed all the 18-digit excellent numbers:
$ time perl excellent3.pl 14 33333346666668 48484848484848 real 0m19.421s user 0m19.287s sys 0m0.068s $ time perl excellent3.pl 16 1045751633986928 1140820035650625 3333333466666668 real 3m20.514s user 3m19.214s sys 0m0.740s $ time perl5.20.0 excellent3.pl 18 103495840337945601 115220484358463728 134171310390093900 139601140398860400 140400140400140400 146198830409356725 168654484443958128 189525190474810476 190476190476190476 215488216511784513 216513216513216513 225789700526090001 241951680548171776 271851166588008693 299376300623700625 300625300625300625 332001666665001333 333333334666666668 334668334668334668 344329484680361873 415233416766584768 416768416768416768 468197520829099776 483153484846516848 484848484848484848 529100530899470901 530901530901530901 572945416949321793 real 30m51.687s user 30m34.211s sys 0m9.339s
Curiously, the list of excellent numbers at Programming Praxis is wrong. You can easily find those errors because the second half of the digits is less than the first half in the interlopers. To check those, I wrote another program to take a single number and report its possible excellence:
#!/usr/bin/perl use v5.20; use utf8; use open qw(:std :utf8); my $number = $ARGV[0]; my $length = length $number; my $k = $length - 1; if( $length % 2 ) { die "odd"; } my( $a, $b ) = map { substr $number, $_->[0], $_->[1] } ( [ 0, $length / 2 ], [ $length / 2, $length / 2 ] ); my $a2 = $a ** 2; my $b2 = $b ** 2; my $diff = $b2 - $a2; say "b = $b => b² = $b2"; say "a = $a => a² = $a2"; say "\ndiff is $diff"; my $insert = $number == $diff ? '' : 'not '; say "\n$number is ${insert}excellent";
At this point, I went to read how Mark solved it in 2006 in “Excellent numbers “. He did the same thing I did in my previous solution. That is, my foray into caching wasn’t something that he showed, and perhaps never tried. The best optimization strategy is to steal from what smarter people have already done. I hadn’t followed my own advice this time. The Programming Praxis entry on excellent numbers has several other methods, but I haven’t been motivated to try them. I have an even faster method that blows them all away.
Now that I’ve computed the numbers, I never need to do it again. I already know the list, just like I know what the Fibonacci numbers are in the example in Mastering Perl. I’ll store that known list in a hash keyed by length and on request spit out the numbers of the right length:
#!/usr/bin/perl use v5.20; use feature qw(postderef); no warnings qw(experimental::postderef); my %hash; while( <DATA> ) { chomp; my $length = length; push @{ $hash{ $length } }, $_; } my $digits = $ARGV[0] // 10; die "Number of digits must be even and non-zero! You said [$digits]\n" unless( $digits > 0 and $digits % 2 == 0 and int($digits) eq $digits ); say join "\n", $hash{ $digits }->@* __DATA__ 48 3468 140400 190476 216513 300625 334668 416768 ... all the others, left out
Nothing in the problem said I couldn’t do that, but that’s the difference between Computer Science and programming. I do what works and can cheat as much as possible!
Conclusions
So, there it is. I started with a brute force that checked every number, going through many candidates that I knew could never work. I then went through just the half numbers and calculated both a (10k + a) and b² – b so I could match them up. That had extreme memory requires starting with 12-digit numbers so I had to abandon that. I ended up making every a (10k + a) and guessing a b that would satisfy b² – b for that number. That used almost no memory and was speedy. Finally, I just have the list, so I can return instantly the list of any length I’ve already computed.
I also learned some things I could fold into further optimizations by checking fewer candidates. Someone could probably prove some of these conjectures, but I haven’t:
- a always seems to be even
- a so far has always ended in { 0, 4, 6 }
- b so far has always ended in { 0, 1, 3, 5, 6, 8 }
- b is always greater than a
- b so far hasn’t started with { 0, 1, 2 }
- No excellent number so far starts with a digit greater than 5
There are other things I’d like to try, as well. I’d expect these to give better improvements than limiting candidate numbers:
- I did all of my computations serially. I’d like to try a parallel solution, perhaps using something like MCE (multi-core engine) to spread out the work.
- Firing up a fat Amazon EC2 instance for an hour would be interesting.
- Using a language with native integers of arbitrary size might be faster. Should I suss it out in LISP? The factorials I tried in LISP were very fast.
- At some point Perl is going to run out of precision because it works with the underlying libc. The bignum pragma can help, but making everything an object will slow me down.
- I wonder if PDL (Perl Data Language) would be much faster. I’m guessing not, but why not try?
There’s an interesting addendum to this post. In 484848 is excellent , Mark talks about mathematicians doing a lot of hidden work to present a workable starting point that they present as obvious, or, as he says “pull the mystery lemma out of my ass at the beginning for no apparent reason.” I didn’t do that; my article is close to the same path I took in real life, including not reading Mark’s solution until I tried it myself (and ended up with his solution, curiously).
And, if you must, here’s the list of excellent number that I know so far (up to 20 digits):
48 3468 140400 190476 216513 300625 334668 416768 484848 530901 16604400 33346668 59809776 3333466668 4848484848 4989086476 101420334225 181090462476 238580543600 243970550901 268234583253 274016590848 320166650133 333334666668 346834683468 400084748433 440750796876 502016868353 569466945388 33333346666668 48484848484848 1045751633986928 1140820035650625 3333333466666668 103495840337945601 115220484358463728 134171310390093900 139601140398860400 140400140400140400 146198830409356725 168654484443958128 189525190474810476 190476190476190476 215488216511784513 216513216513216513 225789700526090001 241951680548171776 271851166588008693 299376300623700625 300625300625300625 332001666665001333 333333334666666668 334668334668334668 344329484680361873 415233416766584768 416768416768416768 468197520829099776 483153484846516848 484848484848484848 529100530899470901 530901530901530901 572945416949321793 21733880705143685100 22847252005297850625 23037747345324014028 23921499005444619376 24981063345587629068 26396551105776186476 31698125906461101900 33333333346666666668 34683468346834683468 35020266906876369525 36160444847016852753 36412684107047802476 46399675808241903600 46401324208242096401
Ok, I think I have some mathematical reasoning regarding the properties of the last digits of a and b. I’ve performed the following math entirely modulo 10, to focus solely on the effect of the last digit.
There are only a limited number of last digits in squares:
To get an excellent number, we need:
Rearranging:
Because a*a is a square, b*b – b must be one of 0, 1, 4, 9, 6, 5.
b can only therefore be one of ( 0, 1, 3, 5, 6, 8 ).
Likewise, the viable results of b*b – b are only 0 and 6. That means a is limited to 0, 4, and 6, because those are the only values that produce an a*a mod 10 value of 0 or 6.
Furthermore, it would seem if b is 0, 1, 5, or 6, then a must be 0, and if b is 3 or 8, then a must be 4 or 6.
Minor correction: If b is 3 or 8, a can be either 4 or 6, and if b is 0, 1, 5 or 6, a must be 0.
A fuller explanation:
Furthermore, it would seem if b is 0, 1, 5, or 6, then a must be 0, and if b is 3 or 8, then a must be 4 or 6.
You can take this further. Because of the constraint b*b – a*a == b, you can put further constraints on a given the value of b. Let’s explore:
Modulo 10, b*b == b for 0, 1, 5 and 6. Thus, if b has one of those values, a must be 0 for b*b – a*a == b to hold.
That leaves 3 and 8, so let’s work them out longhand:
Likewise:
The two values of a whose square (mod 10) is 6 are 4 and 6. So if b ends in 3 or 8, a must end in 4 or 6.
The whole lesson from this exercise sadly remains that Perl is rubbish when it comes to integer crunching.
I’m doing a lot of projecteuler stuff in pure Perl for the fun of it, and no matter how you approach it, three topics remain painfully slow:
– pure integer
– array walking
– scope changes
Array walking at least should get faster with the multideref since 5.20, but I had trouble getting reliable numbers across versions to see their impacts.
Method invocations get a little love in 5.22, though the cost of creating a scope is still there.
And for the last one, there’s not much to be done is there? It would be nice to have a pragma where I can say to perl: Look, I need you to do some calculating in a lot of little ops, and I won’t be using formats, magic, overloads, NVs, auto-bignum conversions, strings, locale dependant formatting, globs or whatnot. Just do what computers are good at, and hurl some integers around.
Of course that would be utterly unperlish, and besides that’s what Inline::C is for. Still, sometimes I dream of it…