Thursday 6 June 2019

Project Euler 152 - Writing 1/2 as a sum of inverse squares

There are several ways to write the number 1/2 as a sum of inverse squares using distinct integers.
For instance, the numbers {2,3,4,5,7,12,15,20,28,35} can be used:


In fact, only using integers between 2 and 45 inclusive, there are exactly three ways to do it, the remaining two being: {2,3,4,6,7,9,10,20,28,35,36,45} and {2,3,4,6,7,9,12,15,28,30,35,36,45}.
How many ways are there to write the number 1/2 as a sum of inverse squares using distinct integers between 2 and 80 inclusive?


Solution

Wow, this one is tough. I spent more than 6 hours to solve it.

Upon observing the outcome of a very naive (and hence very slow) algorithm, it can be found that only a portion of numbers in the range of [2, 80] can be candidates that form the solution, namely

CANDIDATES = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 18, 20, 21, 24, 28, 30, 35, 36, 39, 40, 42, 45, 52, 56, 60, 63, 70, 72]

I don't know how to prove it, so I just hope it's the correct set of candidates.

My algorithm is very intuitive and requires no advanced mathematics knowledge.

First define S([a1, a2, a3, ..., ak]) = sum(1 / a1 ^ 2 + 1 / a2 ^ 2 + 1 / a ^ 3 + .... + 1 / ak ^ 2)

We start with the candidate array [2], search for the next smallest candidate (sc) so that S([2]) + S([sc]) <= 0.5. Obviously, in this case, the next smallest candidate is 3, so we append 3 to the candidate array and the array becomes [2, 3].

The candidate array keeps expanding until such smallest candidate cannot be found, then we update the last element of the candidate array to its 'next greater element' (we will refer to it as 'sibling' in the rest of the article) in the CANDIDATES list.

For example, the candidates array is [2, 3, 4, 5, 6, 12, 28, 52], so no such smallest candidate can be found so that S([2, 3, 4, 5, 6, 12, 28, 52]) + S([sc]) <= 0.5. Therefore the last element 52 is updated to its 'sibling ' 56, the candidate array becomes [2, 3, 4, 5, 6, 12, 28, 56].

if S(candidate_array) + S([sc]) happens to be 0.5, a solution is found.

The algorithm so far will find all the solutions although it's still quite slow. Consider this scenario:

candidate_array = [2, 4], the next smallest candidate that meets the criteria is 5, but even if we 'add up' the rest of all candidates, i.e. S([5, 6, 7, ...., 72]), the sum of S([2, 4]) and S([5, 6, 7, ...., 72]) is less than 0.5. We can see there is no solution for candidate array starting with [2, 4]. Similarly, there won't be a solution when it starts with [2, 5] and so on so forth. So we simply take off the last item off the list, and update the second last item (now it has become the last one) to its 'sibling' 3. When candidate_array = [3], there is also no solution. We take it off the list and end up with an empty array. And this is the terminating condition in the while loop.

It is worth mentioning that there is another condition that needs to be met to trigger the above scenario. That is, the next smallest candidate must equate the last element's 'sibling' in the CANDIDATES list, because the greatest candidate 72, happens to always meet this criteria, and therefore this particular case must be excluded. In the example above, 5 is 4's 'sibling', so the condition is met. On the other hand, the next smallest candidate for [2, 3, 4, 5, 9, 10, 13, 18, 24, 30, 35, 45, 52, 56] is 72, which is not 56's 'sibling'. Therefore the condition is not met and the scenario is not applied. 

The algorithm runs 4 seconds. I consider it not a half bad result.