This post is about the Project Euler - 14 - Longest Collatz sequence problem. This is one of the perfect examples of memoization technique. This is the actual problem.
The problem statement looks pretty simple and a brute force will work just fine. This code runs in 20 seconds.
As I had to wait for 20 seconds to get the solution, I was thinking about improving the performance. Then I found this. From the problem statement,
To optimize it, I just had to change the Series function alone. I stored the computed values in a dictionary and if the current number in the sequence happens to be in the dictionary, I ll just return the value corresponding to it. This program responded with the result in 2 seconds. That is 10 times faster than the first version.
The following iterative sequence is defined for the set of positive integers:
n → n/2 (n is even)
n → 3n + 1 (n is odd)
Using the rule above and starting with 13, we generate the following sequence:
13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.
Which starting number, under one million, produces the longest chain?
NOTE: Once the chain starts the terms are allowed to go above one million.
Simple Solution
The problem statement looks pretty simple and a brute force will work just fine. This code runs in 20 seconds.
def Series(N):
Counter = 1
while N != 1:
N = N/2 if N % 2 == 0 else 3*N+1
Counter += 1
return Counter
Max, Element = 0, 0
for x in xrange (1, 1000000):
series = Series(x)
if (Max < series):
Max = series
Element = x
print Max, Element
As I had to wait for 20 seconds to get the solution, I was thinking about improving the performance. Then I found this. From the problem statement,
13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
, starting with any number, if it becomes 13 at any point of time, it has to take the same route and if it becomes 13 it will take 10 steps to become 1. So I wanted to store this information first time itself, so that I don't have to recalculate. This is called Memoization
. Quoting from wikipediaIn computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs
Optimized (Memoized) Solution
To optimize it, I just had to change the Series function alone. I stored the computed values in a dictionary and if the current number in the sequence happens to be in the dictionary, I ll just return the value corresponding to it. This program responded with the result in 2 seconds. That is 10 times faster than the first version.
Dict = {}
def Series(N):
Counter = 1 #It is 1, because we are counting the initial N
Original = N
while N != 1:
N = N/2 if N % 2 == 0 else 3*N+1
if N in Dict: # Check if the length has been computed for this number already
Counter += Dict[N] # Add it to the current length counter
Dict [Original] = Counter # Store the computed value for future look back
return Counter
Counter += 1
Dict [Original] = Counter # Store the computed value for future look back
return Counter