Now, what does this have to do with project euler? Well, here's another example. I didn't get this one from project euler, but its the kind of thing they tend to ask.
Click here to view the secret text
×
The nth triangular number is the sum of the first n integers. For example, T1 = 1, T2 = 1 + 2 = 3, T3 = 1 + 2 + 3 = 6, T4 = 1 + 2 + 3 + 4 = 10 etc. The nth square number is the square of the nth integer. For example S1 = 1^2 = 1, S2 = 2^2 = 4, S3 = 3^2 = 9 etc.
Some square numbers are also triangular numbers. For example, T1 = S1 = 1, T8 = S6 = 36, T49 = S35 = 1225.
Find the sum of the first 20 such numbers (ie 1 + 36 + 1225 + ...)
Ok so here's a one way to do this (in sort of pseudo-c++):
Click here to view the secret text
×
int get_answer()
num_found = 0
sum = 0
n = 1
while (num_found < 20)
if ( is_square_number( n ) && is_triangle_number( n ) )
num_found++
sum += n
n++
return sum
bool is_square_number( n ) {
if ( round( sqrt(n) ) ^ 2 == n ) return true
else return false
}
bool is_triangle_number( n ) {
sum = 0
x = 1
while ( sum < n)
sum += x
x++
if ( sum == n ) return true
else return false
}
Disclaimer: this code is untested, but please don't run it to find out if it's right... you'll be there for a long time!
How long is this going to take (remember that project euler problems should take less than a minute to run!!). Short answer: longer than the life of the universe (try it if you don't believe me!)
Why is it so innefficient? Well for one thing, it takes a long time to check if a number is a triangular number. Basically, the bigger n is, the longer it will take to check if n is triangular (becuase you have to add up more and more numbers). And remember, we're going to be calling this function for every integer between 1 and ... well I don't even know how much, because I can't be bothered to find out... but those calls are going to take longer and longer to execute the bigger the numbers get. Compare that to is_square_number. is_square_number executes a constant amount of instructions no matter how big n is (actually, that's not quite true, but its quite a bit more efficient than is_triangle_number). To put it in the terminology of the previous post, is_square_number is O(1) = C while is_triangle_number is O(n^(1/2) = K*n^(1/2)). No matter what the coefficients, C and K, are (suppose for example that sqrt(n) takes a large, but constant, number of instructions, while K is very small), is_square_number will be much faster than is_triangle_number for very large n.
Let's take a crack at making is_triangle_number more efficient. You might notice that Tk = k(k+1)/2 (example, T4 = 10 = 4*5/2). So to see if n is a triangular number, the equation n = k(k+1)/2 must have integer solutions. So rearranging gives
k^2 + k - 2n = 0
which has solutions (using quadratic formula)
k = (-1 +- sqrt(1 + 8n))/2
For example, if n = 10 then k = (-1 +- sqrt(1 + 8(10)))/2 = (-1 +- 9)/2 = 4 or -5 (of course, the negative solution is useless), showing us that T4 = 10 (and therefore showing us that 10 is a triangular number).
On the other hand if n = 9 then k = (-1 +- sqrt(73))/2 which is not an integer, so 9 is not a triangular number.
So the key is that whatever is inside the sqrt (ie 1 + 8n) must be a square number. Here's the new is_triangle_number:
Click here to view the secret text
×
bool is_triangle_number_improved( n ) {
if (is_square_number( 1 + 8*n ) ) return true
else return false
}
Now we've made a significant improvement to the program. To compare, run change the program so that it looks for the first 5 square/triangle numbers instead of the first 20, and then try it with the new is_triangle_number function and the old one. The new one will run noticebly faster.
However, it's still not fast enough to run in the control time (1 minute). What's the problem here? Well, the "
Good"
numbers that we're looking for are spaced way way way apart. In fact, each number in the sequence 1, 36, 1225, 41616... is approximately 36 times bigger than the previous one! (1 * 36 = 36, 36 * 36 = 1296 ~= 1225, 1225 * 36 = 44100 ~= 41616 etc) So if it takes us 1 second to find the first 4 target numbers, then it will take us 36 seconds to find the next one, then nearly 20 minutes to find the next one, then almost 12 hours to find the one after that, then 18 days to find the next one, and then almost 2 YEARS to find the one after that. The next one will be found on your death bed, and your great-great-great (720 times over)-grand-children will be lucky if they see the program find the next one after that! And we still have to find 10 more!
So what could we do? Well, we could notice that we don't really have to check every integer if it's square and triangular. Instead, we could just look at the triangular numbers and see if they're square:
Click here to view the secret text
×
int get_answer_improved() {
num_found = 0
sum = 0
n = 1
while ( num_found < 20 )
Tn = get_triangular_number( n )
if ( is_square_number ( Tn ) )
sum += Tn
num_found++
n++
}
int get_triangular_number( n ) {
return n*(n+1)/2
}
Ok, that's a lot better. In fact, since the triangular numbers are spaced out at O(n^2) compared to the integers, we just cut our running time by an exponent of 1/2 (ie we'll run in approximately the square root of the time we did before). Try this program out with finding the first 5 target numbers. It should be almost 6 times as fast as the previous program was. But it's still too slow! The indices of the triangle numbers (1, 8, 49, 288...) are each about 6 times the previous number. So, suppose it takes us 1 second to find the first 5 target numbers. It will take 6 seconds to find the next one, 36 seconds to find the next one, 3.6 minutes to get the next one, 21.6 minutes to get the next one, 2 hours for the next one, 5 days for the next one etc. It's better, but not good enough.
The real trick to solving this problem is to realize that the indices of the square numbers (1, 6, 35, 204, 1189 etc) have the following pattern:
6 = 1*6 - 0
35 = 6*6 - 1
204 = 35*6 - 6
1189 = 204*6 - 35
ie each number is 6 times the previous number minus the previous previous number.
Gn = 6*G(n-1) - G(n-2)
So this program:
Click here to view the secret text
×
int get_answer() {
sum = 0
for ( i = 1; i <= 20; i++ )
sum += get_triangular_number( G(i) )
return sum
}
int G( n ) {
if ( n == 1 ) return 1
else if ( n == 2 ) return 6
else return 6*G(n-1) - G(n-2)
}
Has a shot a finishing in our lifetime (in fact, it ought to finish relatively quickly (less than a minute unless I miss my guess) although you'll need to use a language with arbitrary length integers, for example Scheme)
But even that program isn't all that efficient. Consider that G(n) has to call itself a number of times and that we aren't caching previous results. In fact, if I asked for the first 40 good numbers instead, this would take a really really long time. Something like this would be much more efficient:
Click here to view the secret text
×
int get_answer() {
sum = 0
prev_index = 0
this_index = 1
for ( i = 1; i <= 20; i++ )
sum += get_triangular_number( this_index )
next_index = 6 * this_index - prev_index
prev_index = this_index
this_index = next_index
return sum
}
____________________________
Progress Quest Progress
[Last edited by stigant at 01-08-2009 10:53 PM]