Tail Recursion
As nice as recursion is, sometimes our stack may grow too large, causing our program to crash due to running out of stack space. However, some languages support tail call optimization. That is, if you write your function in a way that leaves no need to use the previous stack, the compiler will use the same stack for its job and you'll never hit the stack overflow problem.
For example, here's a Fibonacci implementation in C.
// badfib.C
#include <stdio.h>
// Non-tail recursive function for Fibonacci
int fibonacci(int n) {
if (n == 0) return 0;
if (n == 1) return 1;
return fibonacci(n - 1) + fibonacci(n - 2);
}
int main(void) {
int n;
printf("Enter a number: ");
scanf("%d", &n);
printf("Fibonacci of %d is %d\n", n, fibonacci(n));
return 0;
}
Using this code, calculating the 5th term (1,1,2,3,5
) can be done like the picture above. You can see the stack frames grow too long. Each call of the function waits for the return of at least two other function calls, and neither of the first two branches can be calculated until we reach fibonacci(0)
and fibonacci(1)
. Since we need to keep track of what two numbers are being added together, we also can't reduce the amount of data we keep.
❯ gcc -O0 badfib.c -o badfib
❯ ./badfib
Enter a number: 500000
[1] 43221 segmentation fault (core dumped) ./badfib
❯ gcc -O1 badfib.c -o badfib_optimized
❯ ./badfib_optimized
Enter a number: 500000
[1] 43283 segmentation fault (core dumped) ./badfib_optimized
❯ gcc -O3 badfib.c -o badfib_super_optimized
❯ ./badfib_super_optimized
Enter a number: 500000
# takes too long :)
As you can see, no matter the level of optimization, the code doesn't finish in the way we want it to.
Now suppose we refactor our code like this:
#include <stdio.h>
// Tail recursive function for Fibonacci
int fibonacci_tail(int n, int a, int b) {
if (n == 0) return a;
if (n == 1) return b;
return fibonacci_tail(n - 1, b, a + b);
}
int fibonacci(int n) {
return fibonacci_tail(n, 0, 1);
}
int main(void) {
int n;
printf("Enter a number: ");
scanf("%d", &n);
printf("Fibonacci's %d term is %d\n", n, fibonacci(n));
return 0;
}
Here's how the functions are called:
fib(5)
:fib_tail(5, 0, 1)
fib_tail(5,0,1)
:fib_tail(4,1,1)
fib_tail(4,1,1)
:fib_tail(3,1,2)
fib_tail(3,1,2)
:fib_tail(2,2,3)
fib_tail(2,2,3)
:fib_tail(1,3,5)
fib_tail(1,3,5)
:5
Or in a nice little picture:
You may have noticed that this time is no different. It is less messy, sure, but each function still needs to wait for the result of another function so that it can finally return the value and remove its stack. The nice thing about this function, though, is that it can be optimized. At each turn we're accumulating the values we need inside two variables, and even if we only keep track of the last function call, we will still have the correct result.
Key idea: You can refractor your recursive functions, such that the recursive function call is being done at the end of the function, and the ultimate result that you will return isn't dependent on anything but the data in the very last function call. Then some languages like C will discard your stack at each new call and will just keep the last function call at each step. This will result in less memory usage and prevention of stack overflow. This is the essence of tail recursion optimization.
Let's see if the optimization makes a difference:
❯ ./fib_tail_unoptimized
Enter a number: 500000
[1] 42515 segmentation fault (core dumped) ./fib_tail_unoptimized
❯ gcc -O1 fib_tail.c -o fib_tail_optimized
❯ ./fib_tail_optimized
Enter a number: 500000
Fibonacci's 500000 term is -1975568635
It did! The answer is incorrect because int
can't such a large number, but we didn't run into a stack overflow segfault either!
Not all programming languages support tail recursion optimization though. And you need to know that discarding the stack will make it harder to debug your code because the stack won't be in its complete form.
The moral of the story is that whenever you feel like you're at risk of stack overflow, go ahead and rewrite your code using an accumulator so that it can be tail optimized, or just use a loop. In the case of relying on an accumulator the compiler should support TCO optimizaiton. If it doesn't no worries. You could use GOTO (it doesn't call new functions then, and it'll reuse the same variables so it's nice) to use the same stack frame, or write a loop.
For example, here's the Fibonacci function written using a loop, nice and easy:
#include<stdio.h>
long int fib(int term){
if(term == 1) {
return 0;
}
if(term == 2 || term == 3){
return 1;
}
long int a=1, b=1, c=0;
for(int i=0;i<term-2;i++){
c = a+b;
a = b;
b = c;
printf("%d: %ld\n", i+3, c);
}
return c;
}
int main(void){
fib(11);
return 0;
}