Memoisation (tabling)

Let's look at one of the Rosetta Code tasks: 9 billion names of God the integer.

Part of that task is to print F(i,j) for 1 ≤ ji < 25, where F is defined thus:

We can turn that into C easily enough:

#include <stdio.h>
#define N 25

static long F(int i, int j) {
    if (i <= 0 || j <= 0 || i < j) return 0;
    if (i == 1) return 1;
    return F(i-1, j-1) + F(i-j, j);
}

int main(void) {
    for (int i = 0; i < N; i++) {
        for (int j = 0; j <= i; j++) {
            if (j != 0) printf(" ");
            printf("%ld", F(i, j));
	}
	printf("\n");
    }
    return 0;
}

We get this table, which is indeed correct:

0
0 1
0 1 1
0 1 1 1
0 1 2 1 1
0 1 2 2 1 1
0 1 3 3 2 1 1
0 1 3 4 3 2 1 1
0 1 4 5 5 3 2 1 1
0 1 4 7 6 5 3 2 1 1
0 1 5 8 9 7 5 3 2 1 1
0 1 5 10 11 10 7 5 3 2 1 1
0 1 6 12 15 13 11 7 5 3 2 1 1
0 1 6 14 18 18 14 11 7 5 3 2 1 1
0 1 7 16 23 23 20 15 11 7 5 3 2 1 1
0 1 7 19 27 30 26 21 15 11 7 5 3 2 1 1
0 1 8 21 34 37 35 28 22 15 11 7 5 3 2 1 1
0 1 8 24 39 47 44 38 29 22 15 11 7 5 3 2 1 1
0 1 9 27 47 57 58 49 40 30 22 15 11 7 5 3 2 1 1
0 1 9 30 54 70 71 65 52 41 30 22 15 11 7 5 3 2 1 1
0 1 10 33 64 84 90 82 70 54 42 30 22 15 11 7 5 3 2 1 1
0 1 10 37 72 101 110 105 89 73 55 42 30 22 15 11 7 5 3 2 1 1
0 1 11 40 84 119 136 131 116 94 75 56 42 30 22 15 11 7 5 3 2 1 1
0 1 11 44 94 141 163 164 146 123 97 76 56 42 30 22 15 11 7 5 3 2 1 1
0 1 12 48 108 164 199 201 186 157 128 99 77 56 42 30 22 15 11 7 5 3 2 1 1

However, our suspicions are raised. We have two recursive calls, just like the Fibonacci numbers or the recursive definition of the binomial coefficients. Could there be repeated evaluation here?

Let's instrument the program to find out. Add

static long count[N][N];

before the definition of F. Add

    count[i][j]++;

just after the opening curly brace of F. And add

    printf("counts\n");
    for (int i = 0; i < N; i++) {
        for (int j = 0; j <= i; j++) {
            if (j != 0) printf(" ");
            printf("%ld", count[i][j]);
	}
	printf("\n");
    }

just before the return of main(). The extra output is

1
5763 7337
4508 5762 1574
3506 4507 1254 572
2714 3505 1001 462 270
2087 2713 791 374 220 149
1597 2086 626 301 181 123 92
1212 1596 489 241 147 102 76 61
915 1211 384 192 120 84 64 51 43
684 914 296 153 96 69 53 43 36 32
508 683 230 120 78 56 44 36 31 27 24
373 507 175 95 62 46 36 30 26 23 20 18
272 372 134 74 50 37 30 25 22 19 17 15 14
195 271 100 57 39 30 24 21 18 16 14 13 12 12
139 194 76 44 31 24 20 17 15 13 12 11 11 11 11
97 138 55 34 24 19 16 14 12 11 10 10 10 10 10 10
67 96 41 25 19 15 13 11 10 9 9 9 9 9 9 9 9
45 66 29 19 14 12 10 9 8 8 8 8 8 8 8 8 8 8
30 44 21 14 11 9 8 7 7 7 7 7 7 7 7 7 7 7 7
19 29 14 10 8 7 6 6 6 6 6 6 6 6 6 6 6 6 6 6
12 18 10 7 6 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
7 11 6 5 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 6 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

from which we see that there is a lot of repeated evaluation, which could be a problem for bigger tables.

What do we do?

We use a table of previously computed results. Before using the full definition of F, we look i,j up in the table to see if we have already evaluated F(i,j). If we have, we just use the stored result.

The one tricky thing is that the entries of the table have to be able to represent any legal result of F plus an extra value to say “not evaluated yet” If we put the table check after return 0, we can use 0 as the “not evaluated yet” value.

#include <stdio.h>
#define N 25

static long table[N][N];

static long F(int i, int j) {
    if (i <= 0 || j <= 0 || i < j) return 0;
    if (i == 1) return 1;
    if (table[i][j] == 0)
        table[i][j] = F(i-1, j-1) + F(i-j, j);
    return table[i][j];
}

int main(void) {
    for (int i = 0; i < N; i++) {
        for (int j = 0; j <= i; j++) {
            if (j != 0) printf(" ");
            printf("%ld", F(i, j));
	}
	printf("\n");
    }
    return 0;
}

This gives us the same results as before, except now each F(i, j) is requested at most 3 times and evaluated at most once ( for 0 < j ≤ i > 1).

What we have done is to take a recursive function and add a memo to prevent repeated evaluation. But we can now take the same computation pattern and drive it bottom-up, just building a table iteratively.

int main(void) {
 // row 0
    table[0][0] = 0;
 // row 1
    table[1][0] = 0;
    table[1][1] = 1;
 // rows 2..N-1
    for (int i = 2; i < N; i++) {
	table[i][0] = 0;
	for (int j = 1; j <= i; j++)
	    table[i][j] = table[i-1][j-1] + table[i-j][j];
    }
    for (int i = 0; i < N; i++) {
        for (int j = 0; j <= i; j++) {
            if (j != 0) printf(" ");
            printf("%ld", F(i, j));
	}
	printf("\n");
    }
    return 0;
}

It is instantly obvious from looking at the bottom-up version of the calculation that the cost is at most 2 loads, an add, and a store for each element of the table, so now we can make N bigger.

Typical C compilers make long a 32-bit integer if compiling in 32-bit mode or a 64-bit integer if compiling in 64-bit mode, and the default on a modern Mac or Linux system is 64-bit.

Letting m(i) be the maximum value in row i we have the following table, computed using an AWK program based on the C code above.
im(i)
00
11
21
31
42
52
63
74
85
97
4658.05618e+18
4668.52072e+18
4679.01133e+18
4689.52944e+18

Over this range, m is very well approximated by the empirical formula

ln(m)1.65 ~ 1.117×i - 11.06

which is to say that m is more or less exponential in i. Row i=468 already exceeds the limits of a 64-bit signed integer. Imagine what would be happening to the run time if we didn't avoid repeated calculation!