An interesting case of algorithm complexity

Here is the task. You have a pyramid that looks like this:

      1
    2   3
  4   5   6
3  10  2   3

You start at the top and travel down. You can only move down, and only to an adjacent node, so at every step you have exactly 2 choices. For example, from the node with value 1 you can move to the node with value 2 or 3. From the node with value 2 you can only move to the node with value 4 or 5; you cannot move to the node with value 6 because it is not adjacent. Likewise, you cannot move from the node with value 3 to the node with value 4.

Your task is to find the path with the largest sum of values. For the pyramid above, the solution looks like this:

      1
       \      
    2   3
       /
  4   5   6
     / 
3  10  2   3

The total sum for the solution above is 1 + 3 + 5 + 10 = 19.

As you probably already see, greedily picking the bigger of the 2 choices at every step gives an incorrect result: that algorithm selects 1->3->6->3, for a total sum of only 13.

To solve this task, let's represent the pyramid data as an int[][]. For the pyramid above, the data looks like this:

int[][] pyramid = new[]
{
    new[] { 1 },
    new[] { 2, 3 },
    new[] { 4, 5, 6 },
    new[] { 3, 10, 2, 3 },
};
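With this representation, the naive greedy walk from above is easy to write down. Here is a minimal sketch (CalcGreedy is a hypothetical helper, shown only to illustrate the wrong approach):

private static int CalcGreedy(int[][] nodes)
{
    int x = 0;
    int sum = nodes[0][0];
    for (int y = 1; y < nodes.Length; y++)
    {
        // Always step to the bigger of the two adjacent children.
        if (nodes[y][x + 1] > nodes[y][x])
        {
            x++;
        }
        sum += nodes[y][x];
    }
    return sum;
}

For the pyramid above it follows 1->3->6->3 and returns 13 instead of 19.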

The first correct algorithm that comes to mind is a simple recursive one:

private static int CalcLargestRecursive(int[][] nodes, int x, int y)
{
    int result = nodes[y][x];
    if (y == nodes.Length - 1)
    {
        // Bottom row reached: the path ends here.
        return result;
    }

    // Recurse into both adjacent children and keep the better subtree.
    int left = CalcLargestRecursive(nodes, x, y + 1);
    int right = CalcLargestRecursive(nodes, x + 1, y + 1);
    if (left > right)
    {
        return result + left;
    }
    else
    {
        return result + right;
    }
}
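For the pyramid above, CalcLargestRecursive(pyramid, 0, 0) returns 19, as expected.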

And it solves the problem without any trouble. But even though this algorithm is simple to write and understand, something about it looks suspicious. Let's find out what size of pyramid it can actually solve.
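To do that, we need pyramids of arbitrary size. The exact generator doesn't matter; a minimal sketch like the following is enough (GeneratePyramid and the timing loop below are illustrative, and the node values are random because they don't affect the running time):

private static int[][] GeneratePyramid(int size)
{
    var random = new Random(42); // fixed seed keeps runs repeatable
    var pyramid = new int[size][];
    for (int i = 0; i < size; i++)
    {
        pyramid[i] = new int[i + 1]; // row i has i + 1 nodes
        for (int j = 0; j <= i; j++)
        {
            pyramid[i][j] = random.Next(1, 100);
        }
    }
    return pyramid;
}

// Timing loop (Stopwatch is in System.Diagnostics):
for (int size = 20; size <= 33; size++)
{
    int[][] pyramid = GeneratePyramid(size);
    var stopwatch = Stopwatch.StartNew();
    int best = CalcLargestRecursive(pyramid, 0, 0);
    stopwatch.Stop();
    Console.WriteLine($"size {size}: best sum {best} in {stopwatch.ElapsedMilliseconds} ms");
}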

That algorithm shows a noticeable slowdown on a pyramid of size 30, and at size 33 it took almost a minute. Obviously, it doesn't scale well. Let's investigate why.

A pyramid of size 2 has only 2 possible paths to check:

      1
     / \
    2   3

Now let's see how many paths we get if we increase the pyramid size to 3:

      1
     / \
    2   3
   / \ / \
  4   5   6

As you can see, each existing path now splits into 2: the path through the node with value 2 continues to 4 or 5, and the path through the node with value 3 continues to 5 or 6. So there are 4 possible paths in total. Let's check a pyramid of size 4:

      1
     / \
    2   3
   / \ / \
  4   5   6
 / \ / \ / \
3  10   2   3

Again, each path splits into 2 more, so there is a total of 2^3 = 8 paths to traverse. Every added row doubles the number of paths.
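You can confirm this doubling with a tiny helper that counts complete paths directly (CountPaths is hypothetical, added just for this check):

private static long CountPaths(int[][] nodes, int x, int y)
{
    if (y == nodes.Length - 1)
    {
        return 1; // the bottom row ends exactly one path
    }
    // Every inner node forks into two subtrees.
    return CountPaths(nodes, x, y + 1) + CountPaths(nodes, x + 1, y + 1);
}

For the three pyramids above, CountPaths(pyramid, 0, 0) returns 2, 4 and 8 respectively.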

It is now obvious that our algorithm has O(2^N) complexity: a pyramid of N rows has 2^(N-1) paths, so for N = 30 our code has to traverse 2^29, around half a billion paths. But that is not the only problem. Our code repeatedly scans the arrays from top to bottom. For example, it walks 1->2->4->3, then 1->2->4->10, then 1->2->5->10, and so on.

As a result, our algorithm not only has exponential complexity but also thrashes the CPU cache and breaks data locality.

But is there any way to make it fast? Yes, and the solution is quite simple. Instead of going from top to bottom, we go from bottom to top. We start by scanning the bottom row and finding the bigger number of each pair of children: the pairs are (3, 10), (10, 2) and (2, 3), and their maxima are 10, 10 and 3.

      1
     / \
    2   3
   / \ / \
  4   5   6
 / \ / \ / \
3  10   2   3

Then we add each maximum to the node above it: 4 + 10 = 14, 5 + 10 = 15 and 6 + 3 = 9:

      1
     / \
    2   3
   / \ / \
  14  15  9
 / \ / \ / \
3  10   2   3

Then we move to the row above and repeat the process: 2 + max(14, 15) = 17 and 3 + max(15, 9) = 18:

      1
     / \
    17 18
   / \ / \
  14  15  9
 / \ / \ / \
3  10   2   3

Finally, the solution appears in the top node: 1 + max(17, 18) = 19:

      19
     / \
    17 18
   / \ / \
  14  15  9
 / \ / \ / \
3  10   2   3

Here is the source code:

private static int CalcLargest(int[][] nodes)
{
    int len = nodes.Length;
    if (len == 0)
    {
        return 0;
    }

    // Fold each row into the row above it, from the bottom up.
    for (int i = len - 1; i >= 1; i--)
    {
        int[] line = nodes[i];
        int[] prevLine = nodes[i - 1];
        for (int j = 0; j < line.Length - 1; j++)
        {
            // Add the bigger of the two children to their parent.
            int left = line[j];
            int right = line[j + 1];
            if (left > right)
            {
                prevLine[j] += left;
            }
            else
            {
                prevLine[j] += right;
            }
        }
    }

    // The top node now holds the largest possible sum.
    return nodes[0][0];
}
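For the pyramid above, CalcLargest(pyramid) also returns 19. Note that this version folds the rows in place and therefore destroys the pyramid data; pass a copy if you still need it. A quick sanity check against the recursive version might look like this (it reuses the sketched GeneratePyramid; the row cloning needs System.Linq):

int[][] pyramid = GeneratePyramid(25);
// CalcLargest overwrites rows, so give it its own copy.
int[][] copy = pyramid.Select(row => (int[])row.Clone()).ToArray();
int slow = CalcLargestRecursive(pyramid, 0, 0);
int fast = CalcLargest(copy);
Console.WriteLine($"{slow} == {fast}: {slow == fast}");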

Let's look at the complexity of this algorithm. Each node is visited exactly once, so the complexity is quadratic, O(N^2), or about N^2/2 operations. For N = 30 that is roughly 450 operations, and even for N = 10,000 it is only about 50 million.

This algorithm not only has far lower complexity, it is also much friendlier to the CPU. We scan the items of each row one by one, so the CPU can prefetch and make good use of its cache, because each row is sequential in memory. The writes to the previous row are sequential as well.

And that is not all. This algorithm also lets the compiler avoid bounds checks in the inner loop: since the loop condition guarantees that j + 1 never goes past the end of the line array, the JIT can prove the reads are safe and omit the checks.

In conclusion, I have shown that it is important to analyze the complexity of your algorithms and to test them on realistically sized data. I hope this helps someone.