## A specific trick and a general habit for achieving fast computations – Wealthfront Engineering Blog

I recently worked on a computation involved in a client-facing projection graph. In order to delight our users, we wanted it to respond instantly to user interaction. However, there are two forces at play that make the calculation complicated: the consistent contributions of recurring deposits and the erratic motion of stock prices. We needed to calculate a variable of the form

\[ y_t = \sum_{i=0}^{t}\sum_{j=0}^{t} a_i b_j e^{k(t-\max(i, j))} \]

for all months \$t\$ from 0 (today) to \$T\$ (retirement). This quantity \$y_t\$ is related to one parameter of the estimated distribution at time \$t\$. The \$a_i\$’s and \$b_i\$’s depend on the deposits made at different time periods, and \$k\$ depends on the portfolio’s volatility. A direct implementation of this equation would look like:
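A minimal Python sketch of the direct triple loop (the names `a`, `b`, `k`, and `T` are assumptions for illustration, not the original code):

```python
import math

def y_direct(a, b, k, T):
    """Direct evaluation of y_t for every t in 0..T: O(T^3) work overall."""
    ys = []
    for t in range(T + 1):
        total = 0.0
        for i in range(t + 1):
            for j in range(t + 1):
                total += a[i] * b[j] * math.exp(k * (t - max(i, j)))
        ys.append(total)
    return ys
```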

This would require evaluating \$t^2\$ terms for each \$t\$ (because of the nested sum). If we do this many operations for each \$t\$ from 0 to \$T\$, we will end up using \$O(T^3)\$ operations for the final result. We often have \$T \approx 500\$ months to calculate, making this an unacceptable runtime.

The process of turning this into an \$O(T)\$ algorithm is a reminder of a general habit: never redo the same operation, whether implicitly or explicitly. By expanding the formula into recurrence relations, I brought the graph’s recalculation time down to under 0.002 seconds.

## Optimization 1: Incremental calculation

The formula for \$y_t\$ may look nasty, but it can be simplified if we unravel it in terms of \$y_{t-1}\$. That is, if we reuse the information in \$y_{t-1}\$. To do this, separate the terms with \$i=t\$ or \$j=t\$ from the rest:

\[ y_t = \sum_{i=0}^{t-1}\sum_{j=0}^{t-1} a_i b_j e^{k(t-\max(i, j))} + \sum_{i=0}^{t} a_i b_t e^{k(t-t)} + \sum_{j=0}^{t-1} a_t b_j e^{k(t-t)} \]

\[ y_t = \sum_{i=0}^{t-1}\sum_{j=0}^{t-1} a_i b_j e^{k(t-\max(i, j))} + b_t\sum_{i=0}^{t} a_i + a_t\sum_{j=0}^{t-1} b_j \]

Then notice that the doubly-nested sum is very similar to \$y_{t-1}\$. In fact, it is off by only a factor of \$e^k\$:

\[ y_t = e^k\sum_{i=0}^{t-1}\sum_{j=0}^{t-1} a_i b_j e^{k(t-1-\max(i, j))} + b_t\sum_{i=0}^{t} a_i + a_t\sum_{j=0}^{t-1} b_j \]

\[ y_t = e^k y_{t-1} + b_t\sum_{i=0}^{t} a_i + a_t\sum_{j=0}^{t-1} b_j \]

Interpreting this recurrence relation as code gives us:
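A Python sketch of this step (again with `a`, `b`, `k`, and `T` as assumed inputs). Each iteration reuses \$y_{t-1}\$ but still recomputes the two prefix sums from scratch:

```python
import math

def y_incremental(a, b, k, T):
    """O(T^2): y_t = e^k * y_{t-1} + b_t * sum_{i<=t} a_i + a_t * sum_{j<t} b_j."""
    ek = math.exp(k)
    ys = [a[0] * b[0]]  # base case: y_0 = a_0 * b_0 * e^{k*0}
    for t in range(1, T + 1):
        y = ek * ys[-1]
        y += b[t] * sum(a[i] for i in range(t + 1))  # b_t * sum_{i=0}^{t} a_i
        y += a[t] * sum(b[j] for j in range(t))      # a_t * sum_{j=0}^{t-1} b_j
        ys.append(y)
    return ys
```

The two `sum(...)` calls are the \$O(t)\$ work per step that the next optimization removes.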

If we iterate over \$t\$ from 0 to \$T\$, we now only need to reuse \$y_{t-1}\$, do a few miscellaneous operations, and calculate \$O(t)\$ new terms. That means that in total, we require only \$O(T^2)\$ operations to calculate all \$y_t\$.

## Optimization 2: Incremental calculation again

We can make this formula even faster by reusing the same trick, this time on each of the remaining summations. If we let

\[ A_t = \sum_{i=0}^{t-1} a_i \]

\[ B_t = \sum_{j=0}^{t-1} b_j \]

Then we can use recurrence relations to describe \$A_t\$ and \$B_t\$:

\[ A_t = A_{t-1} + a_{t-1} \]

\[ B_t = B_{t-1} + b_{t-1} \]

And we can use these in our formula for \$y_t\$:

\[ y_t = e^k y_{t-1} + A_t b_t + a_t B_t + a_t b_t \]

Interpreting these recurrence relations as code gives:
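A Python sketch of the final version (same assumed inputs as before), carrying the running prefix sums \$A_t\$ and \$B_t\$ alongside \$y_t\$:

```python
import math

def y_linear(a, b, k, T):
    """O(T): maintain A_t = sum_{i<t} a_i and B_t = sum_{j<t} b_j as running totals."""
    ek = math.exp(k)
    A, B = 0.0, 0.0          # A_0 = B_0 = 0 (empty sums)
    ys = [a[0] * b[0]]       # y_0
    for t in range(1, T + 1):
        A += a[t - 1]        # A_t = A_{t-1} + a_{t-1}
        B += b[t - 1]        # B_t = B_{t-1} + b_{t-1}
        ys.append(ek * ys[-1] + A * b[t] + a[t] * B + a[t] * b[t])
    return ys
```

Each loop iteration now does a constant number of multiplications and additions, so the whole sweep over \$t = 0, \ldots, T\$ is linear in \$T\$.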

Now evaluating \$y_t, A_t,\$ and \$B_t\$ from \$y_{t-1}, A_{t-1},\$ and \$B_{t-1}\$ takes only a constant number of operations! That means we can calculate all \$y_t\$ in \$O(T)\$ time. Thanks to this optimization, the runtime is indeed imperceptible (about 1.4ms on my machine when \$T = 500\$).

## Optimization \$n\$

That’s as fast as we can make this algorithm, but what if we were to use this trick many times on a deeply nested sum? Here we had a doubly-nested sum, but I invite you to investigate what would happen if you had an \$n\$-times-nested summation of similar form:

\[ z_t = \sum_{i_1=0}^{t}\sum_{i_2=0}^{t}\ldots\sum_{i_n=0}^{t} a_{1,i_1} a_{2,i_2} \ldots a_{n,i_n} f(\max(i_1, i_2, \ldots, i_n)) \]

Each time you do this trick, the algorithm requires one fewer loop but gets more complicated. To be precise, for large \$n\$ you can change this from an \$O(nT^{n+1})\$ algorithm to an \$O(3^nT)\$ algorithm.

The takeaway is that we often compute the same data multiple times in subtle ways, and that we can achieve huge performance gains by identifying and correcting these issues.