Wikipedia’s page for Standard Deviation presents a formula for rapidly calculating the standard deviation of a growing set of numbers:

[1] Q_{k} = Q_{k-1} + (x_{k} – A_{k-1})(x_{k} – A_{k}) where x_{k} is the kth element in the set and A_{k} is the mean of the first kth numbers.

The standard deviation squared of the first n numbers of a set is Q_{n} / n.

From this, it is easy to compute the standard deviation for a growing set of numbers without having to read through all numbers in the set a second time. Recomputing the mean requires only an addition of (x_{k} – A_{k-1}) / k to the prior mean of A_{k-1} — so this does not require looking back through all the elements of the set.

Therefore recomputing Q does not require looking back through all the elements in the set since it only depends on x_{k}, Q_{k-1}, A_{k}, and A_{k-1}. We’ve deduced that A_{k} can be derived from A_{k-1} without having to look back through all elements of the set, so new values of Q can be computed likewise.

Now how would one rapidly calculate the standard deviation if the set was not growing, but rather was a moving window of fixed length? This is useful in finance or science when we want to keep an up to date standard deviation for a set of constant size with numbers changing one at a time. For example, in finance, when calculating Bollinger Bands, we want to always know the standard deviation in stock prices for the last 20 periods. So the set of stock prices isn’t growing in our set, instead it is of constant size, removing the oldest stock price and replacing it with the latest one at every tick interval. We can adapt formula [1] above for growing sets to meet our needs here by realizing that a moving window is a set that grows by one at one end, and shrinks by one at the other.

Without loss of generality, let’s assume the beginning of our window is element a of a given set (and we will be discarding the a-th element when the window moves by one), the second element is b, the last element of the window is c, and the element to be added is d. We are moving from a window of data that encompasses elements a through c, to a window of data one over that encompasses elements b through d. Our window length is constant, that is d – b = c – a.

So if our data set looks like … 1, 2, 3, 5, 8, 11, 13 … and our period is 4, the a-th element might have a value of 2, b-th is 3, c-th 8 and d-th 11. We moved from a window looking at [2, 3, 5, 8] to [3, 5, 8, 11].

If we grow our set from b to c to b to d by adding the d-th element, the Q formula would look like the following:

[2] Q_{bd} = Q_{bc} + (x_{d} – A_{bc})(x_{d} – A_{bd})

Now this formula looks the same as growing a set from b to c to a to c by adding the a-th element:

[3] Q_{ac} = Q_{bc} + (x_{a} – A_{bc})(x_{a} – A_{ac})

Rearranging [3], we find out that shrinking a set, in this case, from a to c to b to c, is simply reducing Q by the product of terms on the RHS:

[4] Q_{bc} = Q_{ac} – (x_{a} – A_{bc})(x_{a} – A_{ac})

Substituting [4] into [2], we find out how to get Q (and consequently the standard deviation) for the new window b to d, from the Q (or standard deviation) of our old window a to c:

[5] Q_{bd} = Q_{ac} – (x_{a} – A_{bc})(x_{a} – A_{ac}) + (x_{d} – A_{bc})(x_{d} – A_{bd})

We arrive at [5], a formula where we can compute new values of Q for a moving window (and we can easily calculate standard deviation from that) without having to go through a second pass of all numbers in the set. The derivation is simple after realizing a moving window is simply a set that grows and shrinks by one element.