incremental computation of standard deviation
How can I compute the standard deviation in an incremental way (using the new value and the last computed mean and/or standard deviation)?

For the non-incremental way, I just do something like:

    mean = Mean(list)
    stdev = 0
    for i = 0 to list.size - 1
        stdev = stdev + (list[i] - mean)^2
    stdev = sqrRoot( stdev / list.size )

statistics algorithms computational-mathematics standard-deviation
edited Jan 30 at 1:33 by Alexander Gruber♦
asked Jan 27 '12 at 15:12 by shn
The formula for this is a special case of the sample variance decomposition formula given in O'Neill (2014) (Result 1). – Ben, Jan 31 at 3:54
6 Answers
I think the easiest way to do this is with an orthogonality trick. I'll show how to incrementally compute the variance instead of the standard deviation. Let $X_1, X_2, \ldots$ be an iid sequence of random variables with $\bar X_n = n^{-1} \sum_{j=1}^n X_j$, and let $s^2_n$ similarly denote the $n$th sample variance (I use a denominator of $n-1$ instead of the $n$ in your question to keep things unbiased for the variance, but you can use the same argument, just adding $1$ to all the terms with $n$ in them). First write
$$
s^2_n = \frac{\sum_{j=1}^n (X_j - \bar X_n)^2}{n-1}
= \frac{\sum_{j=1}^n (X_j - \bar X_{n-1} + \bar X_{n-1} - \bar X_n)^2}{n-1}.
$$
Expand this to get
$$
s^2_n = \frac{(n-2)s^2_{n-1} + (n-1)(\bar X_{n-1} - \bar X_n)^2 + 2\sum_{j=1}^{n-1}(X_j - \bar X_{n-1})(\bar X_{n-1} - \bar X_n) + (X_n - \bar X_n)^2}{n-1},
$$
and it is easy to show that the summation term above is equal to $0$, which gives
$$
s^2_n = \frac{(n-2)s^2_{n-1} + (n-1)(\bar X_{n-1} - \bar X_n)^2 + (X_n - \bar X_n)^2}{n-1}.
$$
EDIT: I assumed you already have an incremental expression for the sample mean. That is much easier to get: $\bar X_n = n^{-1}[X_n + (n-1)\bar X_{n-1}]$.

answered Jan 27 '12 at 17:45 by guy (edited Jan 27 '12 at 18:03)
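A minimal Python sketch of this recursion (the function name `update_stats` and the way the state is threaded through are my own, not from the answer), carrying the count, the running mean, and the running sample variance:

```python
def update_stats(n, mean, s2, x_new):
    """Incremental update of the sample mean and sample variance (denominator n-1).

    n     -- number of samples seen so far (before x_new)
    mean  -- sample mean of the first n values
    s2    -- sample variance of the first n values (0.0 while n < 2)
    x_new -- the incoming value
    Returns (n + 1, new_mean, new_s2).
    """
    n_new = n + 1
    # xbar_n = [X_n + (n-1) xbar_{n-1}] / n
    new_mean = (x_new + n * mean) / n_new
    if n_new < 2:
        return n_new, new_mean, 0.0
    # s2_n = [(n-2) s2_{n-1} + (n-1)(xbar_{n-1} - xbar_n)^2 + (X_n - xbar_n)^2] / (n-1)
    new_s2 = ((n_new - 2) * s2
              + (n_new - 1) * (mean - new_mean) ** 2
              + (x_new - new_mean) ** 2) / (n_new - 1)
    return n_new, new_mean, new_s2
```

Feeding the values 1, 2, 3 through this gives a sample variance of 1, matching the direct two-pass computation.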
Nice answer. Note that your formula can be simplified to $$s_n^2 = \frac{n-2}{n-1}s^2_{n-1} + \frac{1}{n}(X_n - \bar X_{n-1})^2.$$ – user940, Jan 27 '12 at 18:49

Thanks Guy for the great explanation (2); it is exactly what I was looking for. But can you please elaborate on how the summation term equals zero, as I seem to be missing a trick... – user55490, Jan 7 '13 at 18:09

@user55490: $$\sum_{j=1}^{n-1} (X_j - \bar X_{n-1}) = \sum_{j=1}^{n-1} X_j - (n-1)\bar X_{n-1} = (n-1)(\bar X_{n-1} - \bar X_{n-1}) = 0.$$ – Jean Hominal, Oct 26 '15 at 14:24

See also this Wikipedia post. – Geoffrey De Smet, Feb 1 '17 at 9:35

Please note that the simplified formula of @user940 can only be used if we are computing the sample variance. If we want to use a custom DDOF (delta degrees of freedom), the formula becomes $$s^2_n = \frac{(n-1-d)s^2_{n-1} + (n-1)(\bar X_{n-1} - \bar X_n)^2 + (X_n - \bar X_n)^2}{n-d},$$ where $d$ is the DDOF: usually $d = 0$ for the population variance and $d = 1$ for the sample variance. – user3019105, Feb 24 '18 at 18:41
The standard deviation is a function of the totals $T_{\alpha} = \sum_{i=1}^{N} x_i^{\alpha}$ for $\alpha = 0, 1, 2$, each of which can be calculated incrementally in an obvious way. In particular, $E[X] = T_1/T_0$ and $E[X^2] = T_2/T_0$, and the standard deviation is
$$
\sigma = \sqrt{\text{Var}[X]} = \sqrt{E[X^2] - E[X]^2} = \frac{1}{T_0}\sqrt{T_0 T_2 - T_1^2}.
$$
By maintaining totals of higher powers ($T_{\alpha}$ for $\alpha \ge 3$), you can derive similar "incremental" expressions for the skewness, kurtosis, and so on.

answered Jan 27 '12 at 15:37
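As a rough illustration (the class name `RunningStd` and its method names are mine, not from the answer), maintaining $T_0, T_1, T_2$ and reading off the population standard deviation might look like:

```python
import math

class RunningStd:
    """Running population standard deviation from the totals T0, T1, T2."""

    def __init__(self):
        self.t0 = 0      # count of samples (T0)
        self.t1 = 0.0    # sum of samples (T1)
        self.t2 = 0.0    # sum of squared samples (T2)

    def add(self, x):
        self.t0 += 1
        self.t1 += x
        self.t2 += x * x

    def std(self):
        # sigma = sqrt(E[X^2] - E[X]^2) = sqrt(T0*T2 - T1^2) / T0
        # (call only after at least one sample has been added)
        return math.sqrt(self.t0 * self.t2 - self.t1 ** 2) / self.t0
```

For the values 1, 2, 3 this yields $\sqrt{2/3} \approx 0.8165$, the population standard deviation discussed in the comments below.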
I think the @user wanted a formula that lets you incrementally calculate the standard deviation as you increase the data set in a way that is computationally efficient, not this. – guy, Jan 27 '12 at 17:52

This is actually computationally efficient, because you just need to keep the number of samples, the sum of the samples, and the sum of the squares of the samples. Once you need to "read" the standard deviation (or average), you calculate the square root, which costs two multiplies, one subtraction, one square root, and one divide. On most modern CPUs, this is highly efficient (and very cache friendly). – Jon Watte, Aug 1 '14 at 3:39

Furthermore, this method allows you to easily compute a running standard deviation as well, which is what I was looking for: in addition to adding the new value to the sum of values and the sum of squares, remove the oldest value from them as well. Sorry for the late comment, but I found this page after googling this issue, and I hope it will let future people find this more easily. – Quentin, Feb 11 '16 at 16:31

I have tried to implement this in R. However, I seem to be doing something wrong, as I don't get correct standard deviation values. E.g., for the three x values 1, 2, and 3 the standard deviation should be 1. $T_0 = 3$ ($1^0 + 2^0 + 3^0$), $T_1 = 6$ ($1^1 + 2^1 + 3^1$), $T_2 = 14$ ($1^2 + 2^2 + 3^2$). The final standard deviation value according to the formula above would be 0.8165, which is different from 1. Could you tell me where I am making a mistake? – Phil, Oct 18 '18 at 16:00

@Phil: the formula I gave, $\sqrt{E[X^2] - E[X]^2}$, is the population standard deviation. Remember that the sample standard deviation differs from this by a factor of $\sqrt{n/(n-1)}$. To get the sample standard deviation in your case you would multiply by $\sqrt{3/2}$, giving the value of $1$ that you were expecting. – mjqxxxx, Oct 22 '18 at 18:20
What you refer to as an incremental computation is very close to the computer scientist's notion of an online algorithm. There is in fact a well-known online algorithm for computing the variance, and thus the standard deviation, of a sequence of data, documented here.
Just for the record, the algorithm in the linked Wikipedia entry was implemented in an answer two years later. – Lee David Chung Lin, Feb 14 at 14:28
Another way of saying the above is that you need to keep a count, an incremental sum of values, and an incremental sum of squares.
Let $N$ be the count of values seen so far, $S = \sum_{i=1}^{N} x_i$ and $Q = \sum_{i=1}^{N} x_i^2$ (where both $S$ and $Q$ are maintained incrementally).
Then at any stage, the mean is $\frac{S}{N}$ and the variance is $\frac{Q}{N} - \left(\frac{S}{N}\right)^2$.
An advantage of this method is that it is less prone to rounding errors after a long series of calculations.
This is the best way to achieve the goal: single-pass calculation with a fixed amount of storage required. But is it mathematically valid? – Edward Ross, Aug 25 '16 at 13:12

Indeed (let $X = S/N$ be the mean): $$s^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - X)^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i^2 - 2x_i X + X^2) = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \frac{1}{N}\sum_{i=1}^{N} 2x_i X + \frac{1}{N}\sum_{i=1}^{N} X^2 = Q/N - 2X^2 + X^2 = Q/N - X^2 = Q/N - (S/N)^2.$$ – dennyrolling, Apr 26 '17 at 20:52
Forgive my poor math background; what I needed was the details! I've added my derivation here for anyone like me.
$$
s_n^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x}_n)^2}{n-1} \\
= \frac{\sum_{i=1}^{n}(x_i - \bar{x}_{n-1} + \bar{x}_{n-1} - \bar{x}_n)^2}{n-1} \\
= \frac{\sum_{i=1}^{n}(x_i - \bar{x}_{n-1})^2 + 2\sum_{i=1}^{n}(x_i - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + \sum_{i=1}^{n}(\bar{x}_{n-1} - \bar{x}_n)^2}{n-1} \\
= \frac{\left(\sum_{i=1}^{n-1}(x_i - \bar{x}_{n-1})^2 + (x_n - \bar{x}_{n-1})^2\right) + \left(2\sum_{i=1}^{n-1}(x_i - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + 2(x_n - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n)\right) + \sum_{i=1}^{n}(\bar{x}_{n-1} - \bar{x}_n)^2}{n-1} \\
= \frac{(n-2)s_{n-1}^2 + (x_n - \bar{x}_{n-1})^2 + 0 + 2(x_n - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + n(\bar{x}_{n-1} - \bar{x}_n)^2}{n-1} \\
= \frac{(n-2)s_{n-1}^2 + x_n^2 - 2x_n\bar{x}_{n-1} + \bar{x}_{n-1}^2 + 2x_n \bar{x}_{n-1} - 2x_n \bar{x}_n - 2\bar{x}_{n-1}^2 + 2\bar{x}_{n-1}\bar{x}_n + n\bar{x}_{n-1}^2 - 2n\bar{x}_{n-1}\bar{x}_n + n\bar{x}_n^2}{n-1} \\
= \frac{(n-2)s_{n-1}^2 + x_n^2 + \bar{x}_{n-1}^2 - 2x_n \bar{x}_n - 2\bar{x}_{n-1}^2 + 2\bar{x}_{n-1}\bar{x}_n + n\bar{x}_{n-1}^2 - 2n\bar{x}_{n-1}\bar{x}_n + n\bar{x}_n^2}{n-1} \\
= \frac{(n-2)s_{n-1}^2 + (x_n^2 - 2x_n\bar{x}_n + \bar{x}_n^2) + (n-1)(\bar{x}_{n-1}^2 - 2\bar{x}_{n-1}\bar{x}_n + \bar{x}_n^2)}{n-1} \\
= \frac{(n-2)s_{n-1}^2 + (n-1)(\bar{x}_{n-1} - \bar{x}_n)^2 + (x_n - \bar{x}_n)^2}{n-1} \\
= \frac{n-2}{n-1}s_{n-1}^2 + (\bar{x}_{n-1} - \bar{x}_n)^2 + \frac{(x_n - \bar{x}_n)^2}{n-1},
$$
and
$$
(\bar{x}_{n-1} - \bar{x}_n)^2 + \frac{(x_n - \bar{x}_n)^2}{n-1} \\
= \left(\bar{x}_{n-1} - \frac{x_n + (n-1)\bar{x}_{n-1}}{n}\right)^2 + \frac{1}{n-1}\left(x_n - \frac{x_n + (n-1)\bar{x}_{n-1}}{n}\right)^2 \\
= \frac{1}{n}(x_n - \bar{x}_{n-1})^2,
$$
so
$$
s_n^2 = \frac{n-2}{n-1}s_{n-1}^2 + \frac{1}{n}(x_n - \bar{x}_{n-1})^2.
$$
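The final simplified recursion can be sketched in Python as follows (the helper name `next_sample_var` is mine, not from the answer):

```python
def next_sample_var(n, prev_mean, prev_s2, x):
    """One step of s_n^2 = (n-2)/(n-1) * s_{n-1}^2 + (x_n - xbar_{n-1})^2 / n.

    n is the count *including* x (requires n >= 2); prev_mean and prev_s2 are
    the sample mean and sample variance of the first n-1 values.
    Returns (new_mean, new_s2).
    """
    s2 = (n - 2) / (n - 1) * prev_s2 + (x - prev_mean) ** 2 / n
    mean = (x + (n - 1) * prev_mean) / n  # xbar_n = [x_n + (n-1) xbar_{n-1}] / n
    return mean, s2
```

Starting from a single value 1 (mean 1, variance 0) and adding 2 and then 3 reproduces the expected sample variances 0.5 and 1.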
Awesome! Thanks! – user3019105, Feb 24 '18 at 18:33
If it is of any help, I found what seems to be a much nicer way here: Wikipedia – online algorithm.
The following is how I have implemented it in Java. It has worked very well for me. I hope it is clear enough; please feel free to ask me questions about it.

    static double sum_avg = 0.0, sum_sd = 0.0, avg = 0.0; // running state shared by both methods

    /**
     * standardDeviation() - incrementally calculates the standard deviation of a data set
     * using the last entered value and the previous sum of differences to the mean.
     * (i.e., upon adding a value to the data set this function should immediately be called)
     *
     * NOTE: do not call this function if the data set size is less than 2, since standard
     *       deviation cannot be calculated on a single value
     * NOTE: sum_avg, sum_sd and avg are all static variables
     * NOTE: on attempting to use this on another set following previous use, the static
     *       values will have to be reset
     *
     * @param vector - List<Double> - data with only one additional value since the previous call
     * @return updated value for the standard deviation
     */
    public static double standardDeviation(List<Double> vector)
    {
        double N = (double) vector.size();  // size of the data set
        double oldavg = avg;                // save the old average
        avg = updateAverage(vector);        // update the new average
        if (N == 2.0)                       // with only two values, use the standard formula
        {
            for (double d : vector)         // cycle through the 2 elements of the data set
            {
                sum_sd += Math.pow(d - avg, 2); // accumulate squared deviations from the mean
            }
        }
        else if (N > 2)                     // once we have calculated the base sum_sd
        {
            double newel = vector.get(vector.size() - 1); // get the latest addition to the data set
            sum_sd = sum_sd + (newel - oldavg) * (newel - avg); // https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm
        }
        return Math.sqrt(sum_sd / N);       // N or N-1 depending on your choice of population or sample standard deviation
    }

    /**
     * Simplistic method for incrementally calculating the mean of a data set.
     *
     * @param vector - List<Double> - data with only one additional value since the previous call
     * @return updated value for the mean of the given data set
     */
    public static double updateAverage(List<Double> vector)
    {
        if (vector.size() == 2) {
            sum_avg = vector.get(vector.size() - 1) + vector.get(vector.size() - 2);
        }
        else {
            sum_avg += vector.get(vector.size() - 1);
        }
        return sum_avg / (double) vector.size();
    }
This points to exactly the same source as the answer by @oldrinb two years earlier. The good thing here is that this is self-contained, while the older one is just a link (with a keyword reference). – Lee David Chung Lin, Feb 14 at 14:25
$begingroup$
I think the easiest way to do this is with an orthogonality trick. I'll show how to incrementally compute the variance instead of the standard deviation. Let $X_1, X_2, ...$ be an iid sequence of random variables with $bar X = n^{-1} sum_{j = 1} ^ n X_j$ and $s^2_n$ defined similarly as the $n$'th sample variance (I use a denominator of $n-1$ instead of $n$ in your picture to keep things unbiased for the variance, but you can use the same argument just adding $1$ to all the terms with $n$ in them). First write
$$
s^2_n = frac{sum_{j = 1} ^ n (X_j - bar X_n)^2}{n - 1}
= frac{sum_{j = 1} ^ n (X_j - bar X_{n - 1} + bar X_{n - 1} + bar X_n)^2}{n - 1}.
$$
Expand this to get $$
s^2_n = frac{(n - 2)s^2_{n - 1} + (n - 1) (bar X_{n - 1} - bar X_n)^2 + 2 sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1})(bar X_{n - 1} - bar X_n) + (X_n - bar X_{n} ^ 2)}{n - 1}
$$
and it is easy to show that the summation term above is equal to $0$ which gives $$
s^2_n = frac{(n - 2)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2
+ (X_n - bar X_{n})^2}{n - 1}.
$$
EDIT: I assumed you already have an incremental expression for the sample mean. It is much easier to get that: $bar X_n = n^{-1}[X_n + (n-1)bar X_{n-1}]$.
$endgroup$
18
$begingroup$
Nice answer. Note that your formula can be simplified to $$s_n^2={n-2over n-1}s^2_{n-1}+{1over n}(X_n-bar X_{n-1})^2.$$
$endgroup$
– user940
Jan 27 '12 at 18:49
1
$begingroup$
Thanks Guy for the great explaination (2). it is exactly what i was looking for. but can you please elaborate how the summation term equals zero as i seem to be missing a trick...
$endgroup$
– user55490
Jan 7 '13 at 18:09
1
$begingroup$
@user55490: $$ sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1}) = (sum_{j = 1} ^ {n - 1} (X_j) - (n - 1) bar X_{n - 1}) = (n - 1) (bar X_{n - 1} - bar X_{n - 1})$$
$endgroup$
– Jean Hominal
Oct 26 '15 at 14:24
2
$begingroup$
See also this wikipedia post
$endgroup$
– Geoffrey De Smet
Feb 1 '17 at 9:35
1
$begingroup$
Please, note that the simplified formula of @user940 can only be used if we are computing the sample variance. If we want to use a custom DDOF (Delta Degrees Of Freedom), the formula becomes: $$s^2_n = frac{(n - 1 - d)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2 + (X_n - bar X_{n})^2}{n - d}$$ Where $d$ is the DDOF, usually $d = 0$ for population variance and $d = 1$ for sample variance.
$endgroup$
– user3019105
Feb 24 '18 at 18:41
|
show 1 more comment
$begingroup$
I think the easiest way to do this is with an orthogonality trick. I'll show how to incrementally compute the variance instead of the standard deviation. Let $X_1, X_2, ...$ be an iid sequence of random variables with $bar X = n^{-1} sum_{j = 1} ^ n X_j$ and $s^2_n$ defined similarly as the $n$'th sample variance (I use a denominator of $n-1$ instead of $n$ in your picture to keep things unbiased for the variance, but you can use the same argument just adding $1$ to all the terms with $n$ in them). First write
$$
s^2_n = frac{sum_{j = 1} ^ n (X_j - bar X_n)^2}{n - 1}
= frac{sum_{j = 1} ^ n (X_j - bar X_{n - 1} + bar X_{n - 1} + bar X_n)^2}{n - 1}.
$$
Expand this to get $$
s^2_n = frac{(n - 2)s^2_{n - 1} + (n - 1) (bar X_{n - 1} - bar X_n)^2 + 2 sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1})(bar X_{n - 1} - bar X_n) + (X_n - bar X_{n} ^ 2)}{n - 1}
$$
and it is easy to show that the summation term above is equal to $0$ which gives $$
s^2_n = frac{(n - 2)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2
+ (X_n - bar X_{n})^2}{n - 1}.
$$
EDIT: I assumed you already have an incremental expression for the sample mean. It is much easier to get that: $bar X_n = n^{-1}[X_n + (n-1)bar X_{n-1}]$.
$endgroup$
18
$begingroup$
Nice answer. Note that your formula can be simplified to $$s_n^2={n-2over n-1}s^2_{n-1}+{1over n}(X_n-bar X_{n-1})^2.$$
$endgroup$
– user940
Jan 27 '12 at 18:49
1
$begingroup$
Thanks Guy for the great explaination (2). it is exactly what i was looking for. but can you please elaborate how the summation term equals zero as i seem to be missing a trick...
$endgroup$
– user55490
Jan 7 '13 at 18:09
1
$begingroup$
@user55490: $$ sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1}) = (sum_{j = 1} ^ {n - 1} (X_j) - (n - 1) bar X_{n - 1}) = (n - 1) (bar X_{n - 1} - bar X_{n - 1})$$
$endgroup$
– Jean Hominal
Oct 26 '15 at 14:24
2
$begingroup$
See also this wikipedia post
$endgroup$
– Geoffrey De Smet
Feb 1 '17 at 9:35
1
$begingroup$
Please, note that the simplified formula of @user940 can only be used if we are computing the sample variance. If we want to use a custom DDOF (Delta Degrees Of Freedom), the formula becomes: $$s^2_n = frac{(n - 1 - d)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2 + (X_n - bar X_{n})^2}{n - d}$$ Where $d$ is the DDOF, usually $d = 0$ for population variance and $d = 1$ for sample variance.
$endgroup$
– user3019105
Feb 24 '18 at 18:41
|
show 1 more comment
$begingroup$
I think the easiest way to do this is with an orthogonality trick. I'll show how to incrementally compute the variance instead of the standard deviation. Let $X_1, X_2, ...$ be an iid sequence of random variables with $bar X = n^{-1} sum_{j = 1} ^ n X_j$ and $s^2_n$ defined similarly as the $n$'th sample variance (I use a denominator of $n-1$ instead of $n$ in your picture to keep things unbiased for the variance, but you can use the same argument just adding $1$ to all the terms with $n$ in them). First write
$$
s^2_n = frac{sum_{j = 1} ^ n (X_j - bar X_n)^2}{n - 1}
= frac{sum_{j = 1} ^ n (X_j - bar X_{n - 1} + bar X_{n - 1} + bar X_n)^2}{n - 1}.
$$
Expand this to get $$
s^2_n = frac{(n - 2)s^2_{n - 1} + (n - 1) (bar X_{n - 1} - bar X_n)^2 + 2 sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1})(bar X_{n - 1} - bar X_n) + (X_n - bar X_{n} ^ 2)}{n - 1}
$$
and it is easy to show that the summation term above is equal to $0$ which gives $$
s^2_n = frac{(n - 2)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2
+ (X_n - bar X_{n})^2}{n - 1}.
$$
EDIT: I assumed you already have an incremental expression for the sample mean. It is much easier to get that: $bar X_n = n^{-1}[X_n + (n-1)bar X_{n-1}]$.
$endgroup$
I think the easiest way to do this is with an orthogonality trick. I'll show how to incrementally compute the variance instead of the standard deviation. Let $X_1, X_2, ...$ be an iid sequence of random variables with $bar X = n^{-1} sum_{j = 1} ^ n X_j$ and $s^2_n$ defined similarly as the $n$'th sample variance (I use a denominator of $n-1$ instead of $n$ in your picture to keep things unbiased for the variance, but you can use the same argument just adding $1$ to all the terms with $n$ in them). First write
$$
s^2_n = frac{sum_{j = 1} ^ n (X_j - bar X_n)^2}{n - 1}
= frac{sum_{j = 1} ^ n (X_j - bar X_{n - 1} + bar X_{n - 1} + bar X_n)^2}{n - 1}.
$$
Expand this to get $$
s^2_n = frac{(n - 2)s^2_{n - 1} + (n - 1) (bar X_{n - 1} - bar X_n)^2 + 2 sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1})(bar X_{n - 1} - bar X_n) + (X_n - bar X_{n} ^ 2)}{n - 1}
$$
and it is easy to show that the summation term above is equal to $0$ which gives $$
s^2_n = frac{(n - 2)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2
+ (X_n - bar X_{n})^2}{n - 1}.
$$
EDIT: I assumed you already have an incremental expression for the sample mean. It is much easier to get that: $bar X_n = n^{-1}[X_n + (n-1)bar X_{n-1}]$.
edited Jan 27 '12 at 18:03
answered Jan 27 '12 at 17:45
guyguy
2,62311827
2,62311827
18
$begingroup$
Nice answer. Note that your formula can be simplified to $$s_n^2={n-2over n-1}s^2_{n-1}+{1over n}(X_n-bar X_{n-1})^2.$$
$endgroup$
– user940
Jan 27 '12 at 18:49
1
$begingroup$
Thanks Guy for the great explaination (2). it is exactly what i was looking for. but can you please elaborate how the summation term equals zero as i seem to be missing a trick...
$endgroup$
– user55490
Jan 7 '13 at 18:09
1
$begingroup$
@user55490: $$ sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1}) = (sum_{j = 1} ^ {n - 1} (X_j) - (n - 1) bar X_{n - 1}) = (n - 1) (bar X_{n - 1} - bar X_{n - 1})$$
$endgroup$
– Jean Hominal
Oct 26 '15 at 14:24
2
$begingroup$
See also this wikipedia post
$endgroup$
– Geoffrey De Smet
Feb 1 '17 at 9:35
1
$begingroup$
Please, note that the simplified formula of @user940 can only be used if we are computing the sample variance. If we want to use a custom DDOF (Delta Degrees Of Freedom), the formula becomes: $$s^2_n = frac{(n - 1 - d)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2 + (X_n - bar X_{n})^2}{n - d}$$ Where $d$ is the DDOF, usually $d = 0$ for population variance and $d = 1$ for sample variance.
$endgroup$
– user3019105
Feb 24 '18 at 18:41
|
show 1 more comment
18
$begingroup$
Nice answer. Note that your formula can be simplified to $$s_n^2={n-2over n-1}s^2_{n-1}+{1over n}(X_n-bar X_{n-1})^2.$$
$endgroup$
– user940
Jan 27 '12 at 18:49
1
$begingroup$
Thanks Guy for the great explaination (2). it is exactly what i was looking for. but can you please elaborate how the summation term equals zero as i seem to be missing a trick...
$endgroup$
– user55490
Jan 7 '13 at 18:09
1
$begingroup$
@user55490: $$ sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1}) = (sum_{j = 1} ^ {n - 1} (X_j) - (n - 1) bar X_{n - 1}) = (n - 1) (bar X_{n - 1} - bar X_{n - 1})$$
$endgroup$
– Jean Hominal
Oct 26 '15 at 14:24
2
$begingroup$
See also this wikipedia post
$endgroup$
– Geoffrey De Smet
Feb 1 '17 at 9:35
1
$begingroup$
Please, note that the simplified formula of @user940 can only be used if we are computing the sample variance. If we want to use a custom DDOF (Delta Degrees Of Freedom), the formula becomes: $$s^2_n = frac{(n - 1 - d)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2 + (X_n - bar X_{n})^2}{n - d}$$ Where $d$ is the DDOF, usually $d = 0$ for population variance and $d = 1$ for sample variance.
$endgroup$
– user3019105
Feb 24 '18 at 18:41
18
18
$begingroup$
Nice answer. Note that your formula can be simplified to $$s_n^2={n-2over n-1}s^2_{n-1}+{1over n}(X_n-bar X_{n-1})^2.$$
$endgroup$
– user940
Jan 27 '12 at 18:49
$begingroup$
Nice answer. Note that your formula can be simplified to $$s_n^2={n-2over n-1}s^2_{n-1}+{1over n}(X_n-bar X_{n-1})^2.$$
$endgroup$
– user940
Jan 27 '12 at 18:49
1
1
$begingroup$
Thanks Guy for the great explaination (2). it is exactly what i was looking for. but can you please elaborate how the summation term equals zero as i seem to be missing a trick...
$endgroup$
– user55490
Jan 7 '13 at 18:09
$begingroup$
Thanks Guy for the great explaination (2). it is exactly what i was looking for. but can you please elaborate how the summation term equals zero as i seem to be missing a trick...
$endgroup$
– user55490
Jan 7 '13 at 18:09
1
1
$begingroup$
@user55490: $$ sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1}) = (sum_{j = 1} ^ {n - 1} (X_j) - (n - 1) bar X_{n - 1}) = (n - 1) (bar X_{n - 1} - bar X_{n - 1})$$
$endgroup$
– Jean Hominal
Oct 26 '15 at 14:24
$begingroup$
@user55490: $$ sum_{j = 1} ^ {n - 1} (X_j - bar X_{n - 1}) = (sum_{j = 1} ^ {n - 1} (X_j) - (n - 1) bar X_{n - 1}) = (n - 1) (bar X_{n - 1} - bar X_{n - 1})$$
$endgroup$
– Jean Hominal
Oct 26 '15 at 14:24
2
2
$begingroup$
See also this wikipedia post
$endgroup$
– Geoffrey De Smet
Feb 1 '17 at 9:35
$begingroup$
See also this wikipedia post
$endgroup$
– Geoffrey De Smet
Feb 1 '17 at 9:35
1
1
$begingroup$
Please, note that the simplified formula of @user940 can only be used if we are computing the sample variance. If we want to use a custom DDOF (Delta Degrees Of Freedom), the formula becomes: $$s^2_n = frac{(n - 1 - d)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2 + (X_n - bar X_{n})^2}{n - d}$$ Where $d$ is the DDOF, usually $d = 0$ for population variance and $d = 1$ for sample variance.
$endgroup$
– user3019105
Feb 24 '18 at 18:41
$begingroup$
Please, note that the simplified formula of @user940 can only be used if we are computing the sample variance. If we want to use a custom DDOF (Delta Degrees Of Freedom), the formula becomes: $$s^2_n = frac{(n - 1 - d)s^2_{n - 1} + (n - 1)(bar X_{n - 1} - bar X_n)^2 + (X_n - bar X_{n})^2}{n - d}$$ Where $d$ is the DDOF, usually $d = 0$ for population variance and $d = 1$ for sample variance.
$endgroup$
– user3019105
Feb 24 '18 at 18:41
|
show 1 more comment
$begingroup$
The standard deviation is a function of the totals $T_{alpha}=sum_{i=1}^{N}x_{i}^{alpha}$ for $alpha=0,1,2$, each of which can be calculated incrementally in an obvious way. In particular, $E[X]=T_{1}/T_{0}$ and $E[X^2]=T_{2}/T_{0}$, and the standard deviation is
$$
sigma = sqrt{text{Var}[X]} = sqrt{E[X^2]-E[X]^2} = frac{1}{T_0}sqrt{T_{0}T_{2}-T_{1}^2}.
$$
By maintaining totals of higher powers ($T_{alpha}$ for $alpha ge 3$), you can derive similar "incremental" expressions for the skewness, kurtosis, and so on.
$endgroup$
1
$begingroup$
I think the @user wanted a formula that lets you incrementally calculate the standard deviation as you increase the data set in a way that is computationally efficient, not this.
$endgroup$
– guy
Jan 27 '12 at 17:52
1
$begingroup$
This is actually computationally efficient, because you just need to count the number of samples, the sum of the samples, and the sum of the squares of each sample. Once you need to "read" the standard deviation (or average,) you calculate the square root, which costs two multuplies, one subtraction, one square root, and one divide. On most modern CPUs, this is highly efficient (and very cache friendly.)
$endgroup$
– Jon Watte
Aug 1 '14 at 3:39
$begingroup$
Furthermore, this methods allows to compute easily a running standard deviation as well, which is what i was looking for. in addition to adding the new value to the sum of values and squares, remove the oldest value from them as well. Sorry for the late comment, but I found this page after googling this issue and I hope it will let future people find this more easily.
$endgroup$
– Quentin
Feb 11 '16 at 16:31
$begingroup$
I have tried to implement this in R. However, I seem to am doing something wrong as I don't get correct standard deviation values. E.g. for the three x values 1, 2, and 3 the standard deviation should be 1. T0 = 3 (1^0 + 2^0 + 3^0), T1 = 6 (1^1 + 2^1 + 3^1), T2 = 14 (1^2 + 2^2 + 3^2). The final standard deviation value according to the formula above would be 0.8165 which is different from 1. Could you tell me where I am making a mistake?
$endgroup$
– Phil
Oct 18 '18 at 16:00
1
$begingroup$
@Phil: the formula I gave, $sqrt{E[X^2]-E[X]^2}$, is the population standard deviation. Remember that the sample standard deviation differs from this by a factor of $sqrt{n/(n-1)}$. To get the sample standard deviation in your case you would multiply by $sqrt{3/2}$, giving the value of $1$ that you were expecting.
$endgroup$
– mjqxxxx
Oct 22 '18 at 18:20
add a comment |
$begingroup$
The standard deviation is a function of the totals $T_{alpha}=sum_{i=1}^{N}x_{i}^{alpha}$ for $alpha=0,1,2$, each of which can be calculated incrementally in an obvious way. In particular, $E[X]=T_{1}/T_{0}$ and $E[X^2]=T_{2}/T_{0}$, and the standard deviation is
$$
sigma = sqrt{text{Var}[X]} = sqrt{E[X^2]-E[X]^2} = frac{1}{T_0}sqrt{T_{0}T_{2}-T_{1}^2}.
$$
By maintaining totals of higher powers ($T_{alpha}$ for $alpha ge 3$), you can derive similar "incremental" expressions for the skewness, kurtosis, and so on.
$endgroup$
1
$begingroup$
I think the @user wanted a formula that lets you incrementally calculate the standard deviation as you increase the data set in a way that is computationally efficient, not this.
$endgroup$
– guy
Jan 27 '12 at 17:52
1
$begingroup$
This is actually computationally efficient, because you just need to count the number of samples, the sum of the samples, and the sum of the squares of each sample. Once you need to "read" the standard deviation (or average,) you calculate the square root, which costs two multuplies, one subtraction, one square root, and one divide. On most modern CPUs, this is highly efficient (and very cache friendly.)
$endgroup$
– Jon Watte
Aug 1 '14 at 3:39
$begingroup$
Furthermore, this methods allows to compute easily a running standard deviation as well, which is what i was looking for. in addition to adding the new value to the sum of values and squares, remove the oldest value from them as well. Sorry for the late comment, but I found this page after googling this issue and I hope it will let future people find this more easily.
$endgroup$
– Quentin
Feb 11 '16 at 16:31
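A sketch of the running (windowed) variant this comment describes, in Java (class and method names are my own): the newest value is added to the running sums and, once the window is full, the oldest value is subtracted back out.

```java
import java.util.ArrayDeque;

// Running standard deviation over a fixed-size sliding window,
// maintaining only the window contents, their sum, and their sum of squares.
public class WindowedStdDev {
    private final int capacity;
    private final ArrayDeque<Double> window = new ArrayDeque<>();
    private double sum = 0, sumSq = 0;

    public WindowedStdDev(int capacity) { this.capacity = capacity; }

    public void add(double x) {
        window.addLast(x);
        sum += x;
        sumSq += x * x;
        if (window.size() > capacity) {       // evict the oldest value
            double old = window.removeFirst();
            sum -= old;
            sumSq -= old * old;
        }
    }

    public double stdDev() {                  // population std dev of the window
        double n = window.size();
        return Math.sqrt(sumSq / n - (sum / n) * (sum / n));
    }

    public static void main(String[] args) {
        WindowedStdDev w = new WindowedStdDev(3);
        for (double x : new double[]{1, 2, 3, 4}) w.add(x);
        System.out.println(w.stdDev()); // window is {2, 3, 4} after evicting 1
    }
}
```

Note that repeatedly adding and subtracting from the running sums can accumulate floating-point error over very long streams, so this is a sketch rather than a numerically hardened implementation.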
$begingroup$
I have tried to implement this in R. However, I seem to be doing something wrong as I don't get correct standard deviation values. E.g. for the three x values 1, 2, and 3 the standard deviation should be 1. T0 = 3 (1^0 + 2^0 + 3^0), T1 = 6 (1^1 + 2^1 + 3^1), T2 = 14 (1^2 + 2^2 + 3^2). The final standard deviation value according to the formula above would be 0.8165, which is different from 1. Could you tell me where I am making a mistake?
$endgroup$
– Phil
Oct 18 '18 at 16:00
answered Jan 27 '12 at 15:37
– mjqxxxx
$begingroup$
What you refer to as an incremental computation is very close to the computer scientist's notion of an online algorithm. There is in fact a well-known online algorithm for computing the variance, and thus the standard deviation, of a sequence of data, documented here.
$endgroup$
$begingroup$
Just for the record, the algorithm in the linked Wikipedia entry was implemented in an answer two years later.
$endgroup$
– Lee David Chung Lin
Feb 14 at 14:28
answered Jul 31 '15 at 2:36
– oldrinb
$begingroup$
Another way of saying the above is that you need to keep a count, an incremental sum of values and an incremental sum of squares.
Let $N$ be the count of values seen so far, $S = \sum_{i=1}^{N} x_i$ and $Q = \sum_{i=1}^{N} x_i^2$ (where both $S$ and $Q$ are maintained incrementally).
Then at any stage,
the mean is $\frac{S}{N}$
and the variance is $\frac{Q}{N} - \left( \frac{S}{N} \right)^2$
An advantage of this method is that it is less prone to rounding errors after a long series of calculations.
$endgroup$
$begingroup$
This is the best way to achieve the goal - a single-pass calculation with a fixed amount of storage required. But is it mathematically valid?
$endgroup$
– Edward Ross
Aug 25 '16 at 13:12
$begingroup$
indeed (let $X=S/N$ be the mean): $s^2=\frac{1}{N}\sum_{i=1}^{N} (x_i-X)^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i^2 - 2x_iX+X^2)=\frac{1}{N}\sum_{i=1}^{N} x_i^2-\frac{1}{N}\sum_{i=1}^{N} 2x_iX+\frac{1}{N}\sum_{i=1}^{N} X^2=Q/N-2X^2+X^2=Q/N-X^2=Q/N-(S/N)^2$
$endgroup$
– dennyrolling
Apr 26 '17 at 20:52
answered Jul 31 '15 at 2:32
– Philip Kilby
$begingroup$
Forgive my poor math background; what I needed was the detail!
I added my working here for someone like me.
$$
s_n^2=\frac {\sum_{i=1}^{n}(x_i-\bar{x}_n)^2}{n-1} \\
= \frac {\sum_{i=1}^n(x_i - \bar{x}_{n-1} + \bar{x}_{n-1} - \bar{x}_n)^2}{n-1} \\
= \frac {\sum_{i=1}^{n}(x_i - \bar{x}_{n-1})^2 + 2\sum_{i=1}^n(x_i - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + \sum_{i=1}^n(\bar{x}_{n-1} - \bar{x}_n)^2} {n-1} \\
= \frac {\left(\sum_{i=1}^{n-1}(x_i - \bar{x}_{n-1})^2 + (x_n - \bar{x}_{n-1})^2\right) + \left(2\sum_{i=1}^{n-1}(x_i - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + 2(x_n - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n)\right) + \sum_{i=1}^n(\bar{x}_{n-1} - \bar{x}_n)^2} {n-1} \\
= \frac {(n-2)s_{n-1}^2 + (x_n - \bar{x}_{n-1})^2 + 0 + 2(x_n - \bar{x}_{n-1})(\bar{x}_{n-1} - \bar{x}_n) + n(\bar{x}_{n-1} - \bar{x}_n)^2} {n-1} \\
= \frac {(n-2)s_{n-1}^2 + x_n^2 - 2x_n\bar{x}_{n-1} + \bar{x}_{n-1}^2 + 2x_n \bar{x}_{n-1} - 2x_n \bar{x}_n - 2\bar{x}_{n-1}^2 + 2\bar{x}_{n-1}\bar{x}_n + n\bar{x}_{n-1}^2 - 2n\bar{x}_{n-1}\bar{x}_n + n\bar{x}_n^2} {n-1} \\
= \frac {(n-2)s_{n-1}^2 + x_n^2 + \bar{x}_{n-1}^2 - 2x_n \bar{x}_n - 2\bar{x}_{n-1}^2 + 2\bar{x}_{n-1}\bar{x}_n + n\bar{x}_{n-1}^2 - 2n\bar{x}_{n-1}\bar{x}_n + n\bar{x}_n^2} {n-1} \\
= \frac {(n-2)s_{n-1}^2 + (x_n^2 - 2x_n\bar{x}_n + \bar{x}_n^2) + (n-1)(\bar{x}_{n-1}^2 - 2\bar{x}_{n-1}\bar{x}_n + \bar{x}_n^2)} {n-1} \\
= \frac {(n-2)s_{n-1}^2 + (n-1)(\bar{x}_{n-1} - \bar{x}_n)^2 + (x_n - \bar{x}_n)^2} {n-1} \\
= \frac {n-2}{n-1}s_{n-1}^2 + (\bar{x}_{n-1} - \bar{x}_n)^2 + \frac {(x_n - \bar{x}_n)^2}{n-1}
$$
and
$$
(\bar{x}_{n-1} - \bar{x}_n)^2 + \frac {(x_n - \bar{x}_n)^2}{n-1} \\
= \left(\bar{x}_{n-1} - \frac {x_n + (n-1)\bar{x}_{n-1}}{n}\right)^2 + \frac {\left(x_n - \frac {x_n + (n-1)\bar{x}_{n-1}}{n}\right)^2}{n-1} \\
= \frac {1}{n} (x_n - \bar{x}_{n-1})^2
$$
so
$$
s_n^2 = \frac{n-2}{n-1}s_{n-1}^2 + \frac{1}{n}(x_n - \bar{x}_{n-1})^2
$$
$endgroup$
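The closing recurrence can be coded directly; here is a short Java sketch (naming is my own) that updates the sample variance from only the previous mean and previous variance, with no stored history.

```java
// Incremental sample variance via the recurrence
//   s_n^2 = (n-2)/(n-1) * s_{n-1}^2 + (x_n - mean_{n-1})^2 / n
public class IncrementalSampleVariance {
    private long n = 0;
    private double mean = 0;
    private double s2 = 0; // sample variance (n-1 denominator)

    public void add(double x) {
        n++;
        double prevMean = mean;
        mean = prevMean + (x - prevMean) / n; // incremental mean update
        if (n >= 2) {
            s2 = ((double) (n - 2) / (n - 1)) * s2
               + (x - prevMean) * (x - prevMean) / n;
        }
    }

    public double sampleVariance() { return s2; }

    public double sampleStdDev() { return Math.sqrt(s2); }

    public static void main(String[] args) {
        IncrementalSampleVariance v = new IncrementalSampleVariance();
        for (double x : new double[]{1, 2, 3}) v.add(x);
        System.out.println(v.sampleStdDev()); // 1.0 for {1, 2, 3}
    }
}
```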
$begingroup$
Awesome! Thanks!
$endgroup$
– user3019105
Feb 24 '18 at 18:33
answered Sep 2 '17 at 11:38
– secfree
$begingroup$
If it is of any help, I found what seems to be a much nicer way here: Wikipedia - online algorithm.
The following is how I have implemented it in Java. It has worked very well for me. I hope it is clear enough; please feel free to ask me questions about it.
/**
* standardDeviation() - designed to calculate the standard deviation of a data set incrementally by taking the last entered value and the previous sum of differences to the mean recorded.
* (i.e; upon adding a value to the data set this function should immediately be called)
*
* NOTE: do not call this function if the data set size is less than 2, since standard deviation cannot be calculated on a single value
* NOTE: sum_avg, sum_sd and avg are all static variables
* NOTE: on attempting to use this on another set following previous use, the static values will have to be reset**
*
* @param vector - List<Double> - data with only one additional value from previous method call
* @return updated value for the Standard deviation
*/
public static double standardDeviation(List<Double> vector)
{
double N = (double) vector.size(); //size of the data set
double oldavg = avg; //save the old average
avg = updateAverage(vector); //update the new average
if(N==2.0) //if there are only two, we calculate the standard deviation using the standard formula
{
for(double d:vector) //cycle through the 2 elements of the data set - there is no need to use a loop here, the set is quite small to just do manually
{
sum_sd += Math.pow(d - avg, 2); //accumulate squared deviations from the mean (no Math.abs needed; squaring handles the sign)
}
}
else if(N>2) //once we have calculated the base sum_sd
{
double newel = (vector.get(vector.size()-1)); //get the latest addition to the data set
sum_sd = sum_sd + (newel - oldavg)*(newel-avg); //https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm
}
return Math.sqrt((sum_sd)/(N)); //N or N-1 depends on your choice of sample or population standard deviation
}
/**
* simplistic method for incrementally calculating the mean of a data set
*
* @param vector - List<Double> - data with only one additional value from previous method call
* @return updated value for the mean of the given data set
*/
public static double updateAverage(List<Double> vector)
{
if(vector.size()==2){
sum_avg = vector.get(vector.size()-1) + vector.get(vector.size()-2);
}
else{
sum_avg += vector.get(vector.size()-1);
}
return sum_avg/(double)vector.size();
}
$endgroup$
$begingroup$
This points to exactly the same source as the answer by @oldrinb two years earlier. The good thing here is that this is self-contained, while the older one is just a link (with a keyword reference).
$endgroup$
– Lee David Chung Lin
Feb 14 at 14:25
add a comment |
$begingroup$
If it is of any help I found what seems to be a much nicer way here wikipedia - online algorithm .
The following is how I have implemented it in java. It has work very well for me. Hope it is clear enough, Please feel free to ask me questions about it.
/**
* standardDeviation() - designed to calculate the standard deviation of a data set incrementally by taking the last entered value and the previous sum of differences to the mean recorded.
* (i.e; upon adding a value to the data set this function should immediately be called)
*
* NOTE: do not call this function if the data set size it less than 2 since standard deviation cannot be calculated on a single value
* NOTE: sum_avg, sum_sd and avg are all static variables
* NOTE: on attempting to use this on another set following previous use, the static values will have to be reset**
*
* @param vector - List<Double> - data with only one additional value from previous method call
* @return updated value for the Standard deviation
*/
public static double standardDeviation(List<Double> vector)
{
double N = (double) vector.size(); //size of the data set
double oldavg = avg; //save the old average
avg = updateAverage(vector); //update the new average
if(N==2.0) //if there are only two, we calculate the standard deviation using the standard formula
{
for(double d:vector) //cycle through the 2 elements of the data set - there is no need to use a loop here, the set is quite small to just do manually
{
sum_sd += Math.pow((Math.abs(d)-avg), 2); //sum the following according to the formula
}
}
else if(N>2) //once we have calculated the base sum_std
{
double newel = (vector.get(vector.size()-1)); //get the latest addition to the data set
sum_sd = sum_sd + (newel - oldavg)*(newel-avg); //https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm
}
return Math.sqrt((sum_sd)/(N)); //N or N-1 depends on your choice of sample of population standard deviation
}
/**
* simplistic method for incrementally calculating the mean of a data set
*
* @param vector - List<Double> - data with only one additional value from previous method call
* @return updated value for the mean of the given data set
*/
public static double updateAverage(List<Double> vector)
{
if(vector.size()==2){
sum_avg = vector.get(vector.size()-1) + vector.get(vector.size()-2);
}
else{
sum_avg += vector.get(vector.size()-1);
}
return sum_avg/(double)vector.size();
}
$endgroup$
$begingroup$
This points to exactly same source as the the answer by @oldrinb two years earlier. The good thing here is that this is self-contained while the older one is just a link (with a keyword reference).
$endgroup$
– Lee David Chung Lin
Feb 14 at 14:25
add a comment |
$begingroup$
If it is of any help I found what seems to be a much nicer way here wikipedia - online algorithm .
The following is how I have implemented it in java. It has work very well for me. Hope it is clear enough, Please feel free to ask me questions about it.
static double avg = 0.0, sum_avg = 0.0, sum_sd = 0.0; //running mean, running sum of values, and running sum of squared deviations
/**
* standardDeviation() - incrementally calculates the standard deviation of a data set by combining the last entered value with the previously recorded sum of squared deviations from the mean.
* (i.e., this function should be called immediately after adding a value to the data set)
*
* NOTE: do not call this function if the data set size is less than 2, since standard deviation cannot be calculated for a single value
* NOTE: sum_avg, sum_sd and avg are all static variables
* NOTE: before reusing this on another data set, the static variables have to be reset
*
* @param vector - List<Double> - data with only one additional value since the previous method call
* @return updated value for the standard deviation
*/
public static double standardDeviation(List<Double> vector)
{
double N = (double) vector.size(); //size of the data set
double oldavg = avg; //save the old average
avg = updateAverage(vector); //update the new average
if(N==2.0) //if there are only two values, calculate the standard deviation using the standard formula
{
for(double d:vector) //cycle through the 2 elements of the data set
{
sum_sd += Math.pow(d - avg, 2); //sum the squared deviations from the mean (the original Math.abs(d) would be wrong for negative data)
}
}
else if(N>2) //once the base sum_sd has been calculated
{
double newel = (vector.get(vector.size()-1)); //get the latest addition to the data set
sum_sd = sum_sd + (newel - oldavg)*(newel-avg); //https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm
}
return Math.sqrt((sum_sd)/(N)); //divide by N or N-1 depending on whether you want the population or the sample standard deviation
}
/**
* simplistic method for incrementally calculating the mean of a data set
*
* @param vector - List<Double> - data with only one additional value from previous method call
* @return updated value for the mean of the given data set
*/
public static double updateAverage(List<Double> vector)
{
if(vector.size()==2){
sum_avg = vector.get(vector.size()-1) + vector.get(vector.size()-2);
}
else{
sum_avg += vector.get(vector.size()-1);
}
return sum_avg/(double)vector.size();
}
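For comparison, the same Wikipedia update can be packaged without static state. Below is a minimal sketch of Welford's online algorithm; the `RunningStats` class and its method names are my own, not part of the answer above:

```java
//Welford's online algorithm: one pass, no stored list, numerically stable
class RunningStats {
    private long n = 0;        //number of values seen so far
    private double mean = 0.0; //running mean
    private double m2 = 0.0;   //running sum of squared deviations from the mean

    void add(double x) {
        n++;
        double delta = x - mean;  //deviation from the old mean
        mean += delta / n;        //update the mean incrementally
        m2 += delta * (x - mean); //same (x - oldMean)*(x - newMean) update as above
    }

    double mean() { return mean; }

    double populationStdDev() { //divide by N
        return n > 0 ? Math.sqrt(m2 / n) : 0.0;
    }

    double sampleStdDev() { //divide by N - 1
        return n > 1 ? Math.sqrt(m2 / (n - 1)) : 0.0;
    }
}
```

Unlike the static-variable version, each `RunningStats` instance carries its own state, so several data sets can be tracked at once and nothing has to be reset between uses.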
$endgroup$
edited Jul 12 '17 at 12:12
answered Jul 12 '17 at 12:02
KevinCorcor
112
$begingroup$
This points to exactly the same source as the answer by @oldrinb two years earlier. The good thing here is that this is self-contained, while the older one is just a link (with a keyword reference).
$endgroup$
– Lee David Chung Lin
Feb 14 at 14:25