Mean and standard deviation

If you have a frequency table, you have one column of data values and one column of frequencies. Let \(x_i\) denote the \(i\)th data value and \(f_i\) its frequency. The formulas for the mean \(\mu\) and the standard deviation \(\sigma\) are:

\[\mu=\frac{\sum_{i=1}^{k}f_ix_i}{n} \hspace{1cm} \sigma=\sqrt{\frac{\sum_{i=1}^{k}f_i(x_i-\mu)^2}{n}} \]

where \(k\) is the number of distinct data values (the number of rows in the table) and \(n=\sum_{i=1}^{k}f_i\) is the total number of observations.
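As a minimal sketch of these two formulas, the Python snippet below computes \(\mu\) and \(\sigma\) from a small frequency table. The table and the function name `mean_and_sd` are made up for illustration; they are not taken from the spreadsheet shown later.

```python
from math import sqrt

def mean_and_sd(values, freqs):
    """Return (mu, sigma) for a frequency table, dividing by n (population formulas)."""
    n = sum(freqs)  # total number of observations
    mu = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mu) ** 2 for x, f in zip(values, freqs)) / n
    return mu, sqrt(variance)

# Hypothetical frequency table: value x_i occurs f_i times.
values = [1, 2, 3, 4]
freqs = [2, 3, 4, 1]

mu, sigma = mean_and_sd(values, freqs)
print(mu, sigma)
```

Here \(n=10\) and \(\sum f_ix_i=24\), so the mean is 2.4; the sum of the weighted squared deviations is 8.4, so \(\sigma=\sqrt{0.84}\).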

Expectation value and variance

If a random variable \(X\) can take the values \(x_i\) with probabilities \(p_i\), then the probability-weighted average of the outcomes is called the expectation value \(E[X]\).

\[E[X]=p_1x_1+p_2x_2+\ldots +p_kx_k=\sum_{i=1}^kp_ix_i\]

The value \(x_i\) deviates from the mean \(\mu=E[X]\) by \(|x_i-\mu |\). For computational reasons, it is easier to work with the square \((x_i-\mu)^2\) than with the absolute value. The expectation value of the squared deviations is called the variance \(Var[X]\).

\[Var[X]=p_1(x_1-\mu)^2+p_2(x_2-\mu)^2+\ldots +p_k(x_k-\mu)^2=\sum_{i=1}^kp_i(x_i-\mu)^2\]
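To make the two sums concrete, here is a small sketch for a fair six-sided die, an example chosen for illustration and not part of the original text. Exact fractions are used so that no rounding hides the result; the function names are hypothetical.

```python
from fractions import Fraction

def expectation(values, probs):
    """E[X]: sum of p_i * x_i."""
    return sum(p * x for x, p in zip(values, probs))

def variance(values, probs):
    """Var[X]: sum of p_i * (x_i - mu)^2, with mu = E[X]."""
    mu = expectation(values, probs)
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))

# Fair die: each face has probability 1/6.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

print(expectation(values, probs))  # 7/2
print(variance(values, probs))     # 35/12
```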

If you are given a frequency table with data values \(x_i\) and frequencies \(f_i\), and if \(n=\sum_{i=1}^{k}f_i\), then the probability of the outcome \(x_i\) is \(p_i=\frac{f_i}{n}\). From this we get that:

\[\mu=E[X]=\sum_{i=1}^kp_ix_i \hspace{1cm} \sigma^2=Var[X]=\sum_{i=1}^kp_i(x_i-\mu)^2 \]
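Written out, substituting \(p_i=\frac{f_i}{n}\) recovers the frequency-table formulas from the first section, since the constant factor \(\frac{1}{n}\) can be moved outside each sum:

\[\sum_{i=1}^kp_ix_i=\sum_{i=1}^k\frac{f_i}{n}x_i=\frac{\sum_{i=1}^{k}f_ix_i}{n} \hspace{1cm} \sum_{i=1}^kp_i(x_i-\mu)^2=\sum_{i=1}^k\frac{f_i}{n}(x_i-\mu)^2=\frac{\sum_{i=1}^{k}f_i(x_i-\mu)^2}{n} \]

Taking the square root of the second expression gives the standard deviation \(\sigma\).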

Statistical one variable analysis

As of GeoGebra 4.0 you can run a one-variable statistical analysis, which reports the mean and the standard deviation. The tool accepts either one column of data or several columns of grouped data. It cannot, however, handle a frequency table directly; if you have a frequency table, you must first turn it into a table of grouped data.
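The idea behind the conversion can be sketched outside GeoGebra. The Python snippet below repeats each value \(x_i\) exactly \(f_i\) times and then applies ordinary one-list statistics. It only illustrates the principle under the assumption that the converted table lists every observation individually; it does not reproduce GeoGebra's column layout, and the table is hypothetical.

```python
from statistics import mean, pstdev

def expand(values, freqs):
    """Turn a frequency table into a flat list with x_i repeated f_i times."""
    raw = []
    for x, f in zip(values, freqs):
        raw.extend([x] * f)
    return raw

# Hypothetical frequency table: value x_i occurs f_i times.
values = [1, 2, 3, 4]
freqs = [2, 3, 4, 1]

raw = expand(values, freqs)
print(raw)                     # [1, 1, 2, 2, 2, 3, 3, 3, 3, 4]
print(mean(raw), pstdev(raw))  # pstdev divides by n, matching the formula for sigma
```

`pstdev` is the population standard deviation (division by \(n\)); `statistics.stdev` would divide by \(n-1\) instead and give a slightly larger value.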

Image

In the picture above, columns A and B hold the frequency table, and columns D-J contain the corresponding grouped data.

Select columns D-J and choose the tool "One Variable Analysis".

Image

The window that pops up shows the summary statistics for the grouped data.

Image

You can see that the mean is 173.2492 and the standard deviation 14.0333.

Use the formulas in the spreadsheet

You can also use the formulas for the mean and the standard deviation directly in the spreadsheet.

Image

The formula in cell D2 is B2 (A2 - $C$12)², where the dollar signs in $C$12 make the reference to cell C12 absolute, so that it stays fixed when the formula is copied to other cells, while the relative references A2 and B2 are adjusted.

by Malin Christersson under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 Sweden License