Mean and Variance of a Dataset
When we make a measurement we are trying to determine the true value of a particular characteristic of some material or process. However, the measurements that we make involve uncertainty (i.e., no measurement is perfect). Therefore, the value we obtain is not the true value but some estimate of the true value. If we measure the same sample 20 times, we typically do not expect to obtain the same measured value every time. We will get a distribution of measured values. For example, Figure 1 shows a plot of the results from 20 measurements of the same sample. We need to be able to characterize the true value of the sample (i.e., determine our best estimate of the true value) based on this distribution.
If we plot the frequency of each particular measured value, then we acquire the plot shown in Figure 2. This distribution suggests that the next measurement that we make is likely to be close to 8 or 9, but there is some chance that it will be higher or lower than that. However, it is very unlikely that it will be lower than 3 or higher than 16.
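As a minimal sketch of how such a frequency distribution can be tallied, the Python snippet below counts how often each value occurs and prints a rough text histogram. The `measurements` list and the helper name `frequency_table` are hypothetical, chosen only for illustration; they are not the data plotted in Figures 1 and 2.

```python
from collections import Counter

# Hypothetical measured values (NOT the data behind Figures 1 and 2).
measurements = [8, 9, 7, 10, 9, 8, 12, 6, 9, 8]


def frequency_table(values):
    """Count how many times each distinct value was measured, sorted by value."""
    return dict(sorted(Counter(values).items()))


table = frequency_table(measurements)

# Print a rough text histogram: one '#' per occurrence of each value.
for value, count in table.items():
    print(f"{value:>3} | {'#' * count}")
```

Values with the longest bars are those the next measurement is most likely to land near, which is the reading of Figure 2 described above.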
We characterize this dataset by calculating the experimental mean ($\bar{x}$), given by
$$\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n$$
where $x_n$ are the measured values for the $n = 1, 2, \ldots, N$ data points. In this example, N was equal to 20. This experimental mean tells us that "on average" we expect the true value for x to be close to $\bar{x}$. But how close would we expect the true value to be to $\bar{x}$? We can begin to estimate that by looking at the variance in the dataset.
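We characterize the spread in this dataset by calculating the experimental variance ($\sigma^2$) of the dataset, given by
$$\sigma^2 = \frac{1}{N-1} \sum_{n=1}^{N} (x_n - \bar{x})^2$$
where $\bar{x}$ is the experimental mean and N is the number of measurements.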
The standard deviation ($\sigma$) is the square root of the variance. A large standard deviation implies that if we make another measurement, then we will have a low confidence that it will be close to the mean. A small standard deviation means that if we make another measurement, then we have a high confidence that it will be close to the mean. One measure of the size of the standard deviation is given by the relative standard deviation (S), which is the ratio of the standard deviation $\sigma$ to the mean $\bar{x}$:
$$S = \frac{\sigma}{\bar{x}}$$
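The quantities defined so far can be computed in a few lines of Python. This is a sketch under stated assumptions: the variance is taken in its sample form (dividing by N - 1), the function name `summarize` is hypothetical, and the `readings` list is made-up data rather than the dataset of Figure 2.

```python
import math


def summarize(values):
    """Return (mean, variance, standard deviation, relative standard deviation)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements to estimate the variance")
    mean = sum(values) / n
    # Assumption: sample variance, i.e. N - 1 in the denominator.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    std_dev = math.sqrt(variance)
    # The relative standard deviation is undefined when the mean is zero.
    rel_std_dev = std_dev / mean
    return mean, variance, std_dev, rel_std_dev


# Hypothetical readings (NOT the data behind Figure 2).
readings = [8, 9, 7, 10, 9, 8, 12, 6]
mean, variance, std_dev, rel_std_dev = summarize(readings)
print(f"mean = {mean:.2f}, std dev = {std_dev:.2f}, relative std dev = {rel_std_dev:.1%}")
```

Python's standard library offers `statistics.mean` and `statistics.stdev` (which also divides by N - 1) for the same purpose; the explicit version is shown only to mirror the definitions in the text.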
For the dataset shown in Figure 2, the experimental mean is 8.80 and the standard deviation is 2.78. The relative standard deviation is 0.317 or 31.7%. This suggests that there is a large variation in the measured data points. If we make one more measurement, we would have a low confidence that it would be close to the mean.
The spread of this distribution depends on the uncertainty in the measured values, and specifically on the uncertainties of the measurement instrument used. Assume we have a sample whose true mass is 20.00 g. We make 155 measurements of this sample using an instrument that has low uncertainties. The value for each of the 155 measurements is shown in Figure 3. A frequency plot showing the distribution of these measured values is shown in Figure 4. The experimental mean of these values is 19.986 g and the standard deviation is 1.034 g. The relative standard deviation (S) of this dataset is 5.2%. Thus, if we make another measurement of this sample, we would have a high confidence that it would be close to the mean. Note that since we know the true value of the characteristic (20.00 g), we can confirm that the experimental mean differs from the true value by only a small amount (0.014 g).
Now assume that we make another 155 measurements of the same 20.00 g sample, but this time we use an instrument that has a higher uncertainty. The value for each of the 155 measurements is shown in Figure 5, and a frequency plot showing the distribution of these measured values is shown in Figure 6. The experimental mean of these values is 19.956 g and the standard deviation is 3.102 g. The relative standard deviation is 15.5%. Thus, if we perform an additional measurement, we have a lower confidence than in the previous example that it will be close to the experimental mean.
The examples above show how we can use repeated measurements of the same sample with the same instrument to determine characteristics of the measurement system. By measuring the same sample over and over again, we can determine the expected uncertainties for the measurement instrument and the shape of the distribution of measured values.
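The two-instrument comparison can be imitated with a small simulation. The sketch below rests on assumptions not stated in the text: instrument error is modeled as Gaussian, and the spreads of 1.0 g and 3.0 g are illustrative values chosen to resemble the two examples. The function names are hypothetical, and the simulated numbers will not reproduce the values in Figures 3 through 6.

```python
import random
import statistics

TRUE_MASS_G = 20.00       # true value of the sample, as in the examples above
N_MEASUREMENTS = 155      # number of repeated measurements, as in the examples


def simulate_measurements(true_value, instrument_sigma, n, seed):
    """Draw n simulated readings scattered around true_value.

    Assumption: instrument error is Gaussian with standard deviation
    instrument_sigma. The source text does not specify the error model.
    """
    rng = random.Random(seed)
    return [rng.gauss(true_value, instrument_sigma) for _ in range(n)]


def relative_std_dev(values):
    """Ratio of the sample standard deviation (N - 1 form) to the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Assumed instrument spreads (illustrative, not taken from the source).
low = simulate_measurements(TRUE_MASS_G, 1.0, N_MEASUREMENTS, seed=1)
high = simulate_measurements(TRUE_MASS_G, 3.0, N_MEASUREMENTS, seed=1)

for label, data in (("low-uncertainty", low), ("high-uncertainty", high)):
    print(
        f"{label}: mean = {statistics.mean(data):.3f} g, "
        f"std dev = {statistics.stdev(data):.3f} g, "
        f"relative std dev = {relative_std_dev(data):.1%}"
    )
```

Both simulated datasets should give a mean near 20 g, while the relative standard deviation of the second is roughly three times that of the first, mirroring the contrast between the two instruments described above.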