Calculating Statistics

Related pages

Performance analysis, Data collector monitors, Data collector functions, Performance output, Performance options functions, IID data values

Data collector monitors extract numerical data from a net during simulations. The data is extracted by the observation, initialization, and stop data collector monitoring functions, and is then used to calculate statistics. The statistics calculated for a particular data collector are either untimed statistics or timed statistics (see below for more details).

The statistics that can be accessed from each data collector monitor are:

If timed statistics are calculated for a data collector monitor, then the following additional statistics are calculated:

There is support for calculating 90%, 95%, and 99% confidence intervals for averages. One of the performance options functions can be used to select which confidence interval levels should be calculated. Note that the confidence intervals for the average of the data values collected by a data collector monitor will be accurate only if the data values are independent and identically distributed (IID). For more information, see the help page for IID data values.
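
To illustrate what a confidence interval for an average looks like, here is a minimal Python sketch. This page does not state which formula the tool uses, so the sketch assumes a normal approximation (mean plus or minus z times the standard error), which is only one common choice and may differ from the tool's own calculation. The function name confidence_interval is hypothetical. The sample is the list of untimed example values given later on this page, treated as IID purely for illustration.

```python
import math
from statistics import NormalDist, fmean, stdev


def confidence_interval(values, level=0.95):
    """Return (lower, upper) bounds for the mean of `values`.

    Assumption: normal approximation with the sample standard deviation.
    This is an illustrative choice, not necessarily the tool's method.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    mean = fmean(values)
    std_err = stdev(values) / math.sqrt(n)
    # Two-sided interval: put (1 - level) / 2 of the probability in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * std_err
    return mean - half_width, mean + half_width


# Untimed example values from this page, assumed IID for illustration only.
sample = [0, 1, 0, 1, 1, 2, 1, 0, 0, 1, 0, 1, 1, 0, 0]
for level in (0.90, 0.95, 0.99):
    lower, upper = confidence_interval(sample, level)
    print(f"{level:.0%}: [{lower:.4f}, {upper:.4f}]")
```

A higher confidence level gives a wider interval around the same average. If the data values are not IID, an interval computed this way can be misleading, as noted above.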

All of the statistics mentioned above can be accessed using the Data collector functions.

In the following, let x_i, i=1..n, be the values that are returned by the observation, initialization, and stop functions for a data collector monitor.

Untimed Statistics

If untimed statistics are to be calculated for the data collector, then the sum and average of n values are calculated in the following way:

Sum_n = x_1 + x_2 + ... + x_n

Avrg_n = Sum_n / n

The remaining statistics are calculated in a similar way.

If a data collector observes the same value twice then the value influences the statistics twice, as expected.

The following figure shows an example of data values that are used to calculate untimed statistics.

Data for untimed statistics

The data values in the figure above are:

#x_i i
0 1
1 2
0 3
1 4
1 5
2 6
1 7
0 8
0 9
1 10
0 11
1 12
1 13
0 14
0 15

For these values sum=9 and avrg=0.6.
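
The untimed calculation can be reproduced with a few lines of Python. This is only a sketch of the formulas for Sum_n and Avrg_n given above. The function names untimed_sum and untimed_average are hypothetical and are not part of the data collector functions.

```python
def untimed_sum(values):
    """Sum_n = x_1 + x_2 + ... + x_n."""
    return sum(values)


def untimed_average(values):
    """Avrg_n = Sum_n / n."""
    if not values:
        raise ValueError("no values have been observed")
    return untimed_sum(values) / len(values)


# The x_i column of the table above, in order of observation.
observations = [0, 1, 0, 1, 1, 2, 1, 0, 0, 1, 0, 1, 1, 0, 0]

total = untimed_sum(observations)
average = untimed_average(observations)
print(f"sum={total} avrg={average}")
```

Repeated values are simply added again each time they are observed, so every observation carries the same weight.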

Timed Statistics

Timed statistics differ from untimed statistics in that an interval of time is used to weight each observed value. The figure below shows an example of the intervals of time that are associated with observed data values. The line segment after an observed value corresponds to the interval of time that is used to weight the observed value.

Data for timed statistics

Assume that data value x_i is extracted at time t_i, for i=1..n. The interval [t_i, t_{i+1}] is used to weight the value x_i, i.e. the weight of the value x_i is (t_{i+1} - t_i). For the most recent value x_n, the weight at time t>=t_n is (t - t_n). At precisely time t_i, x_i has no influence on the following statistics:

This is because the weight of the value is zero at that instant. For all times t>t_i, x_i does influence these statistics.

The (timed) sum and (timed) average of the n values at time t>=t_n are calculated as follows:

Sum_t = x_1*(t_2-t_1) + x_2*(t_3-t_2) + ... + x_n*(t-t_n)

Avrg_t = Sum_t/(t-t_1)

With timed statistics it is possible for a value to exist for zero time. In the figure above, the second observation of value 2 exists for zero time, as indicated by a missing line segment after the data value.

In contrast to the statistics mentioned above, the following statistics take into account all data values observed by the data collector, including those that are weighted with zero time:

The data values, time of observation, and time intervals for the figure above are:
#x_i t_i interval
0 3 17
1 20 2
0 22 5
1 27 9
2 36 3
1 39 4
0 43 6
1 49 2
2 51 0
1 51 2
0 53 14

For these values at time t=67, timed sum=25 and timed avrg=0.390625.
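
The timed calculation can be checked with a short Python sketch of the formulas for Sum_t and Avrg_t given above. The function names timed_sum and timed_average are hypothetical. Raising an error when t equals t_1 is a choice made for this sketch, since this page does not say how that case is handled.

```python
def timed_sum(observations, now):
    """Sum_t = x_1*(t_2-t_1) + ... + x_n*(now-t_n).

    `observations` is a list of (t_i, x_i) pairs in order of observation.
    """
    if not observations:
        raise ValueError("no values have been observed")
    times = [t for t, _ in observations]
    if now < times[-1]:
        raise ValueError("now must not be earlier than the last observation")
    # Each value is weighted until the next observation; the last one until `now`.
    interval_ends = times[1:] + [now]
    return sum(x * (end - t) for (t, x), end in zip(observations, interval_ends))


def timed_average(observations, now):
    """Avrg_t = Sum_t / (now - t_1)."""
    start = observations[0][0]
    if now == start:
        # Sketch-only choice: the average is undefined over a zero-length span.
        raise ValueError("no time has elapsed since the first observation")
    return timed_sum(observations, now) / (now - start)


# (t_i, x_i) pairs from the table above.
data = [(3, 0), (20, 1), (22, 0), (27, 1), (36, 2), (39, 1),
        (43, 0), (49, 1), (51, 2), (51, 1), (53, 0)]

t_sum = timed_sum(data, now=67)
t_avrg = timed_average(data, now=67)
print(f"timed sum={t_sum} timed avrg={t_avrg}")
```

The value 2 observed at time 51 is immediately followed by another observation at time 51, so its interval is 0 and it contributes nothing to the timed sum.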