<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards a Statistical System Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Heidergott</surname>
            <given-names>Bernd</given-names>
          </name>
          <role>Professor of Stochastic Optimization</role>
        </contrib>
        <aff id="aff0">
          <institution>Department of Econometrics and Operations Research, VU Amsterdam</institution>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <fpage>16</fpage>
      <lpage>17</lpage>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>Academic applied probability/operations research is mainly focused on the
mathematical analysis of models that find their motivation in the outside (read, non-academic)
world. In preparing a real-life problem for mathematical analysis, a "model" has to be
distilled, and once this is done, reality is replaced by this model, which is subsequently
analyzed with much energy and analytical rigor. However, hardly ever are the exact
model specifications known, and the defining parameters of the model under consideration,
such as arrival rates in queueing networks, failure rates of servers in reliability models,
or demand rates in inventory systems, are only revealed to the analyst through statistics. The
classical approach for dealing with such parameter insecurity is to integrate out the
system performance with respect to the assumed/estimated distribution of the unknown
parameter.</p>
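      <p>To make the classical approach concrete, the following Python sketch integrates out a system performance measure with respect to an assumed distribution of the unknown parameter. The M/M/1 waiting-time formula and the uniform distributional model for the arrival rate are illustrative assumptions of this sketch, not part of the lecture.</p>

```python
import random

def mean_wait(lam, mu):
    """Mean waiting time in an M/M/1 queue (requires lam below mu)."""
    return lam / (mu * (mu - lam))

def integrate_out(perf, lam_samples, mu):
    """Classical approach: average the performance measure over the
    assumed/estimated distribution of the unknown arrival rate."""
    vals = [perf(lam, mu) for lam in lam_samples]
    return sum(vals) / len(vals)

random.seed(0)
mu = 1.0
# illustrative distributional model for the unknown arrival rate:
# lambda is only known to lie around 0.5, say Uniform(0.4, 0.6)
lam_samples = [random.uniform(0.4, 0.6) for _ in range(100_000)]
print(integrate_out(mean_wait, lam_samples, mu))
```

      <p>Note that, because the waiting time is convex in the arrival rate, the integrated-out value exceeds the waiting time at the point estimate 0.5: the classical approach already captures part of the risk incurred by not knowing the parameter.</p>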
      <p>We believe that, in order to achieve a better understanding of model/parameter
insecurity, a closer look into the way "randomness" is used in the analysis of a given model is
of importance. Randomness is a ubiquitous phenomenon. Without going too much into
detail on the philosophical aspects of the concepts of randomness and probability, one can
loosely state that randomness is encountered as (1) lack of knowledge, or (2) variability in
repeated realizations of a phenomenon. For example, (1) covers so-called parameter
insecurity and/or model insecurity. Indeed, often either the true distribution of a random
variable used in a model is not known (= model insecurity), or the distributional
parameters, such as the mean, variance, etc., are not known (= parameter insecurity). Statistics can
then be used to narrow down the possible range of distribution models or the range of
parameter values, but reaching certainty is epistemologically impossible. This is in contrast to (2),
where, in principle, laws of large numbers and ergodic theorems are available that allow one to
produce reliable measurements for which mathematically supported quality assessments
are possible. The concept described in (1) relates to subjective probabilities, whereas
(2) relates to the frequentist interpretation of probability. Consider, for the sake of
exposition, the following simple problem. Let X_θ be a random variable with cumulative
distribution function F_θ, where θ denotes a parameter of the distribution, for example,
the mean or the variance. Suppose we are interested in estimating the mean value of X_θ,
denoted by μ(θ), and we perform a computer simulation to sample n independently and
identically distributed realizations of X_θ, denoted by X_θ(i), 1 ≤ i ≤ n. Then, the sample
average</p>
      <p>X̄_θ = (1/n) ∑_{i=1}^{n} X_θ(i)</p>
      <p>is a natural estimator for the mean. This estimator is, however, deceptively simple, as it
assumes that θ is known, i.e., that we know the correct value of θ. Now assume that we do
not know the exact value of θ. To formalize this, let θ₀ denote the true value of θ and suppose
that, due to lack of better knowledge, we use θ₁ in the simulation. Then, the error we
make in estimating μ(θ₀) is</p>
      <p>μ(θ₀) − X̄_{θ₁} = [μ(θ₀) − μ(θ₁)] + [μ(θ₁) − X̄_{θ₁}],
where the first error is due to our lack of knowledge and the second error can be controlled
through the sample size. Much research in applied probability and statistics is targeted
at reducing the second error. Although the presence of the type (1) error is acknowledged
in the literature, how to deal with the type (1) error is still an open question. The area of
performability analysis is devoted to finding models for the lack of knowledge based on
entropy. The starting point is expert knowledge about the typical behavior of θ; taking the
distributional model that maximizes the entropy with respect to the predefined
characteristics then provides a distributional model for θ, i.e., θ is now considered a random variable.
Alternatively, a statistical estimator for θ may be available. Then, the sample distribution
of the estimator can be used as a distributional model for θ. Consider, for example, the case
where θ is estimated through a sample mean of independently and identically distributed
random variables, and assume that the sample size is sufficient for the central limit
theorem to apply; then we can model θ as θ(ω) = θ̂ + N(ω), where θ̂ is the sample
average, i.e., the point estimator, and N(ω) is a standard normal random variable. Put differently,
statistics allows us to build a distributional model for θ.</p>
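      <p>A minimal numerical illustration of this error decomposition, with the illustrative choices θ₀ = 2.0, θ₁ = 1.8, and X_θ exponentially distributed with mean θ (these values and the distributional form are ours, for exposition only):</p>

```python
import random

random.seed(1)

theta0 = 2.0   # true (unknown) parameter: the mean of the model
theta1 = 1.8   # value used in the simulation, for lack of better knowledge
n = 100_000

# sample average of n i.i.d. realizations of X with mean theta1
# (random.expovariate takes the rate, i.e. 1/mean)
xbar = sum(random.expovariate(1.0 / theta1) for _ in range(n)) / n

knowledge_error = theta0 - theta1   # mu(theta0) - mu(theta1): fixed bias
sampling_error = theta1 - xbar      # mu(theta1) - xbar: shrinks with n
total_error = theta0 - xbar         # the two terms sum to the total error

print(knowledge_error, sampling_error, total_error)
```

      <p>No matter how large n is chosen, the first term remains; only the second term vanishes.</p>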
      <p>This lecture will elaborate on the above distributional model for parameter insecurity
and is aimed at stimulating a discussion on the relation between statistics and applied
probability/operations research. The lecture will advocate supporting the analyst by
studying the risk incurred by parameter insecurity. Rather than taking an entirely
statistical point of view and dismissing "model building" altogether, we want to integrate the
data-driven statistical nature of model building into the analytical treatment. We will
discuss an analytical framework for doing so that allows for separating (i) the (analytical)
analysis of the system from (ii) the statistical model for the parameter insecurity. We
present a series of numerical examples illustrating our approach.</p>
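      <p>The separation of (i) and (ii) can be sketched as follows. The queueing formula, the point estimate, and its standard error below are hypothetical placeholders of this sketch, not results from the lecture:</p>

```python
import random

def system_performance(theta):
    """Step (i), analytical: mean number of customers in an M/M/1 queue
    with service rate 1 and arrival rate theta (requires theta below 1)."""
    return theta / (1.0 - theta)

random.seed(2)

# Step (ii), statistical: distributional model theta(w) = theta_hat + se * N(w),
# with a hypothetical point estimate and standard error obtained from data
theta_hat, se = 0.5, 0.02
thetas = [theta_hat + se * random.gauss(0.0, 1.0) for _ in range(50_000)]

# propagate the parameter insecurity through the analytical model
perf = sorted(system_performance(t) for t in thetas)
mean_perf = sum(perf) / len(perf)
quantile_95 = perf[int(0.95 * len(perf))]   # risk incurred by the insecurity

print(mean_perf, quantile_95)
```

      <p>Because the two steps are separated, the analytical model in step (i) can be refined without touching the statistical model for θ in step (ii), and vice versa.</p>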
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>