<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Temporal Density Extrapolation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>
            <given-names>Georg</given-names>
            <surname>Krempl</surname>
          </string-name>
          <xref ref-type="aff" rid="aff0" />
        </contrib>
        <aff id="aff0">
          <institution>Knowledge Management &amp; Discovery, Otto-von-Guericke University Magdeburg</institution>
          ,
          <addr-line>Universitätsplatz 2, 39106 Magdeburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <abstract>
        <p>Mining evolving datastreams raises the question of how to extrapolate trends in the evolution of densities over time. While approaches for change diagnosis work well for interpolating spatio-temporal densities, they are not designed for extrapolation tasks. This work studies the temporal density extrapolation problem and sketches two approaches that address it. Both use a set of pseudo-points in combination with spatio-temporal kernel density estimation. The first, weight-extrapolating approach uses regression on the weights of stationary-located pseudo-points. The second, location-extrapolating approach extrapolates the trajectories of uniformly-weighted pseudo-points within the feature space.</p>
      </abstract>
      <kwd-group>
        <kwd>kernel density estimation</kwd>
        <kwd>density extrapolation</kwd>
        <kwd>density forecasting</kwd>
        <kwd>spatio-temporal density</kwd>
        <kwd>evolving datastreams</kwd>
        <kwd>nonstationary environments</kwd>
        <kwd>concept drift</kwd>
        <kwd>drift mining</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Resources</title>
      <p>https://kmd.cs.ovgu.de/res/driftmining</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>
        Density estimation methods, like kernel density estimation [
        <xref ref-type="bibr" rid="ref13 ref14">14, 13</xref>
        ], make it possible to learn
a model from instances observed at different positions in feature space, and
to use this model to estimate the density at any position within this feature
space. While the original work in [
        <xref ref-type="bibr" rid="ref13 ref14">14, 13</xref>
        ] is limited to spatial densities of a
stationary distribution, the approach was extended to spatio-temporal density
estimation of non-stationary distributions in [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. This so-called velocity density
estimation allows one to estimate and visualise trends in densities. However, these
existing approaches are not directly applicable for predicting the density at future
time points, for example to extrapolate trends in the evolution of densities and
to build classifiers that work with delayed label information [
        <xref ref-type="bibr" rid="ref10 ref5">10, 5</xref>
        ]. Such
temporal density extrapolation should predict the densities at spatio-temporal
coordinates in the future, given a sample of (historic) instances observed at
different positions in feature space and at different times in the past.
      </p>
      <p>We propose and study two approaches to address this problem. Both use
extrapolation of pseudo-points in combination with spatio-temporal kernel
density estimation. The first approach extrapolates the weights of stationary-located
pseudo-points, while the second extrapolates the path of moving pseudo-points of
fixed weight. Subsequently, these weight- or position-extrapolated pseudo-points
are used in a spatio-temporal kernel density estimation.</p>
      <p>Copyright © 2015 for this paper by its authors. Copying permitted for private and academic
purposes.</p>
      <p>In the following Section 2, we review the related work, before sketching the
two approaches in Section 3 and concluding in Section 4.</p>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>
        The task of estimating the probability density based on a sample of
independently and identically distributed (iid) observations has been intensively studied.
Density estimation methods, like kernel density estimation [
        <xref ref-type="bibr" rid="ref13 ref14">14, 13</xref>
        ], make it possible to learn
a model from instances observed at different positions in feature space, and to
use this model to estimate the density at any position within this feature space.
The difficulties of the early kernel and near-neighbour density estimation
techniques when extended to multivariate settings were addressed by approaches like
projection pursuit density estimation, proposed in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. All these density
estimation approaches, as well as related curve regression approaches, require an iid
sample from a stationary distribution [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
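      <p>As a concrete illustration of the kernel density estimate discussed above (our own minimal sketch, not code from [14, 13]): each observed instance contributes a Gaussian kernel, and the estimate is their average.</p>

```python
# Minimal Gaussian kernel density estimate over a 1-D sample.
import numpy as np

def kde(x, sample, bandwidth):
    """Average of Gaussian kernels centred on the observed instances."""
    u = (x[:, None] - sample) / bandwidth
    k = np.exp(-0.5 * u ** 2) / (np.sqrt(2.0 * np.pi) * bandwidth)
    return k.mean(axis=1)

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=500)        # iid draws from N(0, 1)
grid = np.linspace(-4.0, 4.0, 161)
density = kde(grid, sample, bandwidth=0.4)     # estimated density on the grid
```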
      <p>
        In the case of a nonstationary distribution, one might be interested in
estimating the density at different points in time and space. In [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], this problem
of spatio-temporal density estimation is addressed by combining spatial kernel
density estimation with a temporal weighting of instances. A framework for
so-called change diagnosis in evolving datastreams is proposed, which estimates
the rate of change at each region by using a user-specified temporal window to
calculate forward and reverse time slice density estimates. This velocity
density estimation technique is applicable for spatio-temporal density interpolation,
i.e. for monitoring and visualising the change of densities in a (past) time window.
However, it is not designed for extrapolating the density to (future) time points
outside the window of observed historical data.
      </p>
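      <p>The combination of spatial kernel density estimation with a temporal weighting of instances can be sketched as follows; the exponential decay and its rate are our own illustrative assumptions, a simplified stand-in for the time-slice construction of [1, 2].</p>

```python
# Spatio-temporal KDE sketch: Gaussian kernels in feature space, with
# exponentially decaying instance weights so recent instances dominate.
import numpy as np

def weighted_kde(x, sample, times, t_now, bandwidth, decay):
    """KDE whose instance weights decay exponentially with instance age."""
    w = np.exp(-decay * (t_now - times))   # newer instances weigh more
    w = w / w.sum()                        # weights form a convex combination
    u = (x[:, None] - sample) / bandwidth
    k = np.exp(-0.5 * u ** 2) / (np.sqrt(2.0 * np.pi) * bandwidth)
    return k @ w

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=300)          # feature values
times = rng.uniform(0.0, 3.0, size=300)          # observation times
grid = np.linspace(-5.0, 5.0, 101)
density = weighted_kde(grid, sample, times, t_now=3.0, bandwidth=0.4, decay=0.5)
```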
      <p>
        Related to change diagnosis is change mining [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which aims to understand
the changes in data distributions themselves. Within this paradigm, the idea
of so-called drift-mining approaches [
        <xref ref-type="bibr" rid="ref5 ref8 ref9">8, 9, 5</xref>
        ] is to model the evolution of
distributions in order to extrapolate them to future time points, thereby addressing
problems of verification latency or label delay. The algorithm APT proposed in
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] uses matching between labelled old and unlabelled new instances to infer the
labels of the latter, thus indirectly estimating the class-conditional distributions
of the new instances. Likewise, an expectation-maximisation approach is used in
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] to track components of a Gaussian mixture model over time. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], a
mixture model is learned on old labelled data and compared to density estimates
on current unlabelled data, thereby inferring changes such as shifts of the class prior.
However, these approaches are again not designed for directly extrapolating
densities.
      </p>
      <p>
        Density forecasting approaches [
        <xref ref-type="bibr" rid="ref15 ref16 ref7">15, 16, 7</xref>
        ], on the other hand, focus on the
prediction of a single variable's (mostly unimodal) density at a particular future
time point, based on an observed time series of this variable's values. In the
simplest case, as discussed for example in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], this is done by providing a confidence
interval for a point estimate, obtained by assuming a particular distribution of
this variable. More sophisticated approaches return (potentially multi-modal)
density estimates by combining several predictions, obtained for
example from different macroeconomic models, experts, or simulation outcomes, into
a distribution estimate by kernel methods. Nevertheless, their multi-modal
character originates from the different modes in the combined unimodal
models. In addition, most works consider only a single variable. One exception is
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], where univariate forecasts of two explanatory variables are converted, using
conditional kernel density estimation, into forecasts of the dependent variable.
      </p>
      <p>
        In contrast to the density forecasting above, we are concerned with temporal
density extrapolation of a potentially multi-modal density distribution.
Furthermore, instead of having a time series with a single observation at any one time,
our input data consists of multiple observations at any one time. This temporal
density extrapolation is related to spatial density extrapolation [
        <xref ref-type="bibr" rid="ref17 ref6">17, 6</xref>
        ], which
addresses the extrapolation of densities for feature values that have not yet been
seen in historical instances. In [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the authors suggest a Taylor series expansion
about the point of interest to estimate the density, while in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] a statistical test
is provided to examine whether the data distribution is distinct from a uniform
distribution at the extrapolation position. While modelling time as a feature is
possible, there is an important difference in extrapolation between time and
feature space: one expects the density to diminish towards unpopulated (and thus
unseen) positions in feature space. However, there is no a priori reason to assume
densities to decrease towards yet unseen moments in time. On the contrary, it
is reasonable to assume that at each point in time (whether future, current, or
past) the density integrates to one over the feature space.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Temporal Density Extrapolation</title>
      <p>To address the problem of extrapolating the observed, potentially multi-modal
density distribution of instances to future time points, we propose an approach
based on pseudo-points. These pseudo-points are used in the spatio-temporal
kernel density estimation in lieu of the originally observed instances. The resulting
kernel density estimation model can be interpreted as a mixture model, where
each pseudo-point itself constitutes a component. The pseudo-points evolve over
time, either by changing their weight (their component's mixing proportion), or
by changing their position (their component's location). Therefore, the learning
task is to fit a trend function to the evolution of each pseudo-point. We present
each of the two variants in the next Subsections 3.1 and 3.2, before discussing
their potential difficulties and limitations in Section 3.3.</p>
      <sec id="sec-4-1">
        <title>Weight-Extrapolated, Stationary Pseudo-Points</title>
        <p>Given a set of stationary pseudo-points, the first approach models their weights
as functions of time. These functions are then fit on a window of historical
data, such that the distribution therein is modelled with maximum likelihood.</p>
        <p>The approach is illustrated for a one-dimensional feature space in Figure 1.
At the first time point in the past (time = 0), a density estimate is calculated
using historical data collected at that time (solid blue line). Then, a set of
pseudo-points (here 1, 2, 4) is generated, either by placing them equidistant
on a grid or by drawing them at random. Next, the weights (w1, w2, w4) of all
pseudo-points are calculated such that the divergence is minimised between the
kernel density estimate over the weighted pseudo-points and the kernel density
estimate over the original data instances at that time point. The pseudo-points'
weights are estimated in the same way for subsequent time points (e.g. time = 1), as
soon as instances become available for them. This results for each pseudo-point
in a time series of weight values, for which a polynomial trend function (red
curves) is learned by regression. Finally, for a future time point (e.g. time = 2),
the trend functions' values are predicted (w1, w2, w4 in red at time = 2).
Using these weighted pseudo-points in a kernel density estimate at time = 2,
one obtains the extrapolated density (red dotted line), which is later evaluated
against the observed density (solid blue-gray line).</p>
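        <p>A hypothetical end-to-end sketch of this weight-extrapolating procedure on synthetic one-dimensional data; the least-squares weight fit and the linear trend are our own simplifications, as the divergence measure and trend degree are left open here.</p>

```python
# Weight-extrapolating pseudo-points (illustrative simplification):
# fit per-time-point weights by least squares against a KDE of each batch,
# fit a linear trend per weight, extrapolate, renormalise, and re-estimate.
import numpy as np

rng = np.random.default_rng(1)
centres = np.linspace(-3.0, 3.0, 7)       # stationary pseudo-points
h = 0.6                                   # kernel bandwidth
grid = np.linspace(-4.0, 4.0, 81)

def kernel_matrix(x, pts, bw):
    """Gaussian kernel evaluated between every grid point and every centre."""
    u = (x[:, None] - pts) / bw
    return np.exp(-0.5 * u ** 2) / (np.sqrt(2.0 * np.pi) * bw)

K = kernel_matrix(grid, centres, h)       # shape (grid, pseudo-points)

weight_series = []
for t in range(4):                        # historical time points 0..3
    data = rng.normal(0.4 * t - 1.0, 0.7, size=500)     # drifting batch
    target = kernel_matrix(grid, data, h).mean(axis=1)  # KDE of the batch
    w, *_ = np.linalg.lstsq(K, target, rcond=None)      # match both estimates
    w = np.clip(w, 0.0, None)             # keep weights non-negative
    weight_series.append(w / w.sum())

W = np.array(weight_series)               # one weight time series per point
coeffs = np.polyfit(np.arange(4.0), W, deg=1)   # linear trend per weight
w_future = np.clip(coeffs[0] * 5.0 + coeffs[1], 0.0, None)
w_future = w_future / w_future.sum()      # renormalise to a valid density
future_density = K @ w_future             # extrapolated density at time = 5
```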
      </sec>
      <sec id="sec-4-1b">
        <title>Position-Extrapolated, Uniformly-Weighted Pseudo-Points</title>
        <p>The second approach to this problem uses uniformly-weighted but
flexibly located pseudo-points. Thus, the pseudo-points' weights are uniform and
constant, but their positions are functions of time, fitted such that the divergence
on the available historical data is minimised.</p>
        <p>In analogy to the previous figure, this approach is illustrated for a
one-dimensional feature space in Figure 2. Given a set of historical instances and
a specified number of pseudo-points, density estimates (solid blue lines) are
made for historical time points (time = 0 and time = 1). Then, a mixture
model with each pseudo-point as a single Gaussian component is formulated.
Assuming polynomial trajectories (red solid lines) for the pseudo-points, the
parameters of this model are the coefficients of the pseudo-points' polynomial
trajectories, which are learned using Expectation-Maximisation. Finally, for a
future time point (time = 2), the pseudo-points' positions are predicted using
the polynomial functions, and the density (red dotted line) at this time point is
estimated using kernel density estimation over the pseudo-points placed at their
extrapolated positions.</p>
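        <p>A much-simplified stand-in for this procedure can be sketched as follows: instead of the joint Expectation-Maximisation fit, pseudo-point positions are taken as per-time-point sample quantiles, each given a linear trajectory; all names are our own.</p>

```python
# Position-extrapolating pseudo-points (illustrative simplification):
# track sample quantiles over time, fit a linear trajectory per pseudo-point,
# extrapolate the positions, and evaluate a uniform-weight KDE there.
import numpy as np

rng = np.random.default_rng(2)
qs = np.linspace(0.1, 0.9, 5)             # quantile levels, one per pseudo-point

positions = []
for t in range(4):                        # historical time points 0..3
    data = rng.normal(0.5 * t, 1.0, size=400)   # distribution drifts right
    positions.append(np.quantile(data, qs))     # pseudo-point positions at t
P = np.array(positions)                   # shape (times, pseudo-points)

coeffs = np.polyfit(np.arange(4.0), P, deg=1)   # linear trajectory per point
future_pos = coeffs[0] * 6.0 + coeffs[1]        # positions extrapolated to t = 6

def uniform_kde(x, pts, bw):
    """KDE with uniformly weighted Gaussian kernels at the pseudo-points."""
    u = (x[:, None] - pts) / bw
    return (np.exp(-0.5 * u ** 2) / (np.sqrt(2.0 * np.pi) * bw)).mean(axis=1)

grid = np.linspace(-3.0, 8.0, 111)
future_density = uniform_kde(grid, future_pos, bw=0.8)
```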
      </sec>
      <sec id="sec-4-2">
        <title>Discussion</title>
        <p>Both approaches above rely on a regression over time for extrapolating trends in
the development of either weights or positions. To make this
extrapolation more robust, we recommend using regularised trend functions that
penalise model complexity. The choice of the type of regression
function depends on the type of drift: for example, polynomial functions require
gradual drift, while trigonometric functions seem to be interesting candidates
for modelling recurring contexts.</p>
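        <p>A regularised polynomial trend fit of the kind recommended here can be sketched as a ridge regression on a polynomial design matrix; the degree and the penalty strength are assumed hyperparameters, not values prescribed by the approach.</p>

```python
# Ridge-regularised polynomial trend fit for a single weight time series.
import numpy as np

def ridge_polyfit(t, y, degree, alpha):
    """Solve the L2-penalised normal equations for polynomial coefficients."""
    X = np.vander(t, degree + 1)              # columns t^degree ... t^0
    A = X.T @ X + alpha * np.eye(degree + 1)  # ridge-regularised Gram matrix
    return np.linalg.solve(A, X.T @ y)

t = np.arange(6.0)
y = 0.1 * t + 0.02 * np.random.default_rng(3).normal(size=6)  # noisy trend
coef = ridge_polyfit(t, y, degree=3, alpha=1.0)
forecast = float(np.polyval(coef, 8.0))       # extrapolated value at t = 8
```

The penalty keeps the higher-order coefficients small, so the extrapolated trend stays closer to the dominant low-order behaviour of the series.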
        <p>The weight-extrapolation in the first approach requires a normalisation, such
that the extrapolated weights are all non-negative and sum up to one. An
important question concerns the choice of the pseudo-points' locations in this approach,
as it influences the precision of the extrapolated values: in regions with sparse
pseudo-point populations, the model is less flexible than in densely populated
ones. Therefore, this approach seems better suited for constricted (bounded)
feature spaces. A simple equidistant placement of pseudo-points distributes the
precision over the whole feature space. Alternatively, the pseudo-points might be
placed at the coordinates of a subsample of the observed instances, thus
concentrating the precision on areas with previously high density. However, if densities
change largely over time, these areas might become less relevant.</p>
        <p>[Figures 1 and 2: observed density, extrapolated density, and extrapolated pseudo-point weights w1, ..., w4, plotted over time and feature X.]</p>
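        <p>The two placement strategies above can be sketched as follows (variable names are our own):</p>

```python
# Pseudo-point placement strategies: an equidistant grid over the observed
# feature range, or the coordinates of a random subsample of the instances.
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(0.0, 1.0, size=1000)          # observed instances

grid_points = np.linspace(data.min(), data.max(), 10)        # equidistant
subsample_points = rng.choice(data, size=10, replace=False)  # density-driven
```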
        <p>In contrast, the second, position-extrapolating approach determines the
positions of each pseudo-point automatically. It aims to adjust the future locations
of the pseudo-points such that they are densely placed in regions with a high
expected density. However, in case polynomial regression functions are used,
a potential drawback is that their trajectories diverge in the long run. Thus,
in contrast to the first approach, the second one seems better suited for
infinite (unbounded) feature spaces.</p>
        <p>
          Related to the choice of the pseudo-points' placements is the question of
optimal bandwidth selection, which for kernel density estimation has already
been reviewed in [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. In short, we expect that with an increasing number of
pseudo-points the optimal bandwidth decreases, while the extrapolation's
precision increases. Furthermore, the number of pseudo-points is also an upper bound
on the number of modes that both approaches are able to model.
        </p>
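        <p>As one concrete bandwidth selector among those reviewed in [18], Silverman's rule of thumb for Gaussian kernels can be sketched as:</p>

```python
# Silverman's rule-of-thumb bandwidth for a 1-D Gaussian KDE.
import numpy as np

def silverman_bandwidth(sample):
    """0.9 * min(std, IQR / 1.34) * n^(-1/5), a common KDE default."""
    n = sample.size
    iqr = np.subtract(*np.percentile(sample, [75, 25]))  # p75 minus p25
    scale = min(sample.std(ddof=1), iqr / 1.34)          # robust spread
    return 0.9 * scale * n ** (-0.2)

h = silverman_bandwidth(np.random.default_rng(5).normal(size=1000))
```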
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper, we have addressed the problem of temporal density extrapolation,
where the objective is the prediction of a (potentially multi-modal) density
distribution at future time points, given a sample of historical instances observed
at different positions in feature space and at different times in the past. Two
approaches based on pseudo-points were sketched: the first uses an extrapolation of
the time-varying weights of stationary-located pseudo-points, while the second uses
an extrapolation of the trajectories of the time-varying locations of pseudo-points
with uniform weights. Subsequently, these extrapolated pseudo-points are used
in a kernel density estimation at future time points.</p>
      <p>
        Having sketched the idea of the two temporal density extrapolation
approaches, a more detailed specification and evaluation of these methods needs
to be done in future work. Furthermore, the conjectures in the discussion above,
in particular the suitability of each approach for bounded and unbounded
feature spaces, need to be verified. Finally, a known challenge for kernel-based
approaches is the curse of dimensionality on multi-dimensional data. A naive
approach is to combine multiple univariate temporal density extrapolations.
However, an optimisation for multi-variate problems by using either projection
pursuit [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] or copula [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] techniques seems worth investigating.
      </p>
      <p>Acknowledgments. We thank Vera Hofer from Karl-Franzens-University Graz,
Stephan Mohring, and Andy Koch for insightful discussions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Aggarwal</surname>
          </string-name>
          , C.C.
          <article-title>: A framework for diagnosing changes in evolving data streams</article-title>
          .
          <source>In: Proceedings of the ACM SIGMOD Conference</source>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Aggarwal</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          :
          <article-title>On change diagnosis in evolving data streams</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>17</volume>
          (
          <issue>5</issue>
          ),
          <fpage>587</fpage>
          –
          <lpage>600</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Bottcher,
          <string-name>
            <surname>M.</surname>
          </string-name>
          , Hoppner,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Spiliopoulou</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>On exploiting the power of time in data mining</article-title>
          .
          <source>ACM SIGKDD Explorations Newsletter</source>
          <volume>10</volume>
          (
          <issue>2</issue>
          ),
          <fpage>3</fpage>
          –
          <lpage>11</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Friedman</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stuetzle</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schroeder</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Projection pursuit density estimation</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          <volume>79</volume>
          (
          <issue>387</issue>
          ) (
          <year>1984</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hofer</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krempl</surname>
          </string-name>
          , G.:
          <article-title>Drift mining in data: A framework for addressing drift in classification</article-title>
          .
          <source>Computational Statistics and Data Analysis</source>
          <volume>57</volume>
          (
          <issue>1</issue>
          ),
          <fpage>377</fpage>
          –
          <lpage>391</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hooker</surname>
          </string-name>
          , G.:
          <article-title>Diagnosing extrapolation: Tree-based density estimation</article-title>
          .
          <source>In: Knowledge Discovery in Databases (KDD)</source>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Jeon</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
          </string-name>
          , J.W.:
          <article-title>Using conditional kernel density estimation for wind power density forecasting</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          <volume>107</volume>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Krempl</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The algorithm APT to classify in concurrence of latency and drift</article-title>
          . In: Gama,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Bradley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Hollmén</surname>
          </string-name>
          ,
          <string-name>
            <surname>J</surname>
          </string-name>
          . (eds.)
          <source>Advances in Intelligent Data Analysis X, Lecture Notes in Computer Science</source>
          , vol.
          <volume>7014</volume>
          , pp.
          <fpage>222</fpage>
          –
          <lpage>233</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Krempl</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hofer</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Classification in presence of drift and latency</article-title>
          . In: Spiliopoulou,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Cook</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <surname>W.</surname>
          </string-name>
          , Zaane,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <surname>X</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of the 11th IEEE International Conference on Data Mining Workshops (ICDMW</source>
          <year>2011</year>
          ). IEEE (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Krempl</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , Žliobaitė,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Brzezinski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            , Hullermeier, E.,
            <surname>Last</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Lemaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Noack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Shaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Sievi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Spiliopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Stefanowski</surname>
          </string-name>
          , J.:
          <article-title>Open challenges for data stream mining research</article-title>
          .
          <source>SIGKDD Explorations</source>
          <volume>16</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>10</lpage>
          (
          <year>2014</year>
          ), special Issue on Big Data
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Nadaraya</surname>
            ,
            <given-names>E.A.</given-names>
          </string-name>
          :
          <article-title>Nonparametric estimation of probability densities and regression curves</article-title>
          .
          <source>Kluwer</source>
          (
          <year>1989</year>
          ), originally published in Russian by Tbilisi University Press, translated by S. Kotz
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Nelsen</surname>
            ,
            <given-names>R.B.</given-names>
          </string-name>
          : An Introduction to Copulas. Springer (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Parzen</surname>
          </string-name>
          , E.:
          <article-title>On estimation of a probability density function and mode</article-title>
          .
          <source>Annals of Mathematical Statistics</source>
          <volume>33</volume>
          ,
          <fpage>1065</fpage>
          –
          <lpage>1076</lpage>
          (
          <year>1962</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Rosenblatt</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>Remarks on some non-parametric estimates of a density function</article-title>
          .
          <source>Annals of Mathematical Statistics</source>
          <volume>27</volume>
          (
          <issue>3</issue>
          ),
          <fpage>832</fpage>
          –
          <lpage>837</lpage>
          (
          <year>1956</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Skouras</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dawid</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          :
          <article-title>On efficient probability forecasting systems</article-title>
          .
          <source>Biometrika</source>
          <volume>86</volume>
          (
          <issue>4</issue>
          ),
          <fpage>765</fpage>
          –
          <lpage>784</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Tay</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wallis</surname>
            ,
            <given-names>K.F.</given-names>
          </string-name>
          :
          <article-title>Density forecasting: A survey</article-title>
          .
          <source>A Companion to Economic Forecasting</source>
          , pp.
          <fpage>45</fpage>
          –
          <lpage>68</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Terrell</surname>
            ,
            <given-names>G.R.</given-names>
          </string-name>
          :
          <article-title>Tail probabilities by density extrapolation</article-title>
          .
          <source>In: Proceedings of the Annual Meeting of the American Statistical Association</source>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Turlach</surname>
            ,
            <given-names>B.A.</given-names>
          </string-name>
          :
          <article-title>Bandwidth selection in kernel density estimation: A review</article-title>
          .
          <source>Tech. Rep. 9307</source>
          ,
          <institution>Humboldt University</institution>
          , Statistik und Ökonometrie (
          <year>1991</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>