=Paper= {{Paper |id=Vol-2219/paper4 |storemode=property |title=Interval-based Activity Recognition |pdfUrl=https://ceur-ws.org/Vol-2219/paper4.pdf |volume=Vol-2219 |authors=Evangelos Makris,Alexander Artikis,Georgios Paliouras |dblpUrl=https://dblp.org/rec/conf/ilp/MakrisAP18 }} ==Interval-based Activity Recognition== https://ceur-ws.org/Vol-2219/paper4.pdf
           Interval-based Activity Recognition

       Evangelos Makris¹, Alexander Artikis²,¹, and Georgios Paliouras¹

       ¹ Institute of Informatics and Telecommunications, NCSR “Demokritos”
       ² Department of Maritime Studies, University of Piraeus
                 {vmakris,a.artikis,paliourg}@iit.demokritos.gr



       Abstract. Activity recognition refers to the detection of temporal com-
       binations of ‘low-level’ or ‘short-term’ activities on sensor data. Various
       types of uncertainty exist in activity recognition systems and this often
       leads to erroneous detection. Typically, the frameworks aiming to han-
       dle uncertainty compute the probability of the occurrence of activities
       at each time-point. We extend this approach by defining the probability
       of a maximal interval and the credibility rate for such intervals.


1     Introduction

In activity recognition, multiple sources provide spatial and temporal data that
can be used to detect various types of human behaviour. The input data are
short-term activities (STA), such as ‘walking’, ‘running’, ‘active’ and ‘inactive’,
indicating that a person is walking, running, moving his arms while in the same
position, and so on. The output is a set of long-term activities (LTA), which are
temporal combinations of STA. Examples are ‘fighting’, ‘meeting’, ‘moving’, etc.
When a rule that consists of temporal constraints on a set of STA is satisfied, it
leads to the recognition of LTA. Uncertainty is inherent in activity recognition.
For example, STA, typically detected by visual information processing tools
operating on video feeds, often have probabilities attached to them by low-
level classifiers, serving as confidence estimates. In earlier work, we presented
an activity recognition system based on a probabilistic version of the Event
Calculus, hereafter Prob-EC, that computes the probability of an LTA at each
time-point [12]. We extend this approach by defining the probability of a maximal
interval and the credibility rate for such intervals. In contrast to time-point-based
activity recognition, our proposed method is robust to noisy LTA probability
fluctuations.


2     Background

2.1   Event Calculus

We restrict attention to a simple version of the Event Calculus where the time
model is linear and includes integer time-points. Variables start with an upper-
case letter, while predicates and constants start with a lower-case letter. Where
F is a fluent—a property that is allowed to have different values at different
points in time—the term F = V denotes that fluent F has value V . The domain-
independent axioms are presented below:
              holdsAt(F = V, T) ←
                initiatedAt(F = V, Ts), Ts < T,                               (1)
                not broken(F = V, Ts, T).
              broken(F = V, Ts, T) ←                                          (2)
                terminatedAt(F = V, Tf), Ts < Tf < T.
              broken(F = V, Ts, T) ←                                          (3)
                initiatedAt(F = V′, Tf), V ≠ V′, Ts < Tf < T.
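
The three domain-independent axioms can be read operationally. The following Python sketch is only an illustration, not the Prob-EC implementation: it assumes that the initiation and termination time-points of each fluent-value pair are already known and stored in dictionaries (the names `initiations`, `terminations`, `broken` and `holds_at` are hypothetical), and it ignores probabilities altogether.

```python
from typing import Dict, Hashable, List, Tuple

# Hypothetical representation: (fluent, value) -> list of integer time-points.
Points = Dict[Tuple[Hashable, Hashable], List[int]]


def broken(initiations: Points, terminations: Points,
           fluent: Hashable, value: Hashable, ts: int, t: int) -> bool:
    """F = V is 'broken' in the open interval (ts, t)."""
    # Axiom (2): F = V is terminated at some Tf with ts < Tf < t.
    if any(ts < tf < t for tf in terminations.get((fluent, value), [])):
        return True
    # Axiom (3): F = V' is initiated, for some V' != V, at some Tf with ts < Tf < t.
    for (f, v), points in initiations.items():
        if f == fluent and v != value and any(ts < tf < t for tf in points):
            return True
    return False


def holds_at(initiations: Points, terminations: Points,
             fluent: Hashable, value: Hashable, t: int) -> bool:
    """Axiom (1): F = V holds at t if it was initiated at some ts < t
    and has not been broken between ts and t."""
    return any(
        ts < t and not broken(initiations, terminations, fluent, value, ts, t)
        for ts in initiations.get((fluent, value), [])
    )


# Illustrative data only (not taken from the paper).
inits = {("moving", True): [3]}
terms = {("moving", True): [7]}
holding = [t for t in range(0, 11) if holds_at(inits, terms, "moving", True, t)]
```

In this made-up example, `holding` collects the time-points after the initiation point up to and including the termination point, because both comparisons in the axioms are strict.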

According to axiom (1), F = V holds at some time-point T if it has been initiated
by an event previously and has not been ‘broken’ in the meantime. This expresses
the law of inertia. F = V is ‘broken’ in (Ts , T ) if it is terminated (see axiom
(2)) or F = V′ is initiated, for some V′ ≠ V (see axiom (3)). The definitions
of initiatedAt and terminatedAt are domain-specific. Consider, for example, the
(partial) definition of moving from the domain of activity recognition:
                initiatedAt(moving(P1 , P2 ) = true, T ) ←
                   happensAt(walking(P1 ), T ),
                   happensAt(walking(P2 ), T ),                               (4)
                   holdsAt(close(P1 , P2 ) = true, T ),
                   holdsAt(similarOrientation(P1 , P2 ) = true, T ).
                terminatedAt(moving(P1 , P2 ) = true, T ) ←
                   happensAt(walking(P1 ), T ),                               (5)
                   holdsAt(close(P1 , P2 ) = false, T ).

moving is a long-term activity (LTA) expressed as a Boolean fluent, and de-
fined in terms of a set of short-term activities (STA) expressed as instanta-
neous events, and contextual information detected on video content. walking,
running, active and inactive are mutually exclusive STA detected on video
frames. Each such STA is accompanied by the coordinates and orientation of
the tracked entity in question. These are the input of the activity recognition
system. close(P1 , P2 ) is true when the distance between the tracked entities P1
and P2 is smaller than some pre-defined threshold of pixel positions. Similarly,
similarOrientation(P1 , P2 ) is true when the difference in orientation of P1 and
P2 is less than 45 degrees. According to rule (4), moving(P1 , P2 ) = true is said
to be initiated when both P1 and P2 are walking, they are close to each other
and have a similar orientation. Furthermore, moving(P1 , P2 ) = true is said to be
terminated when the two tracked persons walk away from each other (see rule
(5)). The remaining terminating conditions are defined in a similar manner [12].
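
The conditions of rules (4) and (5) can be sketched as Boolean tests on the input of a single video frame. The Python code below is a simplified illustration under several assumptions: the `Observation` type and all function names are hypothetical, the pixel threshold of 30 is an arbitrary placeholder for the pre-defined threshold, orientation is assumed to be given in degrees and compared on a circle, and close and similarOrientation are computed directly from the two observations rather than queried through holdsAt.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One tracked entity in one frame (hypothetical input format)."""
    activity: str        # 'walking', 'running', 'active' or 'inactive'
    x: float             # pixel coordinates
    y: float
    orientation: float   # degrees (assumed unit)


CLOSE_THRESHOLD = 30.0   # placeholder value; the paper only says "pre-defined"


def close(a: Observation, b: Observation,
          threshold: float = CLOSE_THRESHOLD) -> bool:
    """True when the pixel distance between a and b is below the threshold."""
    return math.hypot(a.x - b.x, a.y - b.y) < threshold


def similar_orientation(a: Observation, b: Observation) -> bool:
    """True when the orientations differ by less than 45 degrees.
    The wrap-around at 360 degrees is an assumption of this sketch."""
    diff = abs(a.orientation - b.orientation) % 360.0
    return min(diff, 360.0 - diff) < 45.0


def initiates_moving(a: Observation, b: Observation) -> bool:
    """Body of rule (4): both walk, are close, and have similar orientation."""
    return (a.activity == "walking" and b.activity == "walking"
            and close(a, b) and similar_orientation(a, b))


def terminates_moving(a: Observation, b: Observation) -> bool:
    """Body of rule (5): the first person walks and the two are not close."""
    return a.activity == "walking" and not close(a, b)


# Illustrative frame data only.
p1 = Observation("walking", 0.0, 0.0, 10.0)
p2 = Observation("walking", 10.0, 0.0, 350.0)
p3 = Observation("active", 100.0, 0.0, 10.0)
```

With these made-up observations, `initiates_moving(p1, p2)` is satisfied, while `terminates_moving(p1, p3)` is satisfied because `p1` walks and `p3` is far away.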
    Note that initiatedAt(F = V, T ) does not necessarily imply that F ≠ V at T .
Similarly, terminatedAt(F = V, T ) does not necessarily imply that F = V at T [3].
Suppose that F = V is initiated at time-points 10 and 20 and terminated at
time-points 25 and 30 (and at no other time-points). In that case F = V holds
at all T such that 10