Interval-based Activity Recognition

Evangelos Makris1, Alexander Artikis2,1, and Georgios Paliouras1

1 Institute of Informatics and Telecommunications, NCSR “Demokritos”
2 Department of Maritime Studies, University of Piraeus
{vmakris,a.artikis,paliourg}@iit.demokritos.gr

Abstract. Activity recognition refers to the detection of temporal combinations of ‘low-level’ or ‘short-term’ activities on sensor data. Various types of uncertainty exist in activity recognition systems and this often leads to erroneous detection. Typically, the frameworks aiming to handle uncertainty compute the probability of the occurrence of activities at each time-point. We extend this approach by defining the probability of a maximal interval and the credibility rate for such intervals.

1 Introduction

In activity recognition, multiple sources provide spatial and temporal data that can be used to detect various types of human behaviour. The input data are short-term activities (STA), such as ‘walking’, ‘running’, ‘active’ and ‘inactive’, indicating that a person is walking, running, moving his arms while in the same position, and so on. The output is a set of long-term activities (LTA), which are temporal combinations of STA. Examples are ‘fighting’, ‘meeting’, ‘moving’, etc. When a rule that consists of temporal constraints on a set of STA is satisfied, the corresponding LTA is recognised.

Uncertainty is inherent in activity recognition. For example, STA, typically detected by visual information processing tools operating on video feeds, often have probabilities attached to them by low-level classifiers, serving as confidence estimates. In earlier work, we presented an activity recognition system based on a probabilistic version of the Event Calculus, hereafter Prob-EC, that computes the probability of an LTA at each time-point [12]. We extend this approach by defining the probability of a maximal interval and the credibility rate for such intervals.
In contrast to time-point-based activity recognition, our proposed method is robust to noisy LTA probability fluctuations.

2 Background

2.1 Event Calculus

We restrict attention to a simple version of the Event Calculus where the time model is linear and includes integer time-points. Variables start with an upper-case letter, while predicates and constants start with a lower-case letter. Where F is a fluent (a property that is allowed to have different values at different points in time), the term F = V denotes that fluent F has value V. The domain-independent axioms are presented below:

holdsAt(F = V, T) ←
    initiatedAt(F = V, Ts), Ts < T,
    not broken(F = V, Ts, T).                                  (1)

broken(F = V, Ts, T) ←
    terminatedAt(F = V, Tf), Ts < Tf < T.                      (2)

broken(F = V, Ts, T) ←
    initiatedAt(F = V′, Tf), V ≠ V′, Ts < Tf < T.              (3)

According to axiom (1), F = V holds at some time-point T if it has been initiated at an earlier time-point Ts and has not been ‘broken’ in the meantime. This expresses the law of inertia. F = V is ‘broken’ in (Ts, T) if it is terminated (see axiom (2)) or F = V′ is initiated, for some V′ ≠ V (see axiom (3)).

The definitions of initiatedAt and terminatedAt are domain-specific. Consider, for example, the (partial) definition of moving from the domain of activity recognition:

initiatedAt(moving(P1, P2) = true, T) ←
    happensAt(walking(P1), T),
    happensAt(walking(P2), T),
    holdsAt(close(P1, P2) = true, T),
    holdsAt(similarOrientation(P1, P2) = true, T).             (4)

terminatedAt(moving(P1, P2) = true, T) ←
    happensAt(walking(P1), T),
    holdsAt(close(P1, P2) = false, T).                         (5)

moving is a long-term activity (LTA) expressed as a Boolean fluent, and defined in terms of a set of short-term activities (STA) expressed as instantaneous events, and contextual information detected on video content. walking, running, active and inactive are mutually exclusive STA detected on video frames.
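The domain-independent axioms (1)-(3) can be illustrated with a small sketch. The Python below is not the authors' implementation (the paper works in a logic programming setting); it is a minimal, crisp (non-probabilistic) reading of the axioms, assuming that the initiation and termination points of a single fluent are already known. The function names `broken` and `holds_at`, the list-of-pairs representation and the example time-points are all hypothetical choices made for illustration.

```python
from typing import Hashable, List, Tuple

# A narrative entry is (time-point, value V of the fluent).
# This representation is an assumption of the sketch, not part of the paper.
Entry = Tuple[int, Hashable]


def broken(value: Hashable, ts: int, t: int,
           initiations: List[Entry], terminations: List[Entry]) -> bool:
    """Axioms (2) and (3): F = V is broken in the open interval (ts, t)."""
    # Axiom (2): F = V is terminated at some tf with ts < tf < t.
    terminated = any(v == value and ts < tf < t for tf, v in terminations)
    # Axiom (3): F = V' is initiated, for some V' != V, at some tf with ts < tf < t.
    overridden = any(v != value and ts < tf < t for tf, v in initiations)
    return terminated or overridden


def holds_at(value: Hashable, t: int,
             initiations: List[Entry], terminations: List[Entry]) -> bool:
    """Axiom (1): F = V holds at t if it was initiated at some ts < t
    and has not been broken between ts and t."""
    return any(
        v == value and ts < t
        and not broken(value, ts, t, initiations, terminations)
        for ts, v in initiations
    )


# Hypothetical narrative for a Boolean fluent: initiated at 3 and 8, terminated at 12.
initiations = [(3, True), (8, True)]
terminations = [(12, True)]
holding = [t for t in range(16)
           if holds_at(True, t, initiations, terminations)]
```

Under these assumptions, `holding` contains the time-points after the first initiation up to and including the termination point, which reflects the strict inequalities Ts < T and Ts < Tf < T in the axioms: the fluent does not yet hold at the time-point where it is initiated, and it still holds at the time-point where it is terminated.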
Each such STA is accompanied by the coordinates and orientation of the tracked entity in question. These are the input of the activity recognition system. close(P1, P2) is true when the distance between the tracked entities P1 and P2 is smaller than some pre-defined threshold of pixel positions. Similarly, similarOrientation(P1, P2) is true when the difference in orientation of P1 and P2 is less than 45 degrees. According to rule (4), moving(P1, P2) = true is said to be initiated when both P1 and P2 are walking, they are close to each other and have a similar orientation. Furthermore, moving(P1, P2) = true is said to be terminated when the two tracked persons walk away from each other (see rule (5)). The remaining terminating conditions are defined in a similar manner [12].

Note that initiatedAt(F = V, T) does not necessarily imply that F ≠ V at T. Similarly, terminatedAt(F = V, T) does not necessarily imply that F = V at T [3]. Suppose that F = V is initiated at time-points 10 and 20 and terminated at time-points 25 and 30 (and at no other time-points). In that case F = V holds at all T such that 10