Goal-Based Person Tracking Using a First-Order Probabilistic Model Thomas Geier, Stephan Reuter, Klaus Dietmayer and Susanne Biundo Faculty of Engineering and Computer Science Ulm University, Germany forename.surname@uni-ulm.de Abstract scene, which is not possible when using completely propositional models or specialized template models This work addresses the problem of person like dynamic Bayesian networks [Murphy, 2002](which tracking using additional background infor- can scale only along the time axis). A model given in mation. We augment a particle filter-based a lifted representation also makes it possible to lever- tracking algorithm with a first-order prob- age structure information contained within the lifted abilistic model expressed through Markov formulation for more efficient inference [Gogate and Logic Networks to tackle the data associa- Domingos, 2011]; although this approach is not inves- tion problem in domains with a high occlu- tigated here. sion rate. Using a high-level model descrip- We motivate our work in the context of an indoor sit- tion allows us to easily integrate additional uation, where multiple persons move in a two-room of- information like a floor plan or goal informa- fice, containing the laser range finder and several areas tion into a joint model and resolve occlusion of interest like a printer or a coffee maker. We mea- situations that would otherwise result in the sure the quality of our model by the extent to which loss of association. We discuss the engineered it is able to correctly associate object tracks emerging model in detail and give an empirical evalu- from the tracking algorithm with the correct persons ation using an indoor setting. inside the scene. The rest of the paper is laid out as follows. After we 1 Introduction discuss related work, we give an overview of the ap- plied tracking algorithm and introduce the concept of This work concerns the problem of providing Markov Logic Networks. Then, we describe the inves- background-knowledge to the specialized application tigated problem in detail and discuss the used MLN of person tracking using a high-level probabilistic model. We give an empirical evaluation of the de- model. We demonstrate that a hand-crafted model, scribed setup and conclude with some possible exten- built using Markov Logic Networks (MLN) [Richard- sions to the model and an overall discussion. son and Domingos, 2006], can help in solving the data association problem in tracking during situations 2 Related Work where high occlusion prevents the correct association between past and new tracks. By leveraging additional In multi-object tracking, state dependent detection information, like a floor plan or knowledge about goals probabilities of objects are disregarded in most appli- of single persons, we can resolve otherwise opaque sit- cations. Thus, a track disappears shortly after en- uations. tering an occluded area and a new track is created The usage of a first-order probabilistic model like when the object leaves the occluded area again. Con- MLNs allows for an easier modeling task, because de- sequently, one object is represented by different track pendencies are represented by weighted first-order log- IDs. Especially in scenarios where persons interact ical formulas instead of, e.g., conditional probability several times with a system, changed track IDs lead tables for the case of directed models like Bayesian to the loss of the objects history. The multi-object Networks. In addition, the model is formulated in Bayes filter [Mahler, 2007] allows to integrate state a lifted form and can be instantiated for the desired dependent detection probabilities even if the scenario number of concurrent tracks or persons within the is characterized by a high object density [Reuter and Dietmayer, 2011]. In case of short term occlusions, the and the received measurement set. Thus, no data as- usage of state dependent detection probabilities leads sociation is necessary. to an improved track continuity. A direct integration Further, the multi-object Bayes filter allows to inte- of goals into the prediction of a persons’ state is cru- grate state dependent detection probabilities into the cial, since the persons’ action may be contradictory to filtering algorithm. In Reuter and Dietmayer [2011], the assigned goal. an approach to calculate state dependent detection Markov Logic Networks have been used by Sadilek and probabilities based on the occupancy grid mapping ap- Kautz [2010] for multi-agent activity recognition based proach [Thrun et al., 2005] is proposed. Thus, it is on GPS data in a game of capture the flag. While their possible to keep track of an object which is occluded work can leverage more expert knowledge (the rules of for the sensor for a short period of time. Using con- the game), they do not encounter the data association stant detection probabilities would lead to a track loss, problem present in the tracking scenario, since each if an object is not visible to the sensor for a few mea- person was carrying a personal GPS receiver. Tran surement cycles. We use the state dependent tracking and Davis [2008] apply Markov Logic Networks to a algorithm as a comparison for our final results. But parking lot surveillance scene using video data to rec- for input into the high-level model, we use state in- ognize which person enters which car. They also track dependent object tracking. This produces more track pedestrians across a scene and face the problem of data IDs for association, and in particular is less prone to association. Their sensory information emerges from false association, which we cannot correct in the upper image data and their focus lies in integrating differ- stage. ent information sources that are all extracted from An implementation of the multi-object Bayes filter is the video stream. Markov Logic Networks are also possible using Sequential Monte Carlo (SMC) methods used by Singla and Domingos [2006] for entity resolu- [Reuter and Dietmayer, 2011, Sidenbladh and Wirkan- tion in text mining. This is the problem of inferring der, 2003, Mahler, 2007]. In difference to well known which references refer to the same entity and it is sim- SMC implementations of the standard Bayes filter a ilar to the data association problem in tracking. The particle set, which represents a random finite set us- two latter works use an equals predicate for identity ing a finite number of state vectors, is used instead of maintenance, whereas we approach the problem using a standard particle. Further, the number of state vec- an association mapping to underlying entities. When tors in the particle set may change at each time step. grounding the model our approach only creates asso- In case of a SMC implementation, the integration of ciations between currently instantiated track IDs and the mentioned constraints is possible by reducing the their corresponding entity, wheres using an equals weight of a particle set. predicate will introduce relations between all objects, which does not seem reasonable in a dynamic domain. Since the multi-object Bayes filter does not perform a measurement to track association, an extraction of the individual objects out of the multi-object posterior 3 Multi-Object Tracking density function is necessary, e.g. using the k-means algorithm [Bishop, 2006]. Standard multi-object tracking algorithms often use object individual single-object trackers like the 4 Markov Logic Networks Kalman filter. The drawback of this multi-object tracking approach is the need of a data association Markov Logic Networks [Richardson and Domingos, step which assigns the received measurements to the 2006] are a member of the family of first-order proba- trackers using hard decisions or probabilistic meth- bilistic languages [de Salvo Braz et al., 2008] and their ods [Blackman and Popoli, 1999]. Especially in scenar- semantics are based on undirected graphical mod- ios characterized by a high object density, the data as- els (Markov networks). In contrast to propositional sociation is error-prone and degrades the performance models like Bayesian networks and Markov Networks, of the tracking system, since false associations are ir- where every random variable has to be specified ex- reversible. plicitly, in first-order models the random variables are A rigorous approach to multi-object tracking is the relations over objects and the model can be scaled multi-object Bayes filter proposed by Mahler [2007]. by providing the appropriate number of object con- The multi-object Bayes filter uses the random finite set stants. Moreover, MLNs allow the specification of statistics to represent the complete environment by a dependencies as weighted first-order logical formulas. single filter state. In the innovation step of the multi- Higher weights make those interpretations more likely, object Bayes filter, a multi-object likelihood function in which more groundings of the formula evaluate to calculates the affinity between the predicted state set true. We will now briefly cover the formal semantics of MLNs. the existence of such information depending on the ap- plication. Issued print jobs could be recognized by a A Markov Logic Network L = {(f1 , w1 ), . . . , (fn , wn )} special program installed on the PC, or visits to the for n ∈ N is a set of first-order formulas f1 , . . . , fn coffee maker could be predicted from personal habits. with given weights w1 , . . . , wn ∈ R. Together with a finite set of constants C, they define a probabil- There are one to three persons inside the scene simul- ity distribution over all interpretations (or possible taneously. Major occlusion caused by static objects, worlds). An interpretation maps each grounding of like walls, occurs when people enter the second room. each predicate to a truth value. The interpretation Minor static occlusion can occur near the coffee maker. of functions must be fixed. Probabilistic functions During the scenes with more than one person, dynamic can be emulated using predicates. Let gC (f ) be the occlusion occurs when persons are covered by other set of groundings of formula f obtained by replac- persons standing between them and the sensor. ing the free variables in f by all combinations of Using only the particle filter-based tracking algorithm constants from C. Given an interpretation x, then def to process the output of the laser range finder, prob- nC,i (x) = |{g | g ∈ gC (fi ) and x |= g}| is the num- lems arise when people produce no measures for an ber of groundings of formula fi that are true under x. extended period of time because they are inside the Then, the probability distribution PL,C that is defined separate room or because they are hidden by another by the MLN L with constants C is given as person. For shorter occlusion durations it is possible to 1 Y keep the track of a single occluded person alive for long def  PL,C (X = x) = exp wi nC,i (x) , (1) enough for the person to reappear and re-association is Z i completely handled by the tracking algorithm. If two persons enter the same occlusion area, their estimated where i ranges over all formulas in L, and Z is a nor- positions begin to mix spatially and once they emerge malizing constant. again, re-association becomes more and more arbi- Given a set of constants, a MLN can be converted to trary with increasing occlusion duration. In these sce- a Markov network, where nodes correspond to atoms narios, a direct integration of the Social Force model and each ground formula induces a clique over all into the prediction of the persons state of the tracking nodes whose atoms appear inside this formula. For algorithm may increase the performance of the sys- practical reasons, a sorted (or typed) logical language tem [Reuter and Dietmayer, 2010]. The Social Force is used to describe MLNs. Using sorted terms, we model aims at describing pedestrian movement by vir- can limit the size of the grounded network. Also, in tual forces exerted by other persons and environmental their basic form, MLNs do only allow restricted usage features [Helbing and Molnar, 1995]. Since the model of logical functions. Usually functions are simulated heavily depends on the destinations of the person, a by specially marked predicates, which enforce a func- tight integration with a high-level knowledge base, as tional dependency of one or more arguments on the re- described in this work, seems promising for such an maining arguments. We notate functional arguments approach. of predicates by underlining them. Such a predicate Figure 1b shows an example of the tracking results for can be translated to a multi-valued random variable. one of the sequences with three persons. The trajec- tories are illustrated by solid lines. Since the results 5 Problem Description are generated without the usage of the state dependent detection probability, the trajectories are interrupted We consider an indoor scene which resembles an office quite often in the area corresponding to region RA, setting. The corresponding floor plan is depicted in where a dynamic occlusion occurs. Figure 1a. A laser range finder is placed in one corner of the main room and provides distance information in a plane about one meter above ground. The beam 6 Description of the Model almost completely covers the main room, but there exists a second room that has virtually no sensor cov- erage. There is only one entrance to the room complex In this section, we describe the used MLN model and and the separate room has only a single exit, which is discuss some of the difficulties and design choices we the door to the main room. The setting contains sev- have encountered in its engineering. The complete eral features that may serve as goal destinations for model is factored into four modules. We begin with persons navigating inside the scene; like a printer or a discussion of two concepts that cannot be associated a coffee maker. Knowledge about destinations of per- with distinct model parts but influence nearly every sons is taken as given; although it is easy to motivate aspect – the representation of space and time. Office Coffee 0 -1 Printer RC Living -4 -3 RB -4 -1 RA Exit Laser Range Finder -5 0 (a) Floor plan (b) Example trajectories Figure 1: (a) Floor plan of the location used for the experiments. In addition the picture shows the eight discrete regions that are used for the MLN model, how they are connected via the handles and their static occlusion value in their lower right. (b) Example trajectories of a three person sequence: trajectories are illustrated by solid lines in different colors. The black stars are measurements of static objects like walls. From continuous to discrete space. The basic Representation of time. In order to model dy- MLN can only represent discrete random variables. namic domains, we assign a dedicated time sort, whose There exists an extension of MLNs to continuous vari- constants are elements from the natural numbers. One ables [Wang and Domingos, 2008], but no working im- usually aims to construct a model that fulfills the plementation is available. For this work, we reduce the Markov property, i.e., the state at time t + 1 only de- continuous spatial estimates obtained from the track- pends on the state at time t. This means that formulas ing algorithm to a few discrete regions. For ease of may only contain predicates of at most two different modeling and processing of data, we choose a rectan- times, which then must be successive. But in the pre- gular shape. We do not create a uniform grid, but try sented model, the predicates that represent the associ- to respect functional aspects of the environment con- ation of tracks to persons are not time-indexed, which cerning the problem. For example, it does not make makes them static. This makes it difficult to apply sense to further split the office into smaller areas if standard dynamic inference algorithms, which usually there is no distinction for the sensor (everything is one assume the Markov property. But a static variable connected occluded area) and there is only a single goal can be considered as a dynamic variable, for which inside. Even having two separate goals like two work the same value is enforced in every time step. For- stations might not justify the introduction of separate tunately, these static association are only referenced regions for each, as long as the rest of the model can over a limited period of time — the life-time of a track not discriminate between them. We have defined a to- — so they do not pile up over the course of the com- tal of eight regions, which are depicted in Figure 1a. plete sequence. For time resolution we have settled Sticking with a low number of regions also made an ex- for the duration of about one second, which seems like act evaluation of the final model feasible. Depending a good compromise between inference complexity and on the inference approach, there might be no signifi- accuracy for the given problem. cant overhead when using a larger state space for the spatial component. Although, the model engineering The tracking model. The basic functionality for may become more intricate when opting for more fine interfacing with the tracking algorithm is provided grained regions. Taking the characteristics of the sen- by a MLN module that contains objects of the sorts sor into account, a radial layout for regions seems like Track and Person. To notate variables of some sort, a promising approach, but this was not investigated. we use the initial letter of the sort name in lower case. For the sort Track, the letter ’m’ is used be- cause of the ambiguity with sort Time. For both static and dynamic occlusions occur, being caused by sorts Track and Time, there exist time-dependent walls or other persons, respectively. For our experi- predicates atT : Time × Track × Region and atP : ments, only static occlusion information is modeled. Time × Person × Region that give the current loca- This is done by assigning a certain probability to each tion of a track or person, respectively. The time- region that it may contain untracked persons. The independent predicate a : Track → Person associates probability is larger for areas of high static occlusion, Track objects to Person objects. The correspondence like the separate room. We also assign a higher occlu- of tracks to persons inside the MLN is similar to the sion to regions that are more likely to be dynamically correspondence of measurements to tracks inside the covered, like the region around the coffee maker. By tracking algorithm. The output of the tracking algo- assigning low occlusion probabilities to central regions rithm is converted to observations of the atT func- that have a good sensor coverage we penalize persons tion and an additional completely observed predicate silently slipping past the sensor. Formula 5 is provided act : Time × Track, whose purpose is mainly to be once for each region r. The weight wr is the occlusion able to prune groundings of formulas that depend on value, which is given in Figure 1a. inactive track IDs. The usage of this predicate is omit- wr atP (t, p, r) ∧ ¬∃m : (act(t, m) ∧ a(m, p)) (5) ted for clarity. Information about goals of persons can attach to the location of the person objects. The core Goals and their dynamics. The last model part tracking model then consists of the following two for- handles the goals of persons. We associate goals mulas. with regions and add another time-dependent func- 5 atT (t, m, r) ∧ a(m, p) ⇒ atP (t, p, l) (2) tion goal : Time × Person × Region, where goals are also allowed to assume the special location NULL to sig- −3 a(m1 , p) ∧ a(m2 , p) ∧ m1 6= m2 (3) nal that a person currently has no goal. The following Formula 2 probabilistically forces a person to be in formulas describe the dynamics of goals: the same region as its associated track. By design a ∞ goal(t, p, l) ∧ ¬atP (t, p, l) ⇒ goal(t + 1, p, l)(6) track can only be associated to one person at a time ∞ goal(t, p, l) ∧ atP (t, p, l) because the last parameter of the predicate a is de- clared functional. Formula 3 probabilistically enforces ⇒ goal(t + 1, p, l) ∨ goal(t + 1, p, NULL) (7) the association to also be a one-to-one relation. The 0.1 goal(t, p, NULL) (8) engineering of the formula weights is explained at the end of this section. This formula is limited to concur- Formulas 6 and 7 achieve that goals can only be cleared rently instantiated tracks using the act predicate (not when a person reaches their associated region. For- listed). mula 8 encodes the urge of people to clear their goal. Stating this rule in this particular form results in per- sons trying to reach their goal as soon as possible, since The floor plan. We use the static predicate adj : otherwise the penalty accumulates over time. Another Region × Region to encode the connectedness of the way to make people reach their goals is to make it more regions. All instances are fully observed and adhere likely for a person to be inside the region of their goal. to the floor plan given in Figure 1a. A single formula Both rules work equally well for our dataset but might forces persons to move between regions only according make a difference when applied to longer sequences or to the given layout: under a different setting. ∞ atP (t, p, l1 ) ∧ atP (t + 1, p, l2 ) ⇒ adj(l1 , l2 ) (4) Elicitation of weights. There exist two major ways to determine the weights of the probabilistic formulas: This formula is deterministic to prevent persons from Learning from data and elicitation from experts; where “teleporting” through the scene. If regions allow for a for common sense domains, like the one we are dealing traversal in less than a second (one time step), this rule with, everyone is usually an expert. Both the learn- becomes invalid. But since the association of tracks to ing of weights and the direct specification approaches persons allows for some slack, a person in the model have been followed in the literature. For the case of can “catch up” to the location of its real counterpart our related work, Sadilek and Kautz [2010] and Singla after some time steps, only violating Formula 2. and Domingos [2006] are employing learning and Tran and Davis [2008] specify the weights by hand. Due to The occlusion model. To prevent persons without the limited size and the common sense nature of our an associated track from wandering across the scene dataset we decided to specify the weights ourself. (since no track influences their current location), we need to express that persons usually have a track un- The approximation to consider the weight as the loga- less they are indeed occluded. In our setting both rithmic odds of the formula being true [Richardson and pl Domingos, 2006] can serve as a good starting point, logarithmic odds wl = log 1−p l , where pl is the relative but it only holds as long as formulas do not share pred- frequency of a particle of track m being in region l. icates. After assigning some reasonable initial values, we iteratively looked at predictions of the model for selected sequences and adjusted the weights if the pre- 8 The Inference Problem dictions did not conform with our expectations. We began by adjusting the weights for sequences with only Markov Logic Networks can be seen as template mod- one persons and switched to larger test sequences once els for undirected graphical models [Koller and Fried- the model made sensible predictions for the training man, 2009]. Their semantics are defined using the data at hand. Since MLNs are based on undirected ground version of these networks. As such the de- graphical models (which means they are locally un- scribed MLN represents an undirected version of a dy- normalized), there are no absolute correct values, but namic Bayesian network [Murphy, 2002]. The effort for the weights of different formulas have to be balanced exact inference in such models is usually exponential in against each other. the number of variables within one time slice, because most variables within one time step become dependent on each other after some steps in most models. 7 Preprocessing of Tracking The model described in this work also suffers from this Information problem. The cause that all variables of a time slice become dependent lies in the probabilistic data asso- We go on and describe how tracking data is processed ciation; which is a hard problem at its core. In our for input to the MLN model. After extraction of the case the problem of exact inference is exponential in individual objects in the multi-object Bayes filter, we mT + nP , where mT is the maximum number of simul- t obtain a set of single object particles Xm for each track taneous tracks and nP is the total number of persons ID m and time step t. We then apply two data reduc- in the model. Here mT stems from the association tion steps. First, the MLN model works on a coarser predicates and nP are the instantiations of the atP time scale of 1.25 steps per second, while the tracking predicate for one time step. algorithm runs with 12.5 steps per second. We drop the intermediate steps without further processing. A For our evaluation we perform exact inference on different approach might aggregate them, e.g., by av- the model by exploiting context-specific indepen- eraging, but this would also distort the meaning of the dence [Koller and Friedman, 2009, pp. 171]. Given data, because it cannot be considered a snapshot of an assignment to all association variables, the model the situation anymore. factorizes into components for each person and thus becomes tractable. Our largest sequence contained Depending on the quality of the tracking algorithm only 10 tracks, which results in 37 possible associa- and the used object model, there can be many false tions to three persons after observing the correct as- positive tracks, e.g., when people spread their arms sociation for three initial tracks. After conditioning away from their body, crossing the plane of laser on the association variables we calculate the partition beams. To reduce these false tracks, we use the ex- function for each association using variable elimination istence probability to eliminate insignificant tracks. It along a min-degree variable ordering. This approach is t is given by |Xm |/N ; the number of particles for track not suited for online filtering. For this purpose a rao- ID m divided by the total number of particles N . We blackwellized particle filter, which collapses all but the drop all tracks from a time step whose existence prob- association variables, seems like a good solution [Koller ability is below 0.5. For our test sequences, the output and Friedman, 2009, pp. 526]. Evaluating the per- of the tracking algorithm usually contains about thirty formance of this inference approach on the presented tracks per sequence, but only less than ten remain af- model is open for future work. ter applying both reduction processes. For each time step t and each track ID m that survive 9 Evaluation the described process we add the track as active to our MLN model via observation of the act predicate. We We recorded 9 sequences in total; three sequences with then bin the single object particles into the discrete one, two and three persons, respectively. The duration regions. Most of the time all particles are contained of each sequence is about one minute. The course of in a single region and we create an observation of the events is the same among sequences with the same atT function. In cases where the particles of a track number of persons; the sequences vary during the part m spread over several regions we reflect this as proba- where multiple persons wander around the main room. bilistic evidence by adding a formula (wl , atT (t, m, l)) for each location l and calculating the weight as the The setups with one person only feature static occlu- Unassigned Tracks Model M Model MO Model MOG Sequence Persons ↓pD = c ↓pD (x) ↓a− ↑ptrue ↓a− ↑ptrue ↓a− ↑ptrue S1-1532 1 1 1 0 0.34 0 0.99 0 1.00 S1-1640 1 1 1 0 0.34 0 0.99 0 1.00 S1-1737 1 1 1 0 0.34 0 1.00 0 1.00 S2-2056 2 3 2 2 0.10 2 0.27 0 0.54 S2-2207 2 4 4 3 0.00 4 0.01 3 0.03 S2-2329 2 3 3 2 0.09 2 0.26 0 0.51 S3-4628 3 5 1 3 0.25 0 0.45 0 0.89 S3-4734 3 5 3 2 0.15 0 0.52 0 0.55 S3-5306 3 7 4 1 0.13 1 0.12 1 0.25 Table 1: For each of the nine sequences used for evaluation, we give the number of invented tracks minus the number of persons for tracking with constant detection probability pD = c and with state dependent detection probability pD (x). For the three MLN models, we give the number of false associations a− and the probability of the true assignment ptrue . sion caused by walls, caused by the single person stay- person and evaluate how well we can associate new ing inside the office for several seconds. This results tracks. In our dataset, the number of tracks that re- in its track being reinvented upon entering the main main unassigned after labeling the starter tracks varies room again. With two persons there is dynamic occlu- between one and seven. sion, where one person covers the other person. Both The results of our evaluation are given in Table 1. For persons enter the office together and thus cannot be each sequence, we give the number of false track as- distinguished once they reappear. Goal information signments of the most probably association. In ad- for one person can resolve this issue and we can ob- dition, we provide the probability of the correct joint tain a good association again. In the scenes with three association. This is interesting for cases where no false persons, one person enters the office while the two re- associations were made even with a simpler model, and maining persons stay inside the main room. Dynamic can show an improved significance of the correct asso- occlusion occurs while all three persons are walking ciation when using a more sophisticated model. around in front of the sensor. Tracks are lost and recre- ated often, which can also be observed in Figure 1b in- To provide a baseline, the number of track confusions side the area corresponding to region RA. When two and losses of a state-of-the-art multi-object Bayes fil- persons are simultaneously shadowed by the third one, ter with state dependent detection probability (pD (x)) it is not possible to associate the reappearing tracks to and the ones using the same filter with constant detec- the correct persons just by means of the sensor. In this tion probability (pD = c) are given [Reuter and Diet- case goal information can be used to identify the cor- mayer, 2011]. For both filters, the number of persons rect association. inside the scene is subtracted from the total number of significant tracks. If the tracking algorithm works We have evaluated three models that incorporate an perfectly, this number will be zero. The output of the increasing amount of domain knowledge. The first pD = c filter is used as input for the MLN stage; so model M uses only the basic tracking model and the this number equals the number of associations made floor plan. The second model MO comprises all of the by the high-level model and thus equals the possible first model and the static occlusion model. The third maximum number of false associations. model MOG adds information about personal goals. One person visits the coffee maker in each sequence. We as- We observe that the algorithm with the state depen- sign the Coffee region as goal for this person in every dent detection probability reduces the number of unas- sequence. All other persons have no goal assigned. signed tracks dramatically in the scenarios with three The goal information is able to resolve confusions that persons, where a lot of short term occlusions occur. happen before the designated person visits its goal re- In the scenarios with one or two persons, where the gion. This does not happen in every sequence. long-term occlusions due to static objects dominate, the usage of the state dependent detection probability The MLN is instantiated for three persons in every has nearly no influence on the number of unassigned setup, regardless of the number of persons appearing tracks. The high-level model MO using only the floor in the scene. For each model and each sequence, we ob- plan and the static occlusion model delivers results serve the correct association for the first track of each that are at least on par with the state dependent track- R. de Salvo Braz, E. Amir, and D. Roth. A survey of ing algorithm. Using goal information for one person first-order probabilistic models. In D. Holmes and can further improve the results. By our judgment, this L. Jain, editors, Innovations in Bayesian Networks, is not possible by relying solely on the data obtained Studies in Computational Intelligence. Springer, from the laser range finder in general. 2008. One of the two person sequences (S2-2207) shows very V. Gogate and P. Domingos. Probabilistic theorem bad performance of the MLN model for all three cases. proving. In Proceedings of the 27th Conference on Our investigation has shown that an outstretched arm Uncertainty in Artificial Intelligence, pages 256– has caused a significant track that made it through the 265. AUAI Press, 2011. data reduction process. The relatively high weight on D. Helbing and P. Molnar. Social force model for Formula 3 prevented the association of this track to the pedestrian dynamics. PHYSICAL REVIEW E, 51: owner of the arm. Thus, a third person was forced by 4282–4286, 1995. the model to appear at this spot and remained present D. Koller and N. Friedman. Probabilistic Graphical over the course of the scene. Models: Principles and Techniques. MIT Press, 2009. 10 Conclusion R. P. Mahler. Statistical Multisource-Multitarget Infor- mation Fusion. Artech House Inc., Norwood, 2007. We have described an approach to solving the data association problem for person tracking with a high- K. Murphy. Dynamic Bayesian Networks: Representa- level probabilistic model described using Markov Logic tion, Inference and Learning. PhD thesis, University Networks. We showed how to map the output of a of California, 2002. regular tracking algorithm into a discrete spatial rep- S. Reuter and K. Dietmayer. Adapting the state un- resentation, which makes it easy to attach additional certainties of tracks to environmental constraints. In information, e.g., personal goals, and allows the use of Proceedings of the 13th International Conference on inference techniques for discrete probabilistic models. Information Fusion, pages 1–7, 2010. Especially in scenarios with long-term occlusions, S. Reuter and K. Dietmayer. Pedestrian tracking using where even the multi-object Bayes filter is not able to random finite sets. In Proceedings of the 14th Inter- continue to track hidden objects, the association using national Conference on Information Fusion, pages MLN outperforms the tracking-based approach when 1–8, 2011. using exact evaluation. On the other hand, a sophisti- M. Richardson and P. Domingos. Markov logic net- cated tracking algorithm is adequate in scenarios with works. Machine Learning, 62(1-2):107–136, 2006. occlusions of no more than one second and might be A. Sadilek and H. Kautz. Recognizing multi-agent able to scale more easily to larger domains. activities from GPS data. In Proceedings of 24th In future, we plan to integrate more information out AAAI Conference on Artificial Intelligence, pages of the knowledge-base directly into the tracking algo- 1134–1139, 2010. rithms like, e.g., probabilistically modeled destinations H. Sidenbladh and S.-L. Wirkander. Tracking random or goals of a person. sets of vehicles in terrain. In Proceedings of the Con- ference on Computer Vision and Pattern Recogni- Acknowledgements tion, page 98, 2003. P. Singla and P. Domingos. Entity resolution with This work is done within the Transregional Collab- markov logic. In Data Mining, 2006. ICDM’06. orative Research Centre SFB/TRR 62 “Companion- Sixth International Conference on, pages 572–582, Technology for Cognitive Technical Systems” funded 2006. by the German Research Foundation (DFG). S. Thrun, W. Burgard, and D. Fox. Probabilis- tic Robotics (Intelligent Robotics and Autonomous References Agents). The MIT Press, 2005. C. M. Bishop. Pattern Recognition and Ma- S. Tran and L. Davis. Event modeling and recogni- chine Learning (Information Science and Statistics). tion using markov logic networks. Computer Vision– Springer, 2006. ECCV 2008, pages 610–623, 2008. S. Blackman and R. Popoli. Design and Analysis of J. Wang and P. Domingos. Hybrid markov logic net- Modern Tracking Systems. Artech House Publishers, works. In Proceedings of the 22th AAAI Conference 1999. on Artificial Intelligence, pages 1106–1111, 2008.