AdsISee: Advertisement Detection and Tracking for Sponsorship Evaluation in Soccer Matches

Alexander Westermann∗, Institute of Applied Information Technology, Zurich University of Applied Sciences, Winterthur, Switzerland, alw.westermann@gmail.com
Philipp Krayer∗, Institute of Applied Information Technology, Zurich University of Applied Sciences, Winterthur, Switzerland, philipp.krayer@protonmail.com
Andreas Weiler, Institute of Applied Information Technology, Zurich University of Applied Sciences, Winterthur, Switzerland, andreas.weiler@zhaw.ch


    Figure 1: Example of AdsISee with three ads in the soccer match Sweden against Switzerland at the World Cup 2018.
ABSTRACT
In this work, we present AdsISee, a real-life application for the detection and tracking of advertisements in broadcasts of soccer matches, which supports business analysts in the task of sponsorship evaluation and reporting. Our approach is based on different combinations of several techniques for object detection and tracking in images. In contrast to other works, which use neural networks, we use alternative solutions to detect advertisements based on provided pre-defined image templates and without any training period. This made it possible to build an application that can be executed on standard hardware while still providing feasible performance. Furthermore, our evaluations show that we can achieve results comparable to those of existing approaches for sponsorship evaluation that use neural networks.

1    INTRODUCTION AND MOTIVATION
The market for advertisements in sports events is tremendous. For example, advertisers had to pay US $5.25 million to air a 30-second commercial during the Super Bowl 2019 broadcast [12]. However, besides the advertisements that are explicitly shown to the television viewers, there are advertisements that are placed directly in the sports events themselves. These advertisements are shown on the margins of the playing field as perimeter advertising, on the clothes of the players, or somewhere else in the real-world environment of the sports event. For example, the FIFA World Cup 2018 brought an additional US $2.4 billion to the global advertising market and the sponsors paid up to US $200 million for a sponsorship package [3]. The target audience of these advertisements is primarily the on-site visitors. However, the television viewers are also important receivers of this kind of advertisement. For example, an average of about 191 million viewers watched the soccer matches at the World Cup 2018 as live broadcasts [4]. Additionally, millions of viewers watch full or recapped recordings of soccer matches at any time. Given these numbers, every second in which the advertisements can be seen on screen by the viewers is important.
   In this work, we present AdsISee, a real-life application for the detection and tracking of pre-defined advertisements in the frames of soccer match broadcasts. By using AdsISee, we are able to generate a sponsorship report to support business analysts in their task of sponsorship evaluation and further analysis. We use different combinations of several techniques for object detection and tracking, such as the FLANN [9] matcher or the MOSSE [7] tracker. In contrast to other works, which use neural networks [1, 8] or the Haar cascade detector [11] and therefore need to be trained on the detectable object beforehand, we use alternative solutions to detect advertisements based on provided templates and without any training period. One advantage of our implementation¹ is that it can be executed on standard hardware while still providing feasible performance. Therefore, the main goal of this work is to evaluate various technologies in the field of visual computing and to apply them in a domain-specific area in order to detect objects without the use of machine learning methods. This offers the advantages of less configuration, no manual annotation of advertisements, and no learning process in order to successfully detect the advertisements. Our experiments (see Section 3) show that it is possible to detect and track advertisements in different quality levels. To evaluate our work, we created several case studies for different soccer matches. Furthermore, we compare our application against the solution of Orpix ComputerVision Inc. [5], the market leader in the area of sponsorship evaluation, which uses trained neural networks in order to detect the advertisements in broadcast soccer matches. The detailed evaluations provide insights into the advantages and disadvantages of the technologies used to detect and track advertisements.

∗ Both authors contributed equally to this research.
¹ https://github.com/AdsISee/AdsISee (November 20, 2019)
© 2020 Copyright for this paper by its author(s). Published in the Workshop Proceedings of the EDBT/ICDT 2020 Joint Conference (March 30-April 2, 2020, Copenhagen, Denmark) on CEUR-WS.org. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

Figure 2: The three possible perspectives of a banner ad.


2     METHODOLOGY
Our solution detects advertisements on the margins of the playing field in broadcasts of soccer matches based on templates in the format of pre-defined images. The broadcast of the soccer match is divided into its individual frames, which are used as target images and are processed in a streaming fashion one after the other. In each of the target images, our solution tries to detect the template images of the matching advertisements. In a first step, relevant technologies for object detection in the domain of advertisements were evaluated. By analyzing a common set of advertisements in soccer matches, we were able to discover certain properties that can be used for the recognition of the ads in the target images. One of the main characteristics is the use of clearly distinguishable colors with high contrast, which helps both the viewers and our application to recognize the exposed brands. By analyzing and extracting the different color components of the template, it is possible to filter out large areas of the target image that do not correspond to the colors of the template. Accordingly, the search area for the advertisements can already be severely reduced.

2.1    Feature Detection
Another unique characteristic of advertisements on banners is the simple geometry of the exposed logos. Edges, corners and flat surfaces can be extracted from the advertisement templates in order to match them in the target images. Therefore, two different methods have been applied to detect those unique features with their positions from the templates and target images. One of the main challenges with this approach is that in most cases the perspectives and sizes of the advertisements in the target images do not correspond to the advertisements in the templates. The perspectives of the advertisements in the target images depend on the angle and position of the camera recording the sports event. We solved this issue by using the Scale-Invariant Feature Transform (SIFT) in our implementation, a technology that was originally designed for panoramic image stitching [2]. By applying the SIFT algorithm to the advertisement templates and target images, unique features (cf. Figure 3) such as corners, edges and flat areas can be extracted regardless of their scaling and perspective. To improve the accuracy of the feature detection, each advertisement template is scaled to one of the three main perspectives in which the banners occur during the broadcasts. These perspectives are the directly visible frontal view and the positions on the left or right of the field (cf. Figure 2).

Figure 3: Example of detected features in a template.

2.2    Feature Matching
After extracting the features from the template and target, each feature from the template has to be searched for a match with a feature from the target. An example can be seen in Figure 4. The exact position of the advertisement in the target image can be calculated as soon as the number of matches exceeds a threshold. In our implementation, we use the Fast Library for Approximate Nearest Neighbors (FLANN) to calculate matches between the template and target image. The FLANN matcher calculates the nearest neighbors between the properties of the detected features, and the similarity of each pair is represented by a distance. By applying a threshold for acceptable distances, we are able to distinguish between correct matches, which belong to the advertisement, and incorrect matches.

Figure 4: Example of feature extraction and matching.

   The Brute-Force matcher, which compares each feature from the template with all the extracted features from the target image and matches the features with the smallest difference [6], offers a faster alternative to the FLANN matcher. Our evaluations show that the Brute-Force matcher was able to perform faster than the FLANN matcher; however, the accuracy was slightly reduced. Accordingly, our solution offers the Brute-Force matcher in addition to the FLANN matcher for cases where the performance has to be maximized.

2.3    Matching Multiple Advertisements
During our evaluations, we found that the matching has major issues when more than one instance of an identical advertisement is visible in the target image. For example, in Figure 1 the application needs to detect the advertisements of McDonalds and Visa more than once. In this case, both the Brute-Force and the FLANN matcher detect identical features of the ads, and therefore it is impossible for feature matching to distinguish the individual features between the same advertisements. To solve this problem, we have developed our own solution, which makes it possible to differentiate extracted features between the same advertisements. After an advertisement has been detected, the search for the same advertisement is repeated, excluding the features of the already matched advertisement. This procedure is repeated until no new advertisements are detected. This allows our solution to allocate each feature from the template advertisement to multiple features in the target image according to the number of identical visible advertisements.
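The repeated search described in Section 2.3 can be illustrated with a small, self-contained sketch. It is not the AdsISee implementation: it assumes toy 2-D descriptors instead of SIFT descriptors, a plain exhaustive nearest-neighbor search instead of FLANN, and hypothetical names and parameters (`match_once`, `detect_all_instances`, `max_dist`, `min_matches`). A real pipeline would additionally verify the geometry of each match set.

```python
import math


def match_once(template, target, available, max_dist):
    """Match each template descriptor to its nearest still-available target
    descriptor, accepting only distances below max_dist (assumed threshold)."""
    matches = []
    used = set()
    for t_idx, t_desc in enumerate(template):
        best, best_d = None, max_dist
        for g_idx in sorted(available):
            if g_idx in used:
                continue
            d = math.dist(t_desc, target[g_idx])
            if d < best_d:
                best, best_d = g_idx, d
        if best is not None:
            used.add(best)
            matches.append((t_idx, best))
    return matches


def detect_all_instances(template, target, max_dist=0.1, min_matches=3):
    """Repeat the search, excluding target features of already matched
    instances, until no further instance reaches min_matches."""
    available = set(range(len(target)))
    instances = []
    while True:
        matches = match_once(template, target, available, max_dist)
        if len(matches) < min_matches:
            break
        instances.append(matches)
        available -= {g_idx for _, g_idx in matches}
    return instances


# Toy data: the same three-feature template appears twice, plus clutter.
template = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target = [
    (0.01, 0.0), (1.0, 0.01), (0.0, 0.99),   # first copy
    (0.0, 0.02), (0.98, 0.0), (0.02, 1.0),   # second copy
    (5.0, 5.0), (3.0, 3.0),                  # unrelated features
]
instances = detect_all_instances(template, target)
print(len(instances), "instances found")
```

Excluding the matched target features after each round is what lets the same template feature be assigned to several target features, one per visible copy of the advertisement.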


2.4    Tracking
The prerequisite for feature detection and matching is that the searched object has to be sharp. However, due to the movement of the camera, the advertisements often appear blurry in the target image. As a result, no sharp edges or corners can be extracted from the target image and the application is not able to detect the advertisement. In order to detect blurred advertisements nevertheless, we apply tracking technologies that follow the movement of advertisements in the target image once they have been detected. After evaluating various tracking technologies, we decided to implement the Median Flow [13] and the MOSSE [7] tracker, which show high performance and accuracy in our experiments. Once an advertisement has been detected through SIFT and feature matching, it is registered in a tracker. For all following frames, the tracker is updated and determines the exact position of the tracked advertisement. By calculating a matrix for the perspective transformation from the previously detected advertisement, we are able to reconstruct the exact place and perspective of the tracked advertisement.
   As an added benefit, this approach increases the performance of the entire software. The detection of advertisements through feature matching is a relatively time-consuming process. Once an advertisement has been detected in a frame, it can be tracked for all subsequent frames and thus no longer has to be discovered again. Besides advertisement tracking, this technology can also be used to track advertisement-free areas. As soon as no advertisements have been detected in a frame, the complete target image is marked for tracking and excluded from all following advertisement searches (cf. Figure 5). The impact of the advertisement and empty area tracking on accuracy and performance is tested in a detailed evaluation (cf. Section 3).

Figure 5: Visualization of empty area tracking.

Target compression:    Yes          No
Average accuracy:      71%          88%
No. of errors:         1            0
Average time:          1.03 sec.    4.12 sec.
Table 1: Evaluation of target compression
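The perspective reconstruction mentioned in Section 2.4 can be sketched as follows. This is a minimal illustration under assumptions, not the AdsISee code: it assumes the perspective transformation is a 3x3 homography applied in homogeneous coordinates, and the matrix values, the template size and the function names are hypothetical.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography given as nested lists."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)


def project_template_corners(H, width, height):
    """Project the template rectangle into the target image.
    Corner order: top-left, top-right, bottom-right, bottom-left."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    return [apply_homography(H, c) for c in corners]


# Hypothetical matrix: translation by (10, 20) plus a mild perspective term.
H = [
    [1.0, 0.0, 10.0],
    [0.0, 1.0, 20.0],
    [0.001, 0.0, 1.0],
]
quad = project_template_corners(H, width=200, height=50)
print(quad)
```

The four projected corners describe the outline of the advertisement in the frame, which is the region a tracker would then follow.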


3     EXPERIMENTS
In a detailed evaluation, we tested both the accuracy and the performance of our application in advertisement detection in broadcasts of sports events. In the first series of experiments, the influence of the individual components has been evaluated. All the experiments have been performed on an Ubuntu machine with 2.01 GHz, 8 CPU cores and 16 GB RAM.

3.1    Evaluation of Functionalities
In a first experiment, we tested the impact of compressing the target in a pre-processing step. We reduced the resolution of the target image by 50% and compared it with a test run without target compression.
   The results in Table 1 show that reducing the resolution of the target image by 50% cuts the average processing time from 4.12 to 1.03 seconds, a speed-up by a factor of about four. However, the average accuracy decreases from 88% to 71%. Reducing the resolution erases some detectable features, since small edges and corners disappear. Accordingly, fewer features have to be matched in order to detect the advertisement, which shortens the search process. If the advertisement is poorly visible, not enough features can be extracted for a successful match, which explains the reduction in accuracy.
   In another experiment, we tested the impact of each individual feature on the performance and accuracy of the advertisement detection.
Color filtering   Matching algorithm   Tracking algorithm   Average time (sec.)   Average accuracy
Off               Brute-Force          MF                   4.75                  84%
On                Brute-Force          MF                   4.65                  82%
Off               Brute-Force          MOSSE                4.47                  82%
On                Brute-Force          MOSSE                4.61                  83%
Off               FLANN                MF                   5.25                  86%
On                FLANN                MF                   4.78                  85%
Off               FLANN                MOSSE                5.40                  85%
On                FLANN                MOSSE                5.06                  85%
Table 2: Comparison of various functionalities (MF = Median Flow)
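As a rough cross-check, the per-setting averages of Table 2 can be recomputed from its rows. The sketch below only uses the numbers printed in the table; the helper name `averages_by` and the tuple layout are illustrative choices, and the time unit is assumed to be seconds.

```python
from statistics import mean

# Rows of Table 2:
# (color filtering, matcher, tracker, average time, average accuracy in %)
TABLE_2 = [
    ("Off", "Brute-Force", "MF", 4.75, 84),
    ("On", "Brute-Force", "MF", 4.65, 82),
    ("Off", "Brute-Force", "MOSSE", 4.47, 82),
    ("On", "Brute-Force", "MOSSE", 4.61, 83),
    ("Off", "FLANN", "MF", 5.25, 86),
    ("On", "FLANN", "MF", 4.78, 85),
    ("Off", "FLANN", "MOSSE", 5.40, 85),
    ("On", "FLANN", "MOSSE", 5.06, 85),
]


def averages_by(column):
    """Group the rows by one setting column and return
    {setting: (mean time, mean accuracy)}."""
    groups = {}
    for row in TABLE_2:
        groups.setdefault(row[column], []).append(row)
    return {
        key: (mean(r[3] for r in rows), mean(r[4] for r in rows))
        for key, rows in groups.items()
    }


by_matcher = averages_by(1)
by_filter = averages_by(0)
for name, (avg_time, avg_acc) in by_matcher.items():
    print(f"{name}: {avg_time:.2f} s, {avg_acc:.2f} %")
```

Grouped by matcher, FLANN comes out roughly half a second slower and about 2.5 percentage points more accurate than the Brute-Force matcher.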


   This evaluation shows that filtering out the irrelevant colors in a pre-processing step increases the performance slightly, but results in a slight decrease in accuracy (cf. Table 2). This can be explained by the color filter interfering with the features of advertisements in the target images. The evaluation of the two different feature matchers shows that the FLANN matcher performs better than the Brute-Force matcher in terms of accuracy. However, the FLANN matcher took on average 0.5 seconds longer than the Brute-Force matcher to calculate the matches between the advertisement in the target image and the template image. In addition to the matching algorithms, the two different tracking methods have been compared. The MOSSE tracker showed a slightly better performance without a significant reduction in accuracy compared with the Median Flow tracker. According to these results, all features have been implemented in our solution and the user can decide whether the performance or the accuracy should be prioritized for the advertisement evaluation.

3.2    Ideal Matching Difference
In order to successfully match the features from the advertisement templates with the extracted features from the target image, it is necessary to filter out the wrong matches. Each extracted feature from the template advertisement is matched with the most similar feature from the target image and the difference between the two features is calculated. If there are no advertisements in the target image, a non-existent advertisement will still be matched, but with features whose differences are much higher than those of a correctly matched advertisement. To prevent this, a threshold is defined for the highest acceptable difference in matches between features. A threshold that is too high would result in matches that do not belong to an advertisement, while a threshold that is too low would filter out correct matches. The following experiment has been performed for the purpose of finding the ideal threshold for acceptable matching differences. The results in Figure 6 show that the ideal threshold for acceptable matching differences should be between 0.7 and 0.75 to achieve the best results. This threshold is implemented accordingly in our solution.

Figure 6: Accuracy with different thresholds

3.3    Ground Truth Evaluation
In this evaluation phase, we ran several tests to optimize the configuration of AdsISee for maximal accuracy in advertisement detection. The goal was to test the software on excerpts from live broadcasts of sports events and determine its overall accuracy. In a first run, the software was tested without the use of tracking. This determined the accuracy of the plain detection process of advertisements. 30 frames with clearly visible advertisements and 30 frames without advertisements have been selected and tested on this software. 71% of all ads have been successfully detected without any incorrectly detected non-existing advertisements. In some cases, the advertisements could be detected although they were partially covered (cf. Figure 7).

Figure 7: Partially covered ad could be detected

   In a second run, AdsISee was evaluated on various video scenarios of live broadcast soccer matches. In each video clip, the advertisements were visible with different properties, which tested the limitations of the software.

Camera movement:   Sudden appearance of ad:   Sudden disappearance of ad:   Accuracy:
Slowly             No                         No                            98%
Slowly             Yes                        No                            97%
Slowly             Yes                        Yes                           98%
Fast               No                         No                            85%
Fast               No                         Yes                           91%
Fast               Yes                        Yes                           70%
Table 3: Comparison of various video scenarios

   Table 3 shows that, on average, our solution could detect the advertisements with a high accuracy of 90%. The tracker was able to track the movements of the advertisements even during fast camera movements. Sudden appearances of the advertisements were always correctly recognized by the feature detection and matching component after a maximum of 3 frames. However, in the last test video, the advertisement slowly faded away through an animation on the banner screen. Since the tracker technologies used cannot detect the disappearance of an object that slowly fades away, the position of the advertisement continued to be tracked even though the actual advertisement had already disappeared. This resulted in a sharp drop in the measured accuracy.
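The 90% figure quoted for Table 3 can be reproduced with a few lines of arithmetic. The sketch below only uses the accuracies listed in the table; the variable names and the split by camera movement are illustrative additions.

```python
from statistics import mean

# Rows of Table 3:
# (camera movement, sudden appearance, sudden disappearance, accuracy in %)
TABLE_3 = [
    ("Slowly", False, False, 98),
    ("Slowly", True, False, 97),
    ("Slowly", True, True, 98),
    ("Fast", False, False, 85),
    ("Fast", False, True, 91),
    ("Fast", True, True, 70),
]

overall = mean(row[3] for row in TABLE_3)
slow = mean(row[3] for row in TABLE_3 if row[0] == "Slowly")
fast = mean(row[3] for row in TABLE_3 if row[0] == "Fast")

print(f"overall: {overall:.1f} %")  # mean over all six scenarios
print(f"slow camera: {slow:.1f} %, fast camera: {fast:.1f} %")
```

The overall mean rounds to 90%, and the split shows that most of the loss comes from the scenarios with fast camera movement, in particular the last one with the fading advertisement.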

3.4    Comparative Evaluation
Orpix ComputerVision Inc. offers a cloud-based solution for evaluations of advertisement occurrences during live broadcasts of sports events. This solution uses state-of-the-art convolutional neural networks [10] to detect the advertisements and processes the target images at a frame rate of 1 FPS. Neural networks provide excellent results in object recognition, but have the disadvantage that they need to be trained on the object beforehand by an elaborate process. This involves annotating advertisements in hundreds of example templates by hand.
   In this test, the accuracy of the product of Orpix is compared to that of our solution. The goal is to show the advantage of AdsISee: it can perform a sponsorship evaluation given only proper advertisement templates, without having to train any algorithm or make any configurations beforehand. Orpix provides one free online example of a sponsorship evaluation, covering the final game France versus Croatia at the FIFA World Cup 2018, which is used for the comparison against our solution. The computational performance of their solution is not mentioned by Orpix. Accordingly, no accurate comparison in performance can be made between AdsISee and the solution of Orpix.
   Based on 5 different advertising templates, the software was tested on randomly selected video sequences of the World Cup 2018 final game. The tagged advertisements were compared with the individual frames from the example evaluation from Orpix.

Template:        Accuracy Orpix:   Accuracy AdsISee:
Wanda            93.75 %           90.00 %
Hyundai          88.50 %           73.25 %
Qatar Airways    83.25 %           91.00 %
Hisense          99.50 %           92.75 %
Gazprom          96.25 %           40.00 %
Table 4: Comparison between Orpix and AdsISee.

   All tests were performed on an Ubuntu system (2.01 GHz, 8 CPUs, 16 GB RAM). Our application needs 500 MB RAM and processes the frames at an average of 0.981 seconds per frame. The results (cf. Table 4) show that our software performs with an average accuracy of 77.40 %, which is lower than that of the solution from Orpix with an average accuracy of 92.25 %. This difference can be explained by the occurrence of animations on the advertising banners, which could only be observed in this particular game. Normally, static banners are used in sports events. As a result, the tracker was not able to detect the change of advertisements on the banners and false advertisements were tracked in all subsequent frames. In addition, the advertisement templates were only available in reduced quality, which limited the number of extractable features (as explained in Section 3.5). Also, in contrast to AdsISee, the solution of Orpix was able to detect advertisements that were far away from the camera and hardly recognizable even by the human eye.
   Nevertheless, in some cases our software was able to detect the advertisements more precisely than Orpix. Often, the solution of Orpix detected the advertisements twice or more, resulting in an inaccurate tagging of the corresponding advertisement (cf. Figure 8). In some cases, several adjacent advertisements have been marked as one single advertisement. Additionally, our solution is able to determine the positions of the advertisement with better precision than the solution of Orpix. Unfortunately, the report about the final match at the FIFA World Cup 2018 was the only report that Orpix provided, and therefore our comparative evaluation is based on just this single event and report.

Figure 8: Example of an advertisement that is tagged twice by Orpix.

3.5     Problems Encountered
In order to extract as many different features as possible from the advertising template, the images have to be provided in good quality. During our search for advertisement templates, we were not able to find any high-quality templates that matched the commercials which appeared on the banners. Accordingly, we extracted the advertisements from broadcasts of sports events manually, acquiring templates with slightly reduced quality. Thus, the tests were not performed under the best possible conditions. Therefore, it can be assumed that AdsISee can achieve even better results when high-quality advertisement templates are provided as input.
   Our applied tracking technologies are ideal for tracking the movement of detected advertisements across the screen. The tracker is able to recognize if the advertisement suddenly disappears and terminates the tracking phase accordingly. However, if the banner changes the advertisement through an animation, the tracker is not able to detect this and the wrong advertisement continues to be tracked. This resulted in a reduced accuracy in some of our experiments.
   During the evaluation phase, we observed that in some rare cases non-existent advertisements have been falsely detected (cf. Figure 9). We were able to partially solve this issue by implementing a filter which checks the positions and relations of the detected advertisements. This filter prevented most of the false detections we encountered during our evaluations. The filter validates the calculated frame of the advertisement before the detected object is registered for tracking. We defined the following conditions, which have to be met by the detected object to recognize it as an advertisement:
      • The edges of the marked object must not cross each other.
      • The aspect ratio of the detected object must match that of the template advertisement.
                                                                       Figure 11: Detection and Tracking of ads in hockey games
                                                                       with AdsISee.


                                                                       5    CONCLUSIONS AND FUTURE WORK
    Figure 9: Example of a falsely detected advertisement.
                                                                       In this work, we have demonstrated that it is possible to detect
                                                                       advertisements in broadcasts of sport events, especially soccer
                                                                       matches, for sponsorship evaluation and analysis. We were able
4     DEMONSTRATION                                                    to apply alternative technologies for object detection in a specific
To demonstrate the functionality and usefulness of AdsISee we          domain and show certain advantages compared to the state-of-
created a video compilation2 with examples of the detection and        the-art technologies. We have successfully implemented a proto-
tracking of several advertisements in soccer and hockey matches.       type that can detect advertisements in broadcasts of sport events
As templates we use a selection of 18 different advertisements (cf.    by only providing templates of advertisements. AdsISee extracts
Figure 10) of various companies. We tried to cover a wide range        unique features such as color, edges and corners out of the tem-
of colors and shapes with our template collection. Our demonstra-      plate and matches those with each individual frame from the live
tion provides examples for the detection and tracking of single        broadcast. With the implemented tracking technologies, it can
or multiple advertisements in static and also fast-moving frames.      track the movement of the advertisements across the
We also demonstrate in the video that AdsISee occasionally has         screen. We tested the prototype for accuracy and performance
problems (see Section 3.5) with the detection and tracking of the      in an extensive evaluation phase. In addition, AdsISee was com-
correct objects. For example, the country name “USA” in combi-         pared with a product of Orpix which is the leader in the area
nation with the blue color and the banner format (see timeframe        of sponsorship evaluation in sport events, where similar results
0:39 to 1:30 in the video) is detected and tracked as the Visa         have been measured. Our solution showed some advantages over
advertisement.                                                         the product of Orpix; for example, the tagged position of the
                                                                       advertisement was calculated with better precision than with
                                                                       the product of Orpix.
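The matching step described above can be illustrated with a minimal sketch. It assumes that template and frame features are already available as binary descriptors (represented here as plain integers) and uses a brute-force Hamming matcher with a ratio test. The function names, the ratio of 0.75, and the minimum match count are illustrative assumptions, not the actual AdsISee implementation.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(template: List[int], frame: List[int],
                      ratio: float = 0.75) -> List[Tuple[int, int, int]]:
    """Brute-force matching with a ratio test (ratio value is an assumption).

    Returns (template_index, frame_index, distance) for every template
    descriptor whose best match is clearly better than its second best.
    """
    matches = []
    for ti, t in enumerate(template):
        # Compare against every frame descriptor (brute force).
        ranked = sorted((hamming(t, f), fi) for fi, f in enumerate(frame))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        (best, best_idx), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((ti, best_idx, best))
    return matches


def template_found(template: List[int], frame: List[int],
                   min_matches: int = 3, ratio: float = 0.75) -> bool:
    """Hypothetical decision rule: enough unambiguous matches in the frame."""
    return len(match_descriptors(template, frame, ratio)) >= min_matches


if __name__ == "__main__":
    template = [0b11110000, 0b10101010, 0b00001111]
    frame = [0b11110001, 0b01010101, 0b10101010, 0b00001111, 0b11001100]
    print(match_descriptors(template, frame))
    print(template_found(template, frame))
```

In practice the descriptors would come from a feature detector, and a frame would only be tagged if the matched points also pass the geometric filters.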
                                                                           For future work we plan to extend our approach to further
                                                                       sport events and also to other detection and tracking areas in the
                                                                       frames, where ads could be placed. For example, Figure 11 shows
                                                                       the result of our approaches to detect and track advertisements
                                                                       in hockey games on the margins of the playing field, as well as on
                                                                       the jerseys of the players. Additionally, many improvements can
                                                                       still be implemented and tested in the future. The filter for
                                                                       detecting false positives could be improved by comparing the
                                                                       colors of the matched advertisement with those of the template.
                                                                       The advertisement would not be tagged if the colors do not match,
                                                                       which would also eliminate the issue with animations. Furthermore,
                                                                       since most advertisements contain a large part of text, adding an
Figure 10: Advertisement templates used for the demon-                 additional text recognition feature could improve the accuracy
stration video.                                                        of AdsISee.
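The color-based false-positive filter proposed above could look roughly like the following sketch. It assumes that the template and the matched region are available as lists of RGB tuples and compares coarse color histograms by histogram intersection. The bin count, the threshold of 0.6, and all function names are assumptions chosen for illustration, since this filter is future work and not part of the described prototype.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Pixel = Tuple[int, int, int]
Histogram = Dict[Tuple[int, int, int], float]


def color_histogram(pixels: Iterable[Pixel], bins: int = 4) -> Histogram:
    """Normalized coarse RGB histogram (bins per channel is an assumption)."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot build a histogram from an empty region")
    return {key: value / total for key, value in counts.items()}


def histogram_intersection(h1: Histogram, h2: Histogram) -> float:
    """Overlap of two normalized histograms, between 0.0 and 1.0."""
    return sum(min(value, h2.get(key, 0.0)) for key, value in h1.items())


def colors_match(template_pixels: Iterable[Pixel],
                 candidate_pixels: Iterable[Pixel],
                 threshold: float = 0.6, bins: int = 4) -> bool:
    """Hypothetical filter: keep a detection only if its colors resemble the template."""
    overlap = histogram_intersection(color_histogram(template_pixels, bins),
                                     color_histogram(candidate_pixels, bins))
    return overlap >= threshold


if __name__ == "__main__":
    template = [(10, 20, 200)] * 80 + [(250, 250, 250)] * 20   # mostly blue
    similar = [(12, 25, 210)] * 70 + [(245, 245, 245)] * 30    # similar blue banner
    different = [(220, 10, 10)] * 100                          # red region
    print(colors_match(template, similar))
    print(colors_match(template, different))
```

A detection whose histogram overlap falls below the threshold would simply not be tagged. The threshold would have to be tuned on real broadcast footage.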

   In addition to the video output with the marked advertisements,
AdsISee creates a report as a standard text file containing the
information about the sequences, frames, and the detected ads. This
report can be used to visualize (cf. Figure 1) and analyze complete
broadcasts of sport events in a compact view. For example, the report
can be grouped by time sequences, advertisements, or the number of
detected advertisements. This can support analysts in figuring out
the sequences with the most advertisements or comparing the on-screen
time of their own advertisement with others.

2 https://youtu.be/KReFUcKiw4E (November 27, 2019)

REFERENCES
 [1] Khaled Almgren, Murali Krishnan, Fatima Aljanobi, and Jeongkyu Lee. 2018.
     AD or Non-AD: A Deep Learning Approach to Detect Advertisements from
     Magazines. Entropy 20, 12 (2018).
 [2] Matthew Brown and David G. Lowe. 2006. Automatic Panoramic Image
     Stitching using Invariant Features. International Journal of Computer Vision
     74, 1 (2006), 1–15.
 [3] Marketing Charts. 2018. World Cup 2018 Stats.
     https://www.marketingcharts.com/industries/sports-industries-83790.
     [Online; accessed 25-October-2019].
 [4] Fifa.com. 2019. More than half the world watched record-breaking 2018
     World Cup.
     https://www.fifa.com/worldcup/news/more-than-half-the-world-watched-record-breaking-2018-world-cup.
     [Online; accessed 25-October-2019].
 [5] Orpix ComputerVision Inc. [n.d.]. Sponsorship Analytics and
     Evaluation-Orpix-Computer Vision.
     http://www.orpix-inc.com/sponsorship-valuation-analytics/.
     [Online; accessed 25-October-2019].
 [6] Amila Jakubovic and Jasmin Velagic. 2018. Image Feature Matching and Object
     Detection Using Brute-Force Matchers. 2018 International Symposium ELMAR
     (2018).
 [7] Peter Janku, Karel Koplik, Tomas Dulik, and Istvan Szabo. 2016. Comparison
     of tracking algorithms implemented in OpenCV. MATEC Web of Conferences
     76 (2016), 1–6.
 [8] Shervin Minaee, Imed Bouazizi, Prakash Kolan, and Hossein Najafzadeh. 2018.
     Ad-Net: Audio-Visual Convolutional Neural Network for Advertisement De-
     tection In Videos. CoRR (2018).
 [9] Marius Muja and David G. Lowe. 2009. Fast Approximate Nearest Neighbors
     with Automatic Algorithm Configuration. In Proc. of the Fourth International
     Conference on Computer Vision Theory and Applications (VISAPP 2009). 331–
     340.
[10] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2017. Faster R-CNN:
     Towards Real-Time Object Detection with Region Proposal Networks. IEEE
     Transactions on Pattern Analysis and Machine Intelligence 39, 6 (2017),
     1137–1149.
[11] Sander Soo. 2014. Object detection using Haar-cascade Classifier. Institute of
     Computer Science, University of Tartu.
[12] Statista.com. 2019. Super Bowl average costs of a 30-second TV advertisement
     from 2002 to 2019 (in million U.S. dollars). https://www.statista.com/statistics/
     217134/total-advertisement-revenue-of-super-bowls/. [Online; accessed
     25-October-2019].
[13] Zdenek Kalal, Krystian Mikolajczyk, and Jiri Matas. 2010. Forward-Backward
     Error: Automatic Detection of Tracking Failures. In 20th International Confer-
     ence on Pattern Recognition (ICPR). 2756–2759.