=Paper=
{{Paper
|id=Vol-3181/paper16
|storemode=property
|title=Overview of the EEG Pilot Subtask at MediaEval 2021: Predicting Media Memorability
|pdfUrl=https://ceur-ws.org/Vol-3181/paper16.pdf
|volume=Vol-3181
|authors=Lorin Sweeney,Ana Matran-Fernandez,Sebastian Halder,Alba Garcia Seco de Herrera,Alan Smeaton,Graham Healy
|dblpUrl=https://dblp.org/rec/conf/mediaeval/SweeneyM0HSH21
}}
==Overview of the EEG Pilot Subtask at MediaEval 2021: Predicting Media Memorability==
Lorin Sweeney¹, Ana Matran-Fernandez², Sebastian Halder², Alba G. Seco de Herrera², Alan Smeaton¹, Graham Healy¹

¹ School of Computing, Dublin City University
² School of Computer Science and Electronic Engineering, University of Essex

lorin.sweeney8@mail.dcu.ie, {amatra,s.halder,alba.garcia}@essex.ac.uk, {alan.smeaton,graham.healy}@dcu.ie

Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). MediaEval’21, December 13–15 2021, Online.

===Abstract===
The aim of the Memorability-EEG pilot subtask at MediaEval’2021 is to promote interest in the use of neural signals—either alone or in combination with other data sources—in the context of predicting video memorability by highlighting the utility of EEG data. The dataset created consists of pre-extracted features from EEG recordings of subjects while watching a subset of videos from Predicting Media Memorability subtask 1. This demonstration pilot gives interested researchers a sense of how neural signals can be used without any prior domain knowledge, and enables them to do so in a future memorability task. The dataset can be used to support the exploration of novel machine learning and processing strategies for predicting video memorability, while potentially increasing interdisciplinary interest in the subject of memorability, and opening the door to new combined EEG-computer vision approaches.

===1 Introduction and Related Work===
Even though the nature and constitution of people’s memories remain elusive, and our understanding of what makes one thing more/less memorable than another is still nascent, combining computational (e.g., machine learning) and neurophysiological (e.g., electroencephalography; EEG) tools to investigate the mechanisms (formation and recall) of memory may offer insights that would be otherwise unobtainable. While EEG is not a tool that can directly explain what makes a video more/less memorable, it can help us trim the umbral undergrowth surrounding the subject, shedding light on, and offering a potential leap forward in our understanding of, the interplay between the mechanisms of memory and memorability.

The purpose of this pilot study at MediaEval’2021 [8] was to collect enough EEG data for proof-of-concept and demonstration purposes, showcasing what could be done in subsequent work on predicting media memorability. The study involved the collection, filtering, and interpretation of neurophysiological data, and the use and evaluation of machine learning methods to enable the assessment of EEG data as a predictor of video memorability. The study has culminated in a demonstration of the utility of EEG in the context of video memorability, along with the public release of processed EEG features for others to explore (the dataset and examples of use, as well as the code to replicate the results in this paper, are available at https://osf.io/zt6n9/). This study has the potential to not only broaden the research horizons of computing researchers, allowing them to explore and leverage EEG features without any of the requisite domain knowledge, but also to increase interdisciplinary interest in the subject of memorability more broadly.

Applying EEG to the question of whether an experience will be subsequently remembered or forgotten is a well-researched area [7, 9, 12, 15]. Memorability, however, has been shown to be distinct from subsequent memory effects [1, 14], and has received little interdisciplinary attention. Additionally, even though the application of machine learning to EEG is an active area of interest—allowing for the automation or augmentation of neurological diagnostics [3–5, 10], and the classification of emotional states [16], mental tasks [11], and sleep stages [2]—the use of EEG to predict visual memorability has yet to be firmly established, and was previously limited to static content [6]. To the best of our knowledge, this paper outlines the first application of EEG to video memorability.

===2 Experiment Design and Structure===
The stimuli used in the study are a subset of the subtask 1 data (i.e., the short-term video memorability prediction task) in MediaEval’2021 [8], and consist of 450 videos: 96 were designated as targets, selected to reflect the bottom and top 50 memorable videos from the TRECVid dataset; 200 were selected to reflect the next top and bottom 100; and 100 were selected to reflect the middle 100 memorable videos (95 selected + 5 duplicates) from the set of subtask videos.
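Purely as an illustration of this banded selection, the following pandas sketch mirrors the split described above; the file name, column names, band edges, and random seeds are our own placeholder assumptions, not details of the official task split.

<syntaxhighlight lang="python">
import pandas as pd

# Hypothetical sketch of the banded stimulus selection described above.
# "subtask1_scores.csv", the column names, and the band edges are
# illustrative assumptions; the official split was made by the organisers.
videos = pd.read_csv("subtask1_scores.csv").sort_values("mem_score", ascending=False)

extremes = pd.concat([videos.head(50), videos.tail(50)])   # top and bottom 50
targets = extremes.sample(n=96, random_state=0)            # 96 target videos

next_bands = pd.concat([videos.iloc[50:150], videos.iloc[-150:-50]])
fillers = next_bands.sample(n=200, random_state=0)         # 200 filler videos

mid = len(videos) // 2                                     # middle band
middle_95 = videos.iloc[mid - 50 : mid + 50].sample(n=95, random_state=0)
middle = pd.concat([middle_95, middle_95.sample(n=5, random_state=1)])  # 95 + 5 duplicates
</syntaxhighlight>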
EEG data were collected from 11 subjects while they completed a short-term memory experiment, which was used to annotate the videos for memorability. EEG data acquisition was carried out in two separate locations using a shared experimental procedure, and each location annotated the same set of videos. Data collection for participants 1–5 was carried out at Dublin City University (DCU) with approval from the university’s Research Ethics Committee (DCUREC/2021/171), and for participants 6–11 at the University of Essex (UoE) with approval from the Ethics Committee (ETH2122-0001). Data at DCU were collected using a 32-channel ANT Neuro eego system with a sampling rate of 1000 Hz; data at UoE were collected using a 64-channel BioSemi ActiveTwo system at a sampling rate of 2048 Hz. Rather than being split into separate encoding and recognition phases, the experiment was continuous in nature.

Before the experiment was carried out, participants were given a verbal description of the experimental procedure, presented with a set of written instructions, and taken through a practice run of 3 videos to familiarise them with the experiment. The experiment used a total of 450 videos, 192 of which were the target videos (96 targets, shown twice), and the remaining 258 videos were the fillers. The experiment was broken into 9 blocks of 50 videos, where a fixation cross was displayed for 3–4.5 s, followed by the video presentation for its ~6 second duration, followed by a “get ready to answer” prompt of 1–3 seconds, followed by a 3 s period for the recognition response (repeated video or not). The time per block was approximately 700 seconds (~12 minutes), without accounting for the 30-second closed/open-eye baselines and breaks that occurred between blocks.

In order to account for recency effects, the first 50 videos presented did not include targets, but had 5 filler repeats. The presentation positions of targets were pseudo-randomised between participants, with the distances between target and repeat videos roughly fitting a uniform distribution, and the position of each block, aside from block 1, being rotated by 1 for each participant.
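The per-block figure can be sanity-checked with some simple arithmetic, taking the mid-points of the jittered intervals as a simplification:

<syntaxhighlight lang="python">
# Rough arithmetic behind the ~700 s per-block figure; using the mid-points
# of the jittered intervals is our simplification.
fixation = (3.0 + 4.5) / 2   # s, fixation cross (jittered 3-4.5 s)
video = 6.0                  # s, approximate clip duration
prompt = (1.0 + 3.0) / 2     # s, "get ready to answer" (jittered 1-3 s)
response = 3.0               # s, recognition response window

per_trial = fixation + video + prompt + response   # 14.75 s
per_block = 50 * per_trial                         # 737.5 s, roughly 12 minutes
print(f"{per_trial} s per trial, {per_block} s per block")
</syntaxhighlight>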
===3 Analysis and Results===
EEG data from both locations were processed in the same way for the 30 channels that were common across the two setups: data were first referenced using a common average and band-pass filtered between 0.1–30 Hz using a symmetric linear-phase FIR filter. Independent Component Analysis (ICA) was used to remove artifacts, and trial rejection using subject-specific thresholds was applied.

To establish a baseline using features extracted from the time domain, the EEG was low-pass filtered with a cutoff frequency of 15 Hz and downsampled to 30 Hz. We applied baseline correction using the average of the 250-ms pre-stimulus interval, extracted the data corresponding to the first second of each repeated clip from each of the 30 channels, and concatenated them to form a feature vector. We term these the Event-Related Potential (ERP) features.
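The full pipeline and data are available at https://osf.io/zt6n9/; as a minimal sketch only, the preprocessing and ERP feature extraction described above could be approximated in MNE-Python as follows, where the file name, event handling, and ICA component choices are placeholder assumptions rather than the authors' exact settings.

<syntaxhighlight lang="python">
import mne

# A minimal MNE-Python sketch of the preprocessing and ERP features described
# above; file name, event handling, and ICA settings are placeholder
# assumptions, not the released pipeline (see https://osf.io/zt6n9/).
raw = mne.io.read_raw_fif("sub-01_raw.fif", preload=True)
events = mne.find_events(raw)               # assumes a stimulus trigger channel
raw.pick("eeg")                             # restrict to the shared EEG channels

raw.set_eeg_reference("average")            # common average reference
raw.filter(0.1, 30.0, fir_design="firwin")  # linear-phase FIR band-pass

ica = mne.preprocessing.ICA(n_components=20, random_state=0)
ica.fit(raw)
ica.exclude = [0]                           # artifact components (inspection not shown)
ica.apply(raw)

# ERP features: low-pass at 15 Hz, epoch around stimulus onset with baseline
# correction over the 250 ms pre-stimulus interval, keep the first second.
raw.filter(None, 15.0)
epochs = mne.Epochs(raw, events, tmin=-0.25, tmax=1.0,
                    baseline=(-0.25, 0.0), preload=True)
epochs.resample(30)                         # downsample after epoching so event samples stay valid
X_erp = epochs.get_data().reshape(len(epochs), -1)  # (n_trials, n_channels * n_samples)
</syntaxhighlight>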
A second set of features was extracted from the EEG, this time from the time-frequency domain, which we refer to as ERSP (Event-Related Spectral Perturbation) features. For this, we extracted 4-second-long epochs and computed a trial-by-trial time-frequency representation using Morlet wavelets for frequencies between 2–30 Hz. For this set of features, we used data from only 4 channels, namely Fz, Cz, Pz, and O1.

Since there were very few forgotten clips, in this task we differentiate between the first and the second viewing of clips that were successfully remembered, based only on EEG data. To establish a baseline, we standardised the data to have zero mean and unit standard deviation, and used scikit-learn’s Bayesian Ridge regressor with default parameters. Results were obtained through 20-fold cross-validation with a 20% train-test split, separately for the ERP and ERSP features. The individual classification results for each participant are shown in Table 1, measured using the Area Under the Receiver Operating Characteristic Curve [13].
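Similarly hedged, a sketch of the ERSP features and the baseline classifier is given below; here `epochs_4s` stands for hypothetical 4-second epochs cut from the 0.1–30 Hz band-passed data, `y` for a binary first/second-viewing label, and reading "20-fold cross-validation with a 20% train-test split" as 20 random 80/20 splits is our interpretation.

<syntaxhighlight lang="python">
import numpy as np
from mne.time_frequency import tfr_morlet
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ShuffleSplit
from sklearn.preprocessing import StandardScaler

# ERSP features: trial-wise Morlet power, 2-30 Hz, on four channels.
# `epochs_4s` is assumed to hold 4 s epochs from the 0.1-30 Hz data.
freqs = np.arange(2.0, 31.0)
power = tfr_morlet(epochs_4s.copy().pick(["Fz", "Cz", "Pz", "O1"]),
                   freqs=freqs, n_cycles=freqs / 2.0,
                   return_itc=False, average=False)
X_ersp = power.data.reshape(power.data.shape[0], -1)

# Baseline: standardise, Bayesian Ridge with default parameters, AUC over
# 20 random 80/20 splits; `y` marks first (0) vs. second (1) viewing.
def mean_auc(X, y, n_splits=20):
    aucs = []
    for train, test in ShuffleSplit(n_splits=n_splits, test_size=0.2,
                                    random_state=0).split(X):
        scaler = StandardScaler().fit(X[train])
        model = BayesianRidge().fit(scaler.transform(X[train]), y[train])
        aucs.append(roc_auc_score(y[test], model.predict(scaler.transform(X[test]))))
    return np.mean(aucs), np.std(aucs)
</syntaxhighlight>

The regressor's continuous output is used directly as a ranking score for the AUC, which is why no classification threshold is applied.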
Table 1: Mean AUC values obtained for each participant across all folds, separately for ERP and ERSP features.

{| class="wikitable"
! Participant !! ERP-based classification !! ERSP-based classification
|-
| 1 || 0.564 ± 0.09 || 0.522 ± 0.09
|-
| 2 || 0.585 ± 0.11 || 0.558 ± 0.07
|-
| 3 || 0.520 ± 0.07 || 0.532 ± 0.07
|-
| 4 || 0.666 ± 0.07 || 0.626 ± 0.09
|-
| 5 || 0.714 ± 0.06 || 0.649 ± 0.08
|-
| 6 || 0.555 ± 0.11 || 0.522 ± 0.10
|-
| 7 || 0.601 ± 0.10 || 0.525 ± 0.08
|-
| 8 || 0.590 ± 0.08 || 0.674 ± 0.08
|-
| 9 || 0.609 ± 0.09 || 0.489 ± 0.06
|-
| 10 || 0.628 ± 0.06 || 0.618 ± 0.09
|-
| 11 || 0.477 ± 0.08 || 0.611 ± 0.12
|-
| Mean || 0.591 ± 0.06 || 0.575 ± 0.06
|}

Figure 1: Grand-averaged butterfly plot showing differences in EEG activity for the second minus the first presentation of videos for the first second (top, A); averaged time-frequency differences in power for the second minus the first presentation of videos for the first 3 seconds, for channels Fz and Pz (bottom, B, left and right respectively).

===4 Discussion and Outlook===
This was an exploratory pilot task to guide the development of a future experimental protocol for capturing EEG signatures relating to successful memory encoding and retrieval, to be used in predicting video memorability. While our experimental protocol resulted in too little data to examine differences between successful and unsuccessful encoding, we show that EEG-related differences exist between the encoding and recognition phases of previously seen videos. These results indicate that EEG signatures relating to memory processes for video are present, and thus suitable to be collected with a revised experimental protocol and more participants to support a future fully-fledged task for predicting video memorability. The preprocessed EEG data captured have been released to the research community.

===Acknowledgments===
This work was part-funded by NIST Award No. 60NANB19D155 and by Science Foundation Ireland under grant number SFI/12/RC/2289_P2.

===References===
[1] Wilma A. Bainbridge, Daniel D. Dilks, and Aude Oliva. 2017. Memorability: A stimulus-driven perceptual neural signature distinctive from memory. NeuroImage 149 (2017), 141–152.

[2] Farideh Ebrahimi, Mohammad Mikaeili, Edson Estrada, and Homer Nazeran. 2008. Automatic sleep stage classification based on EEG signals by using neural networks and wavelet packet coefficients. In 2008 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE, 1151–1154.

[3] Denis A. Engemann, Federico Raimondo, Jean-Rémi King, Benjamin Rohaut, Gilles Louppe, Frédéric Faugeras, Jitka Annen, Helena Cassol, Olivia Gosseries, Diego Fernandez-Slezak, et al. 2018. Robust EEG-based cross-site and cross-protocol classification of states of consciousness. Brain 141, 11 (2018), 3179–3192.

[4] Behshad Hosseinifard, Mohammad Hassan Moradi, and Reza Rostami. 2013. Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal. Computer Methods and Programs in Biomedicine 109, 3 (2013), 339–345.

[5] Cosimo Ieracitano, Nadia Mammone, Amir Hussain, and Francesco C. Morabito. 2020. A novel multi-modal machine learning based approach for automatic classification of EEG recordings in dementia. Neural Networks 123 (2020), 176–190.

[6] Sang-Yeong Jo and Jin-Woo Jeong. 2020. Prediction of visual memorability with EEG signals: A comparative study. Sensors 20, 9 (2020), 2694.

[7] Demetrios Karis, Monica Fabiani, and Emanuel Donchin. 1984. “P300” and memory: Individual differences in the von Restorff effect. Cognitive Psychology 16, 2 (1984), 177–216.

[8] Rukiye Savran Kiziltepe, Mihai Gabriel Constantin, Claire-Hélène Demarty, Graham Healy, Camilo Fosco, Alba García Seco de Herrera, Sebastian Halder, Bogdan Ionescu, Ana Matran-Fernandez, Alan F. Smeaton, and Lorin Sweeney. 2021. Overview of the MediaEval 2021 Predicting Media Memorability task. In Working Notes Proceedings of the MediaEval 2021 Workshop.

[9] Wolfgang Klimesch. 1999. EEG alpha and theta oscillations reflect cognitive and memory performance: a review and analysis. Brain Research Reviews 29, 2–3 (1999), 169–195.

[10] Christoph Lehmann, Thomas Koenig, Vesna Jelic, Leslie Prichep, Roy E. John, Lars-Olof Wahlund, Yadolah Dodge, and Thomas Dierks. 2007. Application and comparison of classification algorithms for recognition of Alzheimer’s disease in electrical brain activity (EEG). Journal of Neuroscience Methods 161, 2 (2007), 342–350.

[11] Nan-Ying Liang, Paramasivan Saratchandran, Guang-Bin Huang, and Narasimhan Sundararajan. 2006. Classification of mental tasks from EEG signals using extreme learning machine. International Journal of Neural Systems 16, 01 (2006), 29–38.

[12] Eunho Noh, Grit Herzmann, Tim Curran, and Virginia R. de Sa. 2014. Using single-trial EEG to predict and analyze subsequent memory. NeuroImage 84 (2014), 712–723.

[13] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12 (2011), 2825–2830.

[14] Michael D. Rugg and Tim Curran. 2007. Event-related potentials and recognition memory. Trends in Cognitive Sciences 11, 6 (2007), 251–257.

[15] Thomas F. Sanquist, John W. Rohrbaugh, Karl Syndulko, and Donald B. Lindsley. 1980. Electrocortical signs of levels of processing: Perceptual analysis and recognition memory. Psychophysiology 17, 6 (1980), 568–576.

[16] Xiao-Wei Wang, Dan Nie, and Bao-Liang Lu. 2014. Emotional state classification from EEG data using machine learning approach. Neurocomputing 129 (2014), 94–106.