<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Prediction of visual memorability with EEG signals: A comparative
study. Sensors</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>dataset: A P300</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rachelle Hamelink</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CEUR Workshop Proceedings</institution>
          ,
          <addr-line>CEUR-WS.org</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Radboud University</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>20</volume>
      <issue>9</issue>
      <fpage>11</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>In modern media, and social media in particular, most advertisements now consist of very short videos lasting only a few seconds. This has increased interest in the recognizability and memorability of visual media. The present study analyzed event-related potentials (ERPs) across 1000 trials of the Memento10k dataset to explore neurophysiological differences between videos that are later remembered and those that are not when shown again to the same subject. The posterior brain region was analyzed across three channels and the right temporal cortex in one channel. In the visual cortex, a significant difference in amplitude (p &lt; .05) between remembered and not remembered videos was found in the 340-408 ms window after onset, as well as around the 476 ms timepoint, across all trials, channels and participants. In the right temporal cortex, a significant difference in amplitude (p &lt; .05) was observed in the 306-816 ms window. These results suggest, in line with previous literature, that a stronger P300 component can be found for Remembered videos than for Not Remembered videos in the right temporal lobe. In the visual cortex, the opposite effect was found, as a higher positivity was observed for Not Remembered videos.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>The posterior brain regions correspond to the visual cortex and lie in close proximity to the hippocampus; these structures are
important for the processing of visual input and for initial consolidation into memory systems, respectively. The right temporal lobe of the
brain is involved in the processes of learning and remembering non-verbal information, such as music and videos.</p>
      <p>Polich [11] characterizes the P300 component by its amplitude and latency. The amplitude is defined as the difference
between the mean baseline voltage and the largest positive peak of the ERP waveform within a time window, in this study
300-1000 ms after stimulus onset, based on [5]. Latency is defined as the time in ms from stimulus onset to the point of
maximum positive amplitude within the same time window. Analyzing ERP amplitudes is an attractive method for studying
declarative memory, as it has shown that differences at memory formation can already be observed that later predict
memorability [5]. Classic declarative memory studies, by contrast, have to rely on a consolidation
period and possible sleep factors in order to measure successful memorization of the stimulus input.</p>
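      <p>As an illustration of the amplitude and latency definitions above, the following minimal Python sketch computes both measures for a single averaged waveform. It is not the analysis code of this study: the function name, the use of pre-stimulus samples as the baseline, and the toy values are assumptions made for illustration only.</p>
      <preformatted><![CDATA[
```python
from statistics import mean


def p300_amplitude_latency(times_ms, voltages_uv, window=(300, 1000)):
    """Return (amplitude, latency_ms) of the largest positive peak in the window.

    times_ms: sample times relative to stimulus onset (negative = pre-stimulus).
    voltages_uv: ERP voltage at each sample, same length as times_ms.
    """
    # Baseline: mean voltage of the pre-stimulus samples (assumed to be present).
    baseline = mean(v for t, v in zip(times_ms, voltages_uv) if t < 0)
    # Keep only the samples that fall inside the analysis window.
    in_window = [(v, t) for t, v in zip(times_ms, voltages_uv)
                 if window[0] <= t <= window[1]]
    # Largest positive peak: the sample with the highest voltage in the window.
    peak_voltage, peak_time = max(in_window)
    return peak_voltage - baseline, peak_time


# Toy waveform sampled every 100 ms; the values are invented for illustration.
times = [-200, -100, 0, 100, 200, 300, 400, 500, 600]
volts = [0.5, 1.5, 1.0, 2.0, 3.0, 4.0, 6.0, 5.0, 2.0]
amplitude, latency = p300_amplitude_latency(times, volts)
print(amplitude, latency)
```
]]></preformatted>
      <p>For the toy waveform, the baseline is 1.0 and the largest value in the window is 6.0 at 400 ms, so the sketch reports an amplitude of 5.0 and a latency of 400 ms.</p>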
      <p>This study aims to investigate whether this P300 effect can be observed for videos of the Memento10k dataset. Specifically,
it examines whether a significant P300 difference can be found between Remembered and Not Remembered
videos in the encoding phase of the old/new paradigm used in this experiment, and whether this effect
can predict successful video memorability. ERPs rely on the onset of a
stimulus to observe a difference in brain function. EEG recordings were collected for the first second of the videos [2].
Therefore, as it can be argued that a one-second video does not differ substantially from a still image, the main hypothesis of this
paper is that a significant P300 component can be observed within the 300-1000 ms time window in the data.</p>
    </sec>
    <sec id="sec-3">
      <title>APPROACH</title>
      <sec id="sec-3-1">
        <title>Participants, materials and procedure</title>
        <p>The participants, materials and procedure of the Predicting Video Memorability task data collection are described in the
overview paper [2]. For this analysis, the posterior brain region, or visual cortex, was analyzed by means of the EEG cap
channels Oz, O1 and O2. In addition, the right temporal lobe was analyzed by means of the EEG cap channel P8. Both
regions were chosen based on previous MEG results by Osipova et al. [5]. The posterior brain regions correspond to
the visual cortex, which is mostly activated when processing multimedia input and lies in close proximity to the hippocampus,
a key area for memorization. The right temporal lobe is considered to be involved in learning and remembering
nonverbal stimuli, such as music and videos. The two areas were analyzed separately to observe a difference between Remembered
and Not Remembered videos. Participants had to indicate on second viewing whether they remembered the presented video
or not. Only trials that included a Memorability Score for that video were included in the final analysis.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Data cleaning and statistical analysis</title>
        <p>Preprocessing steps were undertaken to create the final dataset [2]. In addition to the preprocessing steps described in the
overview paper of the task, outlier amplitudes across all participants, the four included channels and all trials were removed,
defined as values more than 1.5 standard deviations below the first quartile or above the third quartile of the amplitude data. A linear
mixed effects model with Memorability Score, Channel and Timepoint as fixed effects and Subject as random effect
was used to analyze the data from the visual cortex. An additional linear mixed effects model with Memorability Score and
Timepoint as fixed effects and Subject as random effect was used to analyze the data from the right temporal cortex. All
data processing, plotting and analysis were performed using RStudio [12].</p>
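        <p>The outlier criterion described above can be sketched as follows. This is a minimal Python illustration rather than the R code used for the analysis; the function name remove_outliers, the quartile method and the toy amplitudes are assumptions. The sketch follows the text in using the standard deviation as the spread measure, whereas the widely used Tukey rule would use the interquartile range instead.</p>
        <preformatted><![CDATA[
```python
from statistics import quantiles, stdev


def remove_outliers(amplitudes, k=1.5):
    """Drop values more than k spreads below Q1 or above Q3."""
    # Quartiles of the amplitude distribution (inclusive method is an assumption).
    q1, _, q3 = quantiles(amplitudes, n=4, method="inclusive")
    # Spread measure: the sample standard deviation, as the text describes.
    # Replacing this with (q3 - q1) would give the conventional Tukey fences.
    spread = stdev(amplitudes)
    lower, upper = q1 - k * spread, q3 + k * spread
    return [a for a in amplitudes if lower <= a <= upper]


# Toy amplitudes with one extreme value; invented for illustration.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
cleaned = remove_outliers(sample)
print(cleaned)
```
]]></preformatted>
        <p>In the toy data, only the extreme value 100.0 falls outside the cutoffs and is removed; the remaining eight amplitudes are kept.</p>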
      </sec>
    </sec>
    <sec id="sec-4">
      <title>RESULTS AND ANALYSIS</title>
      <p>A linear mixed effects model with fixed effects Timepoint, Channel and Memorability Score and a random effect of Subject
was fitted to the ERP amplitude data from the visual cortex. Table 1 presents the significant Timepoint and Memorability Score
interactions, indicating that the time window of ~340 ms to ~408 ms and the Timepoint at ~476 ms
showed a significant difference in ERPs between Remembered and Not Remembered videos across all trials and participants.
These differences in ERP amplitudes can be observed in Figure 1, with the largest differences at Timepoint 10
(~374 ms) and Timepoint 13 (~476 ms). All other Timepoints showed no significant difference between Remembered
and Not Remembered videos (all p values &gt; 0.05).</p>
      <p>A linear mixed effects model with fixed effects Timepoint and Memorability Score and a random effect of Subject
was fitted to the ERP amplitude data from the right temporal cortex. Table 2 presents the significant Timepoint and Memorability Score interactions,
indicating that the time window of ~306 ms to ~816 ms showed a significant difference in ERPs
between Remembered and Not Remembered videos across all trials and participants. These differences in ERP amplitudes can
be observed in Figure 2, with the largest difference at Timepoint 13 (~476 ms). All other Timepoints
showed no significant difference between Remembered and Not Remembered videos (all p values &gt; 0.05).
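      </p>
      <table-wrap id="tab-2">
        <label>Table 2</label>
        <caption>
          <p>Significant Memorability Score × Timepoint interactions in the right temporal cortex.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Interaction term</th><th>Approximate latency</th></tr>
          </thead>
          <tbody>
            <tr><td>Memorability Score × Timepoint 8</td><td>~306 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 9</td><td>~340 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 10</td><td>~374 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 11</td><td>~408 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 12</td><td>~442 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 13</td><td>~476 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 14</td><td>~510 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 15</td><td>~544 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 16</td><td>~578 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 17</td><td>~612 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 18</td><td>~646 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 19</td><td>~680 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 20</td><td>~714 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 21</td><td>~748 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 22</td><td>~782 ms</td></tr>
            <tr><td>Memorability Score × Timepoint 23</td><td>~816 ms</td></tr>
          </tbody>
        </table>
      </table-wrap>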
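      <p>The latencies listed with the Timepoint indices are consistent with a uniform spacing of roughly 34 ms per Timepoint, with Timepoint 8 at ~306 ms. The following Python sketch makes that mapping explicit. The 34 ms step and the one-index offset are inferred from the listed values and are assumptions, not parameters reported for the dataset; the function name is hypothetical.</p>
      <preformatted><![CDATA[
```python
SAMPLE_STEP_MS = 34  # assumed spacing, inferred from the listed latencies


def timepoint_to_ms(timepoint):
    """Approximate latency in ms after stimulus onset for a Timepoint index."""
    # Timepoint 8 is listed at ~306 ms = 34 * 9, so the index is offset by one.
    return SAMPLE_STEP_MS * (timepoint + 1)


# Timepoints 8 to 23 are the ones listed as significant in the right temporal cortex.
significant = {tp: timepoint_to_ms(tp) for tp in range(8, 24)}
print(significant[8], significant[13], significant[23])
```
]]></preformatted>
      <p>Under these assumptions, Timepoints 8, 13 and 23 map to 306 ms, 476 ms and 816 ms, matching the listed values.</p>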
    </sec>
    <sec id="sec-5">
      <title>CONCLUSIONS</title>
      <p>The aim of this study was to investigate whether a difference in ERP amplitude could be observed within the first second
after onset of one thousand three-second videos from the Memento10k dataset, between videos that participants would later indicate
as Remembered versus Not Remembered. This was done as part of the MediaEval 2022 Predicting Video Memorability task
[2]. A significant difference in amplitude was found at four Timepoints (340 ms, 374 ms, 408 ms and 476 ms) in the posterior
brain region. This difference was in the opposite direction to the one hypothesized: the ERP amplitudes were less
positive for Remembered videos than for Not Remembered videos, contrary to the hypothesis based on [5], which showed a
greater positivity for Remembered videos than for Not Remembered videos. In addition, a significant difference in amplitude
was observed in the right temporal lobe in the 306-816 ms time window. This result is in line with earlier studies
[5][7][8][9] that have observed a greater positive difference in ERP amplitude from around the 300 ms mark
until one second after stimulus onset. The results of this project add evidence suggesting that successful
later remembrance of a video can be predicted by looking for a P300 component after the onset of the video in the right
temporal lobe. In the visual cortex, a positive peak was observed for the Not Remembered videos. Future studies could explore
the interaction between this positivity and negativity around the 300 ms time window after onset of a video.</p>
      <p>This study adds to the body of literature by providing evidence for the importance of a P300 component
for memorability beyond the scope of image research. In addition, the paradigm in which the videos were presented to the
participants can be argued to mimic how a person would normally perceive visual input via, e.g., social media. Most
media platforms nowadays consist of an endless stream of very short videos, and only certain videos grab and hold the attention of
the viewer beyond the point of retrieval. This is in line with the increased interest, described in the introduction,
in predicting video memorability in modern media.</p>
      <p>However, further studies are needed to investigate how video findings relate to the image literature, which is beyond the
scope of this paper. As only neurophysiological data from the first second of the videos was recorded, a limitation
of this study is that it cannot draw bridging conclusions between the literature on image memorability and video memorability. Even
though the dataset consists of moving pictures, it can be argued that one second of viewing is not extensive enough to compare
video and image media. Full recordings of the entire video could provide more insight into where the spark of activation
occurs over time and potentially over the biological structure of the brain. Future studies would have to indicate whether other
brain regions, such as those described in [7][8][9] (e.g., anterior prefrontal cortex, parietal cortex and the medial-frontal areas), are
important during a video viewing task, thus extending this body of knowledge beyond the scope of image research. It
could be interesting to combine the behavioral data and EEG data in this study with diffusion magnetic resonance imaging
(dMRI), to deepen the understanding of the interplay between those brain regions that might be a factor in the later
remembrance of a short video.</p>
      <p>In addition, future research is necessary to investigate the features that play a role in memorability. This study only
included the first second of each video in the final analysis, on the assumption that, since the videos were only three seconds long,
the main subject of the video would already present itself in the first second. Feature extraction
could, however, test this assumption in more depth and investigate whether there are common subject features in
the Memento10k dataset that promote memorability. This has the potential to answer the question raised in the
introduction of this paper: which features, if any, of a short video can predict memorability. If a common
benchmark of easily remembered video features can be established, this would have a wide range of practical applications for
advertisers, influencers and other stakeholders with an interest in getting short videos remembered.</p>
      <p>To conclude, the present paper provides evidence for P300 effect differences in the posterior brain regions and
right temporal lobe that relate to whether videos from the Memento10k dataset [3] are remembered at retrieval. More
research is needed to understand how these data relate to past image studies, and feature extraction could potentially
identify common features that correlate with an increase in memorability.</p>
    </sec>
    <sec id="sec-6">
      <title>ACKNOWLEDGMENTS</title>
      <p>I want to thank Dr. Martha Larson for her guidance and critical eye in this project, as well as for allowing me to explore a topic
that suited my personal interests. I would also like to thank Dr. Alba García Seco de Herrera and the entire research team that
collected this ERP and covariate data. Lastly, I would like to specifically thank Yana van de Sande, Jasper de Meijer and
Floris Cos for their help with the data cleaning and analysis steps, as well as their ongoing support during the task.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>