<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Allowing Exploratory Search from Podcasts: the Case of Secklow Sounds Radio</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ilaria Tiddi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emanuele Bastianelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martino Mensio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Motta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Knowledge Media Institute, The Open University</institution>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present here the Secklow Sounds Radio App that was developed as one of the demonstrators in the context of the MK:Smart project. Secklow Sounds is a community radio based in the city of Milton Keynes (MK), providing digital recordings of their broadcasts online, where local issues are often discussed. We developed a mobile-friendly web app that, besides offering live streaming of the radio, allows users to perform an entity-based search of past broadcasts enhanced with information (e.g. areas, museums, topics, etc.) provided by the city's centralised repository. The paper presents first the data processing workflow, which integrates a number of existing solutions, such as the Google Speech-To-Text API, Neural Networks, DBpedia Spotlight, and the DiscOU search engine, and finally shows how results were integrated into a web and mobile application providing an exploratory search service for radio podcasts.</p>
      </abstract>
      <kwd-group>
        <kwd>Audio processing</kwd>
        <kwd>Exploratory search</kwd>
        <kwd>Data integration</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The paper presents the Secklow Sounds Radio App and how this was developed
as a demonstrator for the MK:Data Hub1, the data integration and sharing
platform of the MK:Smart project2. MK:Smart is a large collaborative initiative
started in 2014 with the goal of developing innovative technology solutions to
support the economic growth of the city of Milton Keynes (MK). The project
promotes the idea that the creation of a common infrastructure to efficiently
manage, integrate, and re-deliver information from local data sources (energy/water
consumption, transport, satellites, social/economic sources, social media etc.)
facilitates the deployment of data-intensive applications, enabling intelligent data
processing mechanisms for citizens and service providers [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>To demonstrate the exploitability of the MK:Data Hub, the Secklow Sounds
Radio App was built as a use-case in collaboration with Secklow Sounds
Radio3. As an MK-based community radio, Secklow Sounds provides digital
recordings of their broadcasts, where local issues are also discussed. Besides offering an</p>
    </sec>
    <sec id="sec-2">
      <p>1 http://www.datahub.mksmart.org 2 http://www.mksmart.org 3 http://www.secklow1055.org/</p>
      <p>app with live streaming of their episodes, the radio aims to provide its audience
with advanced browsing facilities for their published data.</p>
      <p>As a result, the Secklow Sounds Radio App was developed to allow users to
perform an entity-based search and discovery of the broadcasts, whose contents
(local areas, museums, topics, etc.) were semantically enhanced with information
centralised in the MK:Data Hub. In order to achieve this, we built a data
processing workflow to transform audio podcasts into explorable data, through the
integration of a number of off-the-shelf solutions addressing the different tasks
to achieve, namely speech understanding, named entity recognition, semantic
indexing and data augmentation. The final output was then integrated into a
mobile-friendly application publicly available to end-users and radio listeners.
Here we present first the developed pipeline, with a description of its
components in detail, and then show how the processed data were deployed in the
Radio App.</p>
      <sec id="sec-2-1">
        <title>Extracting Data from Podcasts</title>
        <p>The Secklow Sounds Radio App was implemented using the process depicted in
Figure 1, taking as input the .mp3 of the radio broadcasts, and returning its text
annotated with entities from DBpedia and the MK:Data Hub. The workflow is
divided into: (i) Audio Processing, concerning the set of tasks to obtain texts from
the audio files; (ii) Text Annotation, i.e. the tasks for augmenting the episode
texts with external data; and (iii) Data Aggregation, where results are aggregated
and wrapped into the Radio App.</p>
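The three stages above can be read as a simple function composition over an episode file. A minimal sketch of this chaining (the stage functions below are hypothetical stubs standing in for the actual components, not the project's code):

```python
def audio_processing(mp3_path):
    # stub for stage (i): chunk on silences, filter music vs. speech, transcribe
    return "transcript of " + mp3_path

def text_annotation(transcript):
    # stub for stage (ii): named entity recognition, indexing and enrichment
    return {"text": transcript, "entities": ["Wolverton"]}

def data_aggregation(annotated):
    # stub for stage (iii): wrap the annotated episode for the Radio App
    return {"episode": annotated}

def process_episode(mp3_path):
    """Chain the workflow stages: (i) Audio Processing,
    (ii) Text Annotation, (iii) Data Aggregation."""
    return data_aggregation(text_annotation(audio_processing(mp3_path)))
```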
        <p>Fig. 1. The data processing workflow: Audio Processing (chunking with Pydub, filtering with an RCNN, speech-to-text with the Google API), Text Annotation (NER with DBpedia Spotlight, indexing with DiscOU, enriching with ECAPI) and Data Aggregation (episode browsing, episode exploration and entity exploration in the Radio App).</p>
        <p>
          Audio Processing. Audio chunking, filtering and speech-to-text are performed
to obtain text from an initial .mp3 file. As radio episodes consist of ca. 60-180 mins
of both music and talks, we chunk audios based on silences using the Python
Pydub4 library, in order to reduce them in length, and to be able to distinguish the
spoken chunks (to transcribe into texts) from the music ones. Filtering is then
performed using a Recurrent-Convolutional Neural Network (RCNN) inspired by
[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The original RCNN was designed for Music Genre Classification (50 classes),
but was adapted here to produce a binary classification of the audio chunks
        </p>
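The chunking itself relies on Pydub's silence utilities; the underlying idea, cutting a stream of amplitude values wherever a sufficiently long near-silent run occurs, can be sketched in plain Python as follows (function name and thresholds are illustrative, not the ones used in the project):

```python
def chunk_on_silence(samples, silence_thresh=0.01, min_silence_len=3):
    """Cut a sequence of amplitude values into chunks wherever a run of
    at least `min_silence_len` near-silent samples occurs; the silent
    samples themselves are dropped."""
    chunks, current, silent_run = [], [], 0
    for s in samples:
        if abs(s) < silence_thresh:
            silent_run += 1              # extend the current silent run
        else:
            if silent_run >= min_silence_len and current:
                chunks.append(current)   # long enough silence: close the chunk
                current = []
            silent_run = 0
            current.append(s)
    if current:
        chunks.append(current)
    return chunks

# two bursts of "speech" separated by four near-silent samples
print(chunk_on_silence([0.5, 0.6, 0.0, 0.0, 0.0, 0.0, 0.7, 0.8]))
# -> [[0.5, 0.6], [0.7, 0.8]]
```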
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4 https://github.com/jiaaro/pydub</title>
      <p>(speech or music), through tuning and training the model over a dataset of 1910
audio files tagged by three annotators. Finally, we use the Google Speech-to-Text
Cloud API5 to obtain the transcriptions. The API provides a number of facilities,
including pre-trained models for several languages, and the possibility of feeding
the model with a context-specific vocabulary to improve the recognition. As
Secklow Sounds is a UK-based radio, we used the en-GB language model, and fed
the model with MK-specific entities, e.g. its wards (Wolverton, Bletchley), local
libraries and museums (Bletchley Park, Woburn Sands library) and presenters'
names.</p>
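As an illustration, a recognition request to the Cloud Speech-to-Text v1 REST API combining the en-GB language model with MK-specific phrase hints could be assembled as follows (the audio URI and phrase list are placeholders, not the project's actual configuration):

```python
def build_recognition_request(audio_uri, phrases):
    """Assemble a request body for the Google Cloud Speech-to-Text v1
    REST API, passing domain phrases as a speech context hint."""
    return {
        "config": {
            "languageCode": "en-GB",                   # UK English model
            "speechContexts": [{"phrases": phrases}],  # context-specific vocabulary
        },
        "audio": {"uri": audio_uri},
    }

request = build_recognition_request(
    "gs://example-bucket/episode-chunk.flac",          # placeholder URI
    ["Wolverton", "Bletchley", "Bletchley Park", "Woburn Sands"],
)
```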
      <p>
        Text Annotation. Text annotation is performed over the transcriptions of the
podcasts, and includes the tasks of named entity recognition, semantic indexing,
and data enrichment. DBpedia Spotlight6 is used first to obtain the list of named
entities from the episodes' transcriptions. The resulting semantic descriptions
are then indexed using the DiscOU Semantic Indexer7 following the idea of [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Based on the Lucene search engine, the Indexer uses the semantic annotations in
order to allow resource search by semantic, rather than textual, similarity. Once
the index is built, resources are annotated with the relevant DBpedia entities
and their occurrence score. If a mapping between DBpedia and the MK:Data
Hub exists (see, for instance, the ward of Wolverton8), an additional annotation
is also provided. This allows entities to be explored through the MK:Data Hub
Entity Centric API (ECAPI [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]), which aggregates relevant data (wards, estates,
buildings, bus stops etc.) from multiple data sources.
      </p>
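The indexer itself is built on Lucene, but the principle of matching resources by their entity annotations rather than by raw text can be illustrated with a cosine similarity over entity-score vectors (the entity URIs and scores below are made up for the example):

```python
import math

def entity_similarity(a, b):
    """Cosine similarity between two resources, each described as a
    dict mapping entity URIs to occurrence scores."""
    shared = set(a) & set(b)
    dot = sum(a[e] * b[e] for e in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# an episode annotated with weighted DBpedia entities, and a one-entity query
episode = {"dbpedia:Wolverton": 3.0, "dbpedia:Food": 1.0}
query = {"dbpedia:Wolverton": 1.0}
```

Resources sharing no entities score 0.0 regardless of their wording, which is the point of semantic rather than textual similarity.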
      <p>Data Aggregation. The last step is the aggregation of the annotated audio
contents and the implementation of the Radio App. The application, available
online9, is developed using the Angular Mobile UI framework10, to allow usage
from laptops and mobiles independently of the operating system. Three
activities can be performed with the app: (i) browsing episodes related to a specific
entity; (ii) exploring the content of a selected episode; and (iii) exploring a
specific entity discussed during an episode. In Figure 2 for example, the user is
searching through the search box for episodes about Wolverton, promptly
returned by the indexer (Figure 2a)11. The selected episode, whose podcast can
be listened to, can be explored through its enriched content (Figure 2b), where the
annotated entities obtained during the enriching task are visualised as a word
cloud with different sizes and shades depending on their frequency; for example,
the episode Lifestyle MK of April 2016 discussed food-related topics. Entities
in pink are the ones that can be explored through ECAPI, e.g. in Figure 2c
the user is visualising crowdsourced pictures and socio-demographic information
(population age and projection, economic activities) about Wolverton.
Similarly, Figure 2d shows that users can also obtain practical information about
local activities (the Milton Keynes Museum), such as location, phone number,
opening times, etc.
5 https://cloud.google.com/speech-to-text/
6 https://www.dbpedia-spotlight.org/
7 https://github.com/the-open-university/discou-indexer
8 DBpedia entity: http://dbpedia.org/resource/Wolverton, MK:Data Hub entity
https://data.mksmart.org/entity/ward/wolverton
9 https://data.mksmart.org/apps/secklow-sounds-app/
10 http://mobileangularui.com/
11 Note that the entities proposed in the facet are only suggestions; users
can freely search for any entity (also ones not related to MK).
Fig. 2. (a) Browsing episodes facet. (b) Exploring a single episode. (c) Exploring entities (1). (d) Exploring entities (2).</p>
      <sec id="sec-3-1">
        <title>Demonstration</title>
        <p>During the demonstration we will present the Secklow Sounds Radio App with
two main goals, namely to show the benefits of a data integration facility (such
as the MK:Data Hub) to deploy lightweight applications, and how off-the-shelf
solutions can be integrated together into a simple but effective workflow to offer
exploratory search over radio broadcasts.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Choi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fazekas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sandler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.:</given-names>
          </string-name>
          <article-title>Convolutional recurrent neural networks for music classification</article-title>
          .
          <source>In: Acoustics, Speech and Signal Processing (ICASSP)</source>
          ,
          <year>2017</year>
          IEEE International Conference on. pp.
          <volume>2392</volume>
          -
          <fpage>2396</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Daga</surname>
          </string-name>
          , E.,
          <string-name>
            <surname>d'Aquin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adamou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motta</surname>
          </string-name>
          , E.:
          <article-title>Addressing exploitability of smart city data</article-title>
          .
          <source>In: Smart Cities Conference (ISC2)</source>
          , 2016 IEEE International. pp.
          <volume>1</volume>
          -
          <issue>6</issue>
          .
          <string-name>
            <surname>IEEE</surname>
          </string-name>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>d'Aquin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allocca</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , Collins,
          <string-name>
            <surname>T.</surname>
          </string-name>
          :
          <article-title>DiscOU: A flexible discovery engine for open educational resources using semantic indexing and relationship summaries</article-title>
          .
          <source>In: Proceedings of the 2012 International Conference on Posters &amp; Demonstrations Track - Volume 914</source>
          . pp.
          <volume>13</volume>
          -
          <fpage>16</fpage>
          .
          <string-name>
            <surname>Citeseer</surname>
          </string-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>d'Aquin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davies</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motta</surname>
          </string-name>
          , E.:
          <article-title>Smart cities' data: Challenges and opportunities for semantic technologies</article-title>
          .
          <source>IEEE Internet Computing</source>
          <volume>19</volume>
          (
          <issue>6</issue>
          ),
          <volume>66</volume>
          -
          <fpage>70</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>