<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A linked research network that is Transforming Musicology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Terhi Nurmikko-Fuller</string-name>
          <email>terhi.nurmikko-fuller@oerc.ox.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kevin R. Page</string-name>
          <email>kevin.page@oerc.ox.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Oxford e-Research Centre, University of Oxford</institution>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <fpage>73</fpage>
      <lpage>78</lpage>
      <abstract>
        <p>Semantic Web technologies offer solutions for bridging discrete and even disparate datasets. Linked Data has been used in several Digital Humanities projects, but through the alignment of instance-level entities rather than the capture of workflows, which have yet to become part of the publication paradigm for reporting on completed research. In this paper, we assess the functional requirements of digital Musicology research questions, and propose ways of using the inherent semantics of workflow descriptions alongside instance data to link them. We report on the design of a linked research network for Musicology.</p>
      </abstract>
      <kwd-group>
        <kwd>Musicology</kwd>
        <kwd>linked research network</kwd>
        <kwd>semantic requirements</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Collaborative scholarship brings together academics, diverse datasets, and
different research foci. An example of this is Transforming Musicology,<sup>1</sup> an
exploration into the ways digital technologies can influence the future development
of scholarship on music, whether it is represented as sound, score, or symbol.
This interdisciplinary endeavour bridges projects from 14 Universities, all with
idiosyncratic methodologies, workflows, research agendas, and data. We report
on the iterative process of assessing the needs and requirements of an underlying
linked research network, which uses Semantic Web technologies to connect these
projects by drawing in elements from different sources, resulting in a
complementary combination of resources for the scholars involved, and beyond.</p>
      <p>
        This use of Semantic Web technologies to capture workflow is not without
precedent [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], but whilst the value of reproducible investigative processes has
been noted in Natural Sciences and Bioinformatics [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], it has yet to be adopted as
the norm in the publication of research in the Digital Humanities. Using workflow
metadata as the semantic glue within the linked research network helps bypass
the "knowledge burying" problem described by Mons [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], who critiques the
prevalent practice of publishing final analysed datasets only. The importance of
workflow capture for the purposes of reproducibility in the sciences has been
noted by Bechhofer et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and the benefits of doing so extend to the reuse
      </p>
      <sec id="sec-1-1">
        <title>1 http://transforming-musicology.org/</title>
        <p>of processes developed for one project in the context of another (e.g. to reduce
labour-intensive work).</p>
        <p>
          In Transforming Musicology, we enrich instance-level data connections (see
Section 2) with the semantics of workflows. The methodologies of each
constituent project were recorded and systematically assessed for opportunities for
support and reuse. Workflows were divided into four consecutive, tripartite steps:
data preparation, data capture, summarizing, and visualization. Each has input
data, a process, and resulting output. Metadata semantics capture the
relationships, provenance, and other aspects of each part of the workflow, including
dependencies and causation (e.g. prov:wasDerivedFrom from Prov-O [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]).
        </p>
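        <p>The tripartite structure described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the project's implementation: the example.org namespace, the identifiers, and the Step class are hypothetical, and prov:used and prov:wasGeneratedBy are PROV-O terms added here alongside the prov:wasDerivedFrom property named in the text. Each stage consumes the output of the previous one, and the resulting relationships are serialised as N-Triples.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass

PROV = "http://www.w3.org/ns/prov#"
EX = "http://example.org/workflow/"  # hypothetical namespace, for illustration only

# The four consecutive stages named in the text.
STAGES = ["data-preparation", "data-capture", "summarizing", "visualization"]


@dataclass(frozen=True)
class Step:
    """One tripartite workflow step: input data, a process, and an output."""
    name: str
    input_id: str
    process_id: str
    output_id: str


def build_steps(source_id):
    """Chain the stages so that each one consumes the previous output."""
    steps = []
    current_input = source_id
    for stage in STAGES:
        step = Step(stage, current_input, stage + "/process", stage + "/output")
        steps.append(step)
        current_input = step.output_id
    return steps


def to_ntriples(steps):
    """Serialise the provenance of every step as N-Triples lines."""
    lines = []
    for s in steps:
        out, proc, inp = EX + s.output_id, EX + s.process_id, EX + s.input_id
        lines.append(f"<{out}> <{PROV}wasGeneratedBy> <{proc}> .")
        lines.append(f"<{proc}> <{PROV}used> <{inp}> .")
        lines.append(f"<{out}> <{PROV}wasDerivedFrom> <{inp}> .")
    return lines


steps = build_steps("raw-dataset")
triples = to_ntriples(steps)
for line in triples:
    print(line)
```
]]></preformat>
        <p>Because every output is linked back to its input, following prov:wasDerivedFrom from the final visualization leads step by step to the original dataset.</p>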
        <p>
          There are eight areas of study (AS). The core (AS1–3) are under
development by the Universities of Oxford and London, Goldsmiths College; these
are supplemented by investigations at other institutions (AS4–7):
AS1: 16th century lute and vocal music that combines tablature with audio [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ];
AS2a: Analysis of leitmotivs within the compositions of Richard Wagner [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ];
AS2b: The psychological effects these leitmotivs can have on the listener [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ];
AS3: Social media of Musicology, concentrating on Genius<sup>2</sup> and Echonest;<sup>3</sup> and
AS4: Medieval Music, Big Data and the Research Blend (Southampton) [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ];
AS5: Characterising stylistic interpretations through automated detection of
ornamentation in Irish traditional music recordings (Birmingham; Birmingham
City; and the Dundalk Institute of Technology) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]; the other multi-institutional
AS6: In Concert: Towards a Collaborative Digital Archive of Musical Ephemera
(Cardiff; Birmingham; British Library; Goldsmiths College; and Illinois) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]; and
AS7: Large-scale corpus analysis of historical electronic music using MIR tools:
Informing an ontology of electronic music and cross-validating content-based
methods (Durham).
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Semantic Overlap</title>
      <p>(AS3), (AS5), and (AS7) overlap in the temporal scope of the datasets; (AS4) is
an isolate. (AS6) can bridge (AS1) with (AS2) (see Figure 1). They share data
types such as .csv and .jpeg; (AS1), (AS2a), (AS3), (AS4), and (AS6) all analyse
text and content, whilst (AS1), (AS5), and (AS7) contain an audio component.
(AS2), (AS3), and (AS6) contain known instances of shared entity-level data.
All but (AS3) and (AS4) largely focus on resource metadata at the data capture
stage of the workflow.</p>
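      <p>The overlaps listed above can be treated as shared features from which candidate links between areas of study are derived. The following sketch encodes three of the stated overlaps as Python sets and lists every pair of (AS) with at least one feature in common. The feature labels and function names are hypothetical; only the set memberships are taken from the text.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Memberships as stated in the text; the feature labels are illustrative.
FEATURES = {
    "text-and-content-analysis": {"AS1", "AS2a", "AS3", "AS4", "AS6"},
    "audio-component": {"AS1", "AS5", "AS7"},
    "overlapping-temporal-scope": {"AS3", "AS5", "AS7"},
}


def shared_features(a, b):
    """Return the sorted names of the features that both areas of study have."""
    return sorted(
        name for name, members in FEATURES.items() if a in members and b in members
    )


def candidate_links():
    """Map every pair of areas with a common feature to those features."""
    areas = sorted(set().union(*FEATURES.values()))
    links = {}
    for a, b in combinations(areas, 2):
        common = shared_features(a, b)
        if common:
            links[(a, b)] = common
    return links


links = candidate_links()
for (a, b), common in links.items():
    print(a, b, ", ".join(common))
```
]]></preformat>
      <p>Pairs with no entry, such as (AS4) and (AS5) in this encoding, are the ones for which other bridging data would be needed.</p>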
      <p>Methodological parallels are limited to similar tools, e.g. (AS5) uses Sonic
Visualiser,<sup>4</sup> and (AS1) uses Sonic Annotator.<sup>5</sup> The extent to which automated
processes are relied on varies from one (AS) to another; they are most actively
used in (AS5). (AS6) has exports in JSON; (AS1) in XML. (AS4) data is stored</p>
      <sec id="sec-2-1">
        <title>2 http://genius.com/</title>
      </sec>
      <sec id="sec-2-2">
        <title>3 http://the.echonest.com/</title>
      </sec>
      <sec id="sec-2-3">
        <title>4 http://www.sonicvisualiser.org/</title>
      </sec>
      <sec id="sec-2-4">
        <title>5 http://www.vamp-plugins.org/sonic-annotator/</title>
        <p>in an instantiation of ePrints,<sup>6</sup> and metadata can be exported in a number
of different formats, including JSON, XML, and RDF (mapped to a custom
ontology). The projects make use of a range of existing repositories (e.g. ePrints),
flat files, spreadsheets, and relational databases (MySQL). Whilst the shared
aim is to publish Linked Open Data (LOD), the necessary mapping and data
conversion methods differ.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Illustrative Musicological Research Questions</title>
      <p>
        Following Bechhofer et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we produced five hypothetical scenarios for
illustrative purposes to describe possible research questions (RQ). These arise from and
encompass elements of more than one (AS):
RQ1: Alice discovers Bob used the NNLS Chroma plug-in<sup>7</sup> for Sonic
Annotator to extract features from 16th century lute music. She needs access to Bob's
dataset to verify his results, and to the tool to repeat the workflow on her data.
RQ2: Casey studies the publication paradigms and prosopography of printers in
the 16th century: are there patterns, hubs of activity, and genre-specializations?
RQ3: David finds lyrics sung by Siegfried (a character in Richard Wagner's Der
Ring des Nibelungen) on Genius. He needs complementary information (text
companions, audio, notations, images) to establish an interpretative framework.
RQ4: Edward is interested in communities of practice around digital
Musicology. He wants to identify pioneering institutions and preeminent scholars, to find
answers to frequently asked questions, and to receive guidance on best practice.
RQ5: Frankie has annotation data captured during a live operatic performance.
He is looking to represent the semantics of the annotations as RDF, and merge
them with existing data already in a triplestore.
      </p>
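      <p>The scenario in RQ5 can be illustrated with a minimal sketch in which triples are plain Python tuples and the triplestore is a set. Everything here is an assumption made for illustration: the example.org namespace, the identifiers, and the annotation text are hypothetical, and the W3C Web Annotation vocabulary (oa:Annotation, oa:hasTarget, oa:bodyValue) is one possible choice that the scenario itself does not prescribe. A real triplestore would also distinguish IRIs from literals, which this sketch does not.</p>
      <preformat><![CDATA[
```python
OA = "http://www.w3.org/ns/oa#"  # W3C Web Annotation vocabulary
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/performance/"  # hypothetical namespace


def annotation_to_triples(ann_id, target, comment):
    """Express one captured annotation as a set of (s, p, o) triples."""
    subject = EX + "annotation/" + ann_id
    return {
        (subject, RDF_TYPE, OA + "Annotation"),
        (subject, OA + "hasTarget", EX + target),
        (subject, OA + "bodyValue", comment),
    }


def merge(store, new_triples):
    """Return a new store; identical triples collapse because sets deduplicate."""
    merged = set(store)
    merged.update(new_triples)
    return merged


# Hypothetical data already held in the store.
existing = {
    (EX + "act1/scene2", RDF_TYPE, EX + "Scene"),
    (EX + "annotation/a1", RDF_TYPE, OA + "Annotation"),
}

# Hypothetical annotation captured during the performance.
captured = annotation_to_triples("a1", "act1/scene2", "Siegfried motif enters")
store = merge(existing, captured)
print(len(store))
```
]]></preformat>
      <p>Merging is a set union: the type statement that both sources assert is stored once, and the two new statements about the annotation attach to the existing scene resource.</p>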
      <p>The functional requirements (FR) of the (RQ) were systematically
assessed through an iterative process in response to a Request for Proposal: the
details of each scenario were identified, and possible solutions proposed.
Off-the-shelf tools and resources are recommended where available (see Section 4). The
aim was to find commonalities between the needs of the (RQ): addressing these
enables the integration between disparate datasets, but also between the raw
data and the user, who is free to analyse and interpret data in the context of</p>
      <sec id="sec-3-1">
        <title>6 http://eprints.soton.ac.uk/</title>
      </sec>
      <sec id="sec-3-2">
        <title>7 http://isophonics.net/nnls-chroma</title>
        <p>their own research agenda. Scholars are in a position to benefit from the output
of other (AS) for their analyses.</p>
        <p>In the absence of a centralised structure for the sharing and amalgamation
of information, Semantic Web technologies support access to, and the exchange
of, data across all areas of study. The idea of a system incorporating a number
of different types of servers (image, document, audio, etc.) bridged by a data
sharing platform began to form. The vision of a coherent collection of metadata
for all resources, data, tools, and code, emerged.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Future Work</title>
      <p>
        As illustrated by Table 1, many of the FRs outlined above can be addressed with
existing, off-the-shelf tooling (T). Repositories are an example of this: ePrints
(T1), where fully and semi-automated processes allow for metadata extraction
as RDF; Zotero<sup>8</sup> (T2), a solution for the archiving and long-term storage of code
and tooling with the added benefit of an established workflow for importing
from GitHub, which is used as a development environment with version control;
triplestores as metadata repositories (T4); and ResearchSpace<sup>9</sup> (T6), which
provides a graphical user interface to a triplestore, allowing Musicologists to query
the underlying RDF metadata without using SPARQL [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Although configured
to use Blazegraph<sup>10</sup> and the CIDOC CRM [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], ResearchSpace is both triplestore
8 https://www.zotero.org/
      </p>
      <sec id="sec-4-1">
        <title>9 http://www.researchspace.org</title>
        <p>
          10 https://www.blazegraph.com/
and ontology agnostic, and can be used with Virtuoso<sup>11</sup> and a purpose-built
ontology (T8) that incorporates classes and properties from a number of known
OWL ontologies, such as (but not limited to) the Music Ontology [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], Event [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ],
Timeline [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], Prov-O, and Research Objects [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], and is designed to be sufficiently
flexible to allow for the future integration of the structure designed as
part of (AS2a). For the audio repository (T3), Transforming Musicology is in
a position to benefit from earlier Musicological projects [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]; for images (T5),
IIIF<sup>12</sup> compliance is highly desirable, making Loris<sup>13</sup> (an open source,
Python-based image server) the repository of choice. Known social network analysis
tools (T12) can support (AS3) and any Musicological prosopography occurring in
other (RQ). Where applicable, instance-level alignments to external authorities
such as VIAF<sup>14</sup> and MusicBrainz<sup>15</sup> can be implemented. Visualization techniques
used in (AS6) can be reapplied (T11) to support other (AS).
        </p>
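        <p>An instance-level alignment of the kind mentioned above might be expressed as follows. This is a sketch under assumptions rather than a description of the project's method: owl:sameAs is one common choice of linking property, the local IRI is hypothetical, and the identifiers are deliberate placeholders, not real VIAF or MusicBrainz identifiers.</p>
        <preformat><![CDATA[
```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Base IRIs of the external authorities named in the text.
AUTHORITY_BASES = {
    "viaf": "http://viaf.org/viaf/",
    "musicbrainz-artist": "https://musicbrainz.org/artist/",
}


def align(local_iri, identifiers):
    """Build owl:sameAs triples from a local entity to known authorities."""
    triples = []
    for authority, identifier in sorted(identifiers.items()):
        base = AUTHORITY_BASES.get(authority)
        if base is None:
            continue  # skip authorities this sketch does not know about
        triples.append((local_iri, OWL_SAME_AS, base + identifier))
    return triples


# Hypothetical local entity; the identifiers are placeholders, not real IDs.
alignments = align(
    "http://example.org/person/wagner",
    {
        "viaf": "VIAF-ID-PLACEHOLDER",
        "musicbrainz-artist": "MBID-PLACEHOLDER",
        "unknown-authority": "x",
    },
)
for triple in alignments:
    print(" ".join(triple))
```
]]></preformat>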
        <p>
          Some aspects of the linked research network require new development. These
include identifying necessary APIs (T7) and establishing their interaction with
any future graphical user-interface implementation; an over-arching ontology, as
described above (T8), to connect smaller, more domain-specific models (T10);
and for (RQ2), a natural language processing tool (T9), which builds on an
earlier prototype by Khan et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>This assessment of (FR) illustrates the large number of readily available
existing tools, and pinpoints those circumstances where new builds are necessary.
Such assessments are valuable in the planning and implementation of research
projects, helping to maximise potential linkage (e.g. through shared schema) and
to minimise development overlap. The resulting linked research network will
aggregate the wealth of expertise and skill within Transforming
Musicology. Captured metadata for all internal relationships and for each of
the workflow stages results in a graph much richer than that produced through
instance-level alignments alone.
11 http://virtuoso.openlinksw.com/
12 http://iiif.io/
13 https://github.com/loris-imageserver/loris
14 https://viaf.org/
15 https://musicbrainz.org/</p>
        <p>Although developed in the context of musicological investigation, the
flexibility of the system (bar the niche ontologies themselves) has strong applicability
across the Digital Humanities, breaking down barriers of information discovery
between disciplines, supporting both innovative and traditional scholarship, and
encouraging the re-use of tooling, data, and research methodologies.
Acknowledgments. This work was part of the UK AHRC Transforming
Musicology project (AH/L006820/1). The authors acknowledge their colleagues on
this project, especially Carolin Rindfleisch and Richard Lewis.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bechhofer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , et al.:
          <article-title>Why Linked Data is not enough</article-title>
          .
          <source>Future Generation Computer Systems</source>
          ,
          <volume>29</volume>
          ,
          <issue>2</issue>
          , pp.
          <fpage>599</fpage>
          –
          <lpage>611</lpage>
          .
          <string-name>
            <surname>Elsevier</surname>
          </string-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bechhofer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , et al.:
          <article-title>Computational analysis of the Live Music Archive</article-title>
          .
          <source>ISMIR</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bekiari</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , et al.:
          <article-title>FRBR object-orientated definition and mapping from FRBRER, FRAD and FRSAD (version 2)</article-title>
          .
          <source>International Working Group on FRBR and CIDOC CRM Harmonisation</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Belhajjame</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , et al.:
          <article-title>Using a suite of ontologies for preserving workflow-centric research objects</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Cantum pulcriorem invenire: Conductus Database: http://catalogue.conductus.ac.uk/#m-columnbrowser@||m-informationcontrol@url=html/home.php (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Crawford</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Early Music Online and The Electronic Corpus of Lute Music</article-title>
          .
          <source>MEI</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dix</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>Authority and judgement in the digital archive</article-title>
          .
          <source>DLfM</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fazekas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , et al.:
          <article-title>An overview of Semantic Web activities in the OMRAS2 project</article-title>
          .
          <source>Journal of New Music Research</source>
          ,
          <volume>39</volume>
          (
          <issue>4</issue>
          ):
          <fpage>295</fpage>–<lpage>311</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Gonzalez-Beltran</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>From peer-reviewed to peer-reproduced in scholarly publishing: The complementary roles of data models and work ows in bioinformatics</article-title>
          .
          <source>PLoS ONE</source>
          <volume>10</volume>
          (
          <issue>7</issue>
          ): e0127612 (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Jancovic</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et al.:
          <article-title>Automatic transcription of ornamented Irish traditional flute music using hidden Markov models</article-title>
          .
          <source>ISMIR</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , et al.:
          <article-title>BABY ElEPHãT - Building an analytical bibliography for a prosopography in early English imprint data</article-title>
          .
          <source>iConference</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Missier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et al.:
          <article-title>Janus: from workflows to semantic provenance and Linked Open Data</article-title>
          .
          <source>IPAW</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Mons</surname>
          </string-name>
          , R.:
          <article-title>Which gene did you mean?</article-title>
          .
          <source>BMC Bioinformatics</source>
          , vol
          <volume>6</volume>
          . p.
          <fpage>142</fpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. Mullensiefen,
          <string-name>
            <surname>D.</surname>
          </string-name>
          , et al :
          <article-title>Recognition of leitmotives in Richard Wagner's music: chroma distance and listener expertise</article-title>
          .
          <source>ECDA</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Oldman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          : Contextual search design video: https://www.youtube.com/watch?v=VUGMlDc9B5w
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Raimond</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , et al.:
          <article-title>The timeline ontology</article-title>
          . OWL-DL ontology (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Raimond</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , et al.:
          <article-title>The event ontology</article-title>
          . Technical report (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Rindfleisch</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>The Eternal Question to Fate, Surging up from the Depth: Richard Wagner's Descriptions of his Leitmotives in Changing Contexts of Communication</article-title>
          .
          <source>RMA</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. World Wide Web Consortium:
          <article-title>PROV-O: The PROV Ontology</article-title>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>