<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Meta-data and Methodology: Standards in the digital archive</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Priyanka Suresh</string-name>
          <email>priyanka.suresh@research.iiit.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Navjyoti Singh</string-name>
          <email>navjyoti@iiit.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Exact Humanities, International Institute of Information Technology</institution>
          ,
          <addr-line>Hyderabad</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <fpage>77</fpage>
      <lpage>82</lpage>
      <abstract>
        <p>To support advances in digital archiving and curation, it is necessary to develop tools and capabilities that accommodate cultural objects and their interpretations while complying with meta-data standards. Such tools help establish contexts within and among collections, and they support pedagogical discourse and the extension of themes. In this paper, we present Curarium, a novel platform for curation and collaboration with large cultural data-sets. It supports the exploration of collections data through tools for in-depth, object-specific research, annotation, interpretive meta-data and data visualization. Moreover, it adopts methodology standards for knowledge preservation, using data and lifecycle management to retain authenticity and integrity in cultural understanding.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Cultural data-collections are inherently interpretive and change with the curator, the
archival guidelines and the available assets. Although standardizing meta-data during import
helps keep knowledge regulated, a formal space for interpretation is also required. Such a space
must let ideas endure and discussions emerge despite technical and resource obsolescence. To this
end, we carry out development and user studies of the platform Curarium [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] (https://curarium.com/), currently in its beta version. The users are a
group of alpha and beta testers who tested the platform iteratively in classroom and academic
settings.
      </p>
      <p>
        For the purpose of this paper, we refer to a cultural artifact as an artifact, to the digital
representation of this artifact (“digital surrogate” [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]) as an object, and to a digital surrogate associated with
an object as a surrogate. We propose a streamlined model for developing academic arguments with
cultural data while aiding knowledge preservation. The process takes cultural objects as its subject
matter and establishes relations between objects and their collection, and among different
collections in the archive. To retain the objective of the imported collections, Curarium formalizes
a systematic overview of the process in the following schema:
      </p>
      <list list-type="order">
        <list-item><p>Import collections and structure initial meta-data</p></list-item>
        <list-item><p>Filter and query collections to explore specific objects</p></list-item>
        <list-item><p>Curate meta-data extensions for individual objects</p></list-item>
        <list-item><p>Annotate individual objects for in-depth themes</p></list-item>
        <list-item><p>Generate visualizations to represent objects and inter-relations</p></list-item>
        <list-item><p>Create and categorize sub-collections with objects from across collections</p></list-item>
        <list-item><p>Publish documentation and insights</p></list-item>
        <list-item><p>Develop collaborative arguments over shared annotations</p></list-item>
      </list>
      <p>
        Many proprietary and open-source tools support digital archival and curation
processes on the web. CONTENTdm by OCLC [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is a powerful and easy-to-use cataloguing tool, but it is expensive.
Open-source tools such as Omeka [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] provide capabilities limited to meta-data analysis and
indexing and have no support for collaboration, while others such as CollectiveAccess [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] require
advanced technical capabilities for curation and collaboration. Pachyderm by the New Media
Consortium was highly interactive but did not provide extensive cataloguing capabilities and had to be
downloaded for individual hosting. In comparison, Curarium handles meta-data cataloguing and
collaboration for both curators and users, is free and easy to use, requires no browser
downloads and is highly interactive. For meta-data curation, Curarium
follows the Resource Description Framework [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] (RDF) model, a W3C recommendation
for defining and sharing meta-data among communities.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Data preparation</title>
      <p>
        Curarium is a web application built on the Ruby on Rails framework. Following the
model-view-controller pattern, Curarium uses PostgreSQL as the database server and JSON for data transfer.
Curarium also relies heavily on JavaScript and CoffeeScript, which makes JSON a natural choice
for a lightweight data-interchange format [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. During import, the collection provider follows an
organized protocol for meta-data by filling out the fields for general information: name,
description, provider details, a sample of the collection and a zipped file containing all the records in
JSON format. The records are then parsed into a list of name/value pairs, which makes the
information highly human-readable. The collection provider can then drag and drop the required values
from the parsed JSON data into fields of interest (for example “Unique identifier”, “Title” and
“Thumbnail”). The collection provider can also add custom fields, which lets
the meta-data be tailored closely to the specific collection.
      </p>
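      <p>
        As an illustration of the import step described above, the following minimal sketch flattens one
JSON record into name/value pairs and then maps selected values onto fields of interest. It is not
Curarium's implementation: the sample record, its key names, the dotted path notation and the helper
names flatten and apply_mapping are all hypothetical and chosen only for this example.
      </p>
      <preformat>
```python
import json

# Hypothetical record; real collection records will use their own keys.
SAMPLE_JSON = """
{
  "id": "obj-001",
  "titleInfo": {"title": "Untitled print"},
  "images": [{"thumb": "https://example.org/thumbs/obj-001.jpg"}]
}
"""

# Hypothetical mapping from fields of interest to paths in the parsed record.
FIELD_MAP = {
    "Unique identifier": "id",
    "Title": "titleInfo.title",
    "Thumbnail": "images[0].thumb",
}


def flatten(node, prefix=""):
    """Return a list of (path, value) pairs for every leaf in nested JSON data."""
    if isinstance(node, dict):
        pairs = []
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            pairs.extend(flatten(value, path))
        return pairs
    if isinstance(node, list):
        pairs = []
        for index, value in enumerate(node):
            pairs.extend(flatten(value, f"{prefix}[{index}]"))
        return pairs
    return [(prefix, node)]


def apply_mapping(record, field_map):
    """Pick the mapped values out of a record; paths missing from it yield None."""
    pairs = dict(flatten(record))
    return {field: pairs.get(path) for field, path in field_map.items()}


record = json.loads(SAMPLE_JSON)
for name, value in flatten(record):
    print(f"{name}: {value}")
print(apply_mapping(record, FIELD_MAP))
```
      </preformat>
      <p>
        In this sketch a custom field would simply be one more entry in the mapping, which mirrors the
flexibility that custom fields give the collection provider.
      </p>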
    </sec>
    <sec id="sec-3">
      <title>User study</title>
      <p>
        Iterative alpha and beta testing was carried out to observe the user experience of users with
different levels of software and curatorial expertise. A quick guide within the tool
helps users understand its usage and the associated vocabulary. The tool has been tested
following the schema in Figure 1, with use cases for (1) academicians and curators in research
settings, (2) students in classroom settings with the course instructor as moderator and (3) workshop
participants with organizers as demonstrators for the purpose of ideation. In these use cases, users
were given special access to the website through generated accounts. The platform currently hosts
around eight collections, sourced and imported successfully with a strong focus on curation through
import. Villa I Tatti’s Homeless Paintings of the Italian Renaissance [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] collection was the first to be
imported into Curarium, followed by collections made available and hosted by Harvard Art
Museum, the Japan Disasters Digital Archive [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and others. In the classroom setting, the available collections
helped students develop academic arguments through publishing and discourse on
message boards, promoting engagement within established contexts. In the workshop setting,
the collections were presented in broader contexts for pedagogical dialogue through
visualizations, documentation and useful content. Through an agile model of development and
version control on GitHub [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], many illustrations and iterations were incorporated for release.
Curarium is now available with user profiles, a provision for adding users to research-specific
circles and login credentials authenticated by Mozilla’s Persona [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The platform is linked to
Redmine [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], a project management tool for reporting issues and feedback.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Standardization design</title>
      <sec id="sec-4-1">
        <title>Technical standards for materials</title>
        <p>Qualifying an artifact (and thereby an object) as possessing cultural value requires
categorization so that it fits into archival themes. The first step is to classify cultural artifacts
on the basis of their utilitarian value (Prown [<xref ref-type="bibr" rid="ref13">13</xref>]), and the second step is to identify materials and
their digital representations (Portela [<xref ref-type="bibr" rid="ref14">14</xref>]). This allows for thematic extensions as well as a seamless
transition and interaction between objects.</p>
        <p>
          In Figure 2, the utilitarian chart helps distinguish the artifacts in a collection by
their usage. This helps establish an overall utilitarian theme not only for the collection itself but
also for individual objects in the collection. It then helps recognise the diverse materials of the artifacts
and their technical representations supported by web platforms. For example, the Lighting
Devices [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] collection currently hosted on Curarium would belong to the “Devices” section and
contains “Photographic materials” of lighting devices.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Interpretive spaces</title>
        <p>We create an extension space for the exploration of themes, in which individual objects are subject to
supplementary meta-data creation. This adds contextual detail to the existing meta-data
structures. On Curarium, clicking on an object opens a custom client-side extension over
the preferred set of primary fields that were selected during import. By making additions through
this interface, the user, with the approval of the curator, is able to build relations between objects in
a collection, make sub-collections and find overlaps in cross-collection contexts. The dimensions
of context (Beaudoin [<xref ref-type="bibr" rid="ref16">16</xref>]) for object-specific curation are identified for further investigation and
collaboration.</p>
        <p>
          Figure 4 displays a record from the “Photoscreen Prints” [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] collection
currently hosted on Curarium. Clicking the record opens a panel on the left with a tool-kit to
annotate a portion of the image and transcribe associated information. Similarly, Figure 5
displays the collection “Photoscreen Prints” [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] currently hosted on Curarium in
visualization mode [18] of the object map. The panel on the left displays the available types of
visualizations, viewing properties, and filters associated with the visualization.
        </p>
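        <p>
          To make the idea of annotating a portion of an image more concrete, the sketch below models one
annotation as a small record that can be serialized to JSON. This is only an assumed data shape for
illustration: the class names Region and Annotation, every field name and the approved flag are
hypothetical and are not taken from Curarium's code or schema.
        </p>
        <preformat>
```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Region:
    """Rectangular portion of an image, in pixels (hypothetical convention)."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class Annotation:
    """One user annotation on an object (all field names are assumptions)."""
    object_id: str
    author: str
    region: Region
    transcription: str
    approved: bool = False  # assumed flag for curator approval


def to_json(annotation):
    """Serialize an annotation, including its nested region, to JSON text."""
    return json.dumps(asdict(annotation), sort_keys=True)


note = Annotation(
    object_id="obj-001",
    author="student-17",
    region=Region(x=40, y=25, width=120, height=80),
    transcription="Signature in the lower left corner",
)
print(to_json(note))
```
        </preformat>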
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Strengths and Limitations</title>
      <sec id="sec-5-1">
        <title>Strengths of the Platform</title>
        <p>Curarium makes the process of curation and collaboration highly intuitive and interactive, aiding
pedagogy and enhancing expertise in the inter-disciplinary domain of Digital Humanities.
Moreover, it enables data visualization of large and complex cultural data-sets, which helps users
recognise patterns, build relations and grasp sophisticated concepts. Additionally, it comes with
a wide range of customizable in-house tools for archiving and is open to public feedback for
improvement of these tools. Lastly, images and thumbnails are stored as links,
so that an object of a collection is mirrored into the platform from its home site
rather than hosted locally. This makes Curarium highly scalable in terms of importing and
working with large cultural data-sets.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Limitations of the Platform</title>
        <p>To convert other data formats (CSV, XML) into JSON, Curarium uses an in-house
third-party API. Integrating this API into the platform would make it more versatile in handling
a variety of data-sets. Also, Curarium currently archives only photographic material. Adding
support for audio and video material would enrich the archive and enable more diverse
relations among objects and collections.</p>
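        <p>
          For readers unfamiliar with this kind of conversion, the sketch below shows one simple way to turn
CSV records into JSON using only the Python standard library. It does not represent the API
mentioned above: the function name csv_to_json and the sample columns are hypothetical, and the
sketch assumes that the first CSV row holds the field names.
        </p>
        <preformat>
```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = [dict(row) for row in reader]
    return json.dumps(records, indent=2)


# Hypothetical two-record collection export.
SAMPLE_CSV = (
    "id,title,thumbnail\n"
    "1,Oil lamp,https://example.org/t/1.jpg\n"
    "2,Lantern,https://example.org/t/2.jpg\n"
)
print(csv_to_json(SAMPLE_CSV))
```
        </preformat>
        <p>
          All values come out as strings in this sketch, so any typing of numbers or dates would have to
happen in a later curation step.
        </p>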
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgment</title>
      <p>Curarium is supported by metaLAB(at)Harvard, which is affiliated with the Berkman Klein Center for
Internet and Society. We thank our colleagues who provided the support, insight and expertise that
greatly assisted the research.</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>In conclusion, a space for interpretive meta-data is required to preserve the multi-dimensional
knowledge contained in a cultural object. The biggest challenge lies in defining
boundaries for cultural objects, because their features rely on each other for meaning and give rise
to an abundance of hybrid information. Through the technical affordances of Curarium,
we have developed a mechanism that captures structure through tools for the curator while allowing
users flexibility for interactive interpretation of the objects during collaboration.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] <source>Curarium</source> (<year>2005</year>), metaLAB(at)Harvard (URL: <uri>https://curarium.com/</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] <string-name><surname>Rabinowitz</surname>, <given-names>Adam</given-names></string-name> (<year>2015</year>), <article-title>A Work of Archeology in the Age of Digital Surrogacy</article-title>. In <source>Visions of Substance: 3D Imaging in Mediterranean Archaeology</source>, <publisher-name>The Digital Press at The University of North Dakota</publisher-name>, pp. <fpage>27</fpage>-<lpage>43</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] <source>CONTENTdm</source> (<year>2016</year>), OCLC Online Computer Library Center, Inc. (URL: <uri>http://www.oclc.org/contentdm.html/</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] <source>Omeka</source> (<year>2007</year>), Roy Rosenzweig Center for History and New Media, George Mason University (URL: <uri>http://omeka.org/</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] <source>CollectiveAccess</source> v1.6 (<year>2016</year>) (URL: <uri>http://collectiveaccess.org/</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] <source>Resource Description Framework</source>, World Wide Web Consortium (W3C) (URL: <uri>https://www.w3.org/RDF/</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] JSON (<year>2013</year>), <source>Standard ECMA-404, 1st Edition</source>, <publisher-name>ECMA International</publisher-name>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] <string-name><surname>Berenson</surname>, <given-names>Bernard</given-names></string-name> (<year>1969</year>), <source>Homeless Paintings of the Renaissance</source>, <publisher-loc>London</publisher-loc>, <publisher-name>Thames &amp; Hudson</publisher-name>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] <source>Japan Disasters Digital Archive</source>, Reischauer Institute of Japanese Studies, Harvard University
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] <source>Curarium on GitHub</source> (<year>2016</year>), GitHub, Inc. (URL: <uri>https://github.com/berkmancenter/curarium</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11] <source>Mozilla Persona</source> (<year>2011</year>), Mozilla Foundation (URL: <uri>https://developer.mozilla.org/Persona</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12] <source>Redmine</source> v.3.3 (<year>2016</year>), Redmine (URL: <uri>http://www.redmine.org/</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13] <string-name><surname>Prown</surname>, <given-names>Jules</given-names></string-name> (<year>1982</year>), <article-title>Mind in Matter: An Introduction to Material Culture Theory and Method</article-title>. In <source>Winterthur Portfolio</source>, Vol. <volume>17</volume>, No. <issue>1</issue>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14] <string-name><surname>Portela</surname>, <given-names>Manuel</given-names></string-name> (<year>2014</year>), <article-title>Multimodal Editing and Archival Performance: A Diagrammatic Essay on Transcoding Experimental Literature</article-title>. In <source>Digital Humanities Quarterly</source>, Vol. <volume>8</volume>, No. <issue>1</issue>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15] <source>Lighting Devices</source> (<year>2016</year>), Curarium (URL: <uri>https://curarium.com/collections/3</uri>)
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16] <string-name><surname>Beaudoin</surname>, <given-names>Joan</given-names></string-name> (<year>2012</year>), <article-title>Context and Its Role in the Digital Preservation of Cultural Objects</article-title>. In <source>The Magazine of Digital Library Research</source>, Vol. <volume>18</volume>, No. <issue>11/12</issue>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17] <source>Photoscreen Prints</source> (<year>2016</year>), Curarium (URL: <uri>https://curarium.com/collections/9</uri>)
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>