<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Building CORTUPP: a digital collection of technical reports with semantic features</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ma. Auxilio Medina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Argelia B. Urbina Nájera</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Benitez R.</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>J. de la Calleja</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>López D.</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rebeca Rodríguez H.</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidad Politécnica de Puebla</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Building a digital collection from scratch involves technical decisions such as choosing the format and design of documents, selecting search and browsing mechanisms to access data and metadata, and adopting an architecture that supports the collaborative work of authors. This paper describes CORTUPP, a digital collection of technical reports. CORTUPP uses REC, an external service, to support collaborative labeling and ranking of documents.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Semantic digital libraries</title>
      <p>
        Semantic digital libraries are systems built upon research on digital libraries,
the semantic web, social networking and human-computer interaction: they integrate
knowledge organization systems, delivered by classic digital libraries, with
semantic web and social networking (Web 2.0) technologies [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        We believe that semantic web technologies can support the development
of the valuable collections and services required in educational institutions. Our
particular interest is the semantic digital library, which according to [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is formed by
materials, tools and meanings. Some of the goals of a semantic digital library
are the following:
        <list list-type="bullet">
          <list-item><p>Anyone can use it</p></list-item>
          <list-item><p>Knowledge is accessible from the semantic digital library</p></list-item>
          <list-item><p>Resources are available anytime, anywhere</p></list-item>
          <list-item><p>Friendly and multi-modal interfaces</p></list-item>
          <list-item><p>Multiple connected devices</p></list-item>
        </list>
      </p>
      <p>Although freely distributed software for constructing semantic digital
libraries exists, such as Greenstone<sup>1</sup> or JeromeDL<sup>2</sup>, we decided to
implement an independent component in order to take into account the
workflows implemented at the UPPuebla. We believe that CORTUPP can serve
as a basis for constructing a semantic digital library for the UPPuebla.</p>
    </sec>
    <sec id="sec-3">
      <title>Architecture of CORTUPP</title>
      <p>
        The CORTUPP collection consists of a database, a web interface, assessment
instruments, a common document structure and search mechanisms. It is
available at http://server3.uppuebla.edu.mx/cortupp/. Figure 1 shows the
architecture of our collection, which adapts an architecture proposed
by [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>The content of CORTUPP consists of technical reports, records of
assessment committees, assessment instruments and a calendar of activities. User
data and user accounts are also part of the content. The main users
are teachers and students at the UPPuebla, who use the access,
storage and search services. The following sections describe the components of the architecture.</p>
      <sec id="sec-3-1">
        <title>Structure of technical reports</title>
        <p>We propose a common document structure that by itself introduces common
semantics. This structure consists of the following mandatory
chapters: 1) research proposal, 2) theoretical framework, 3) research design,
4) implementation, 5) results and 6) conclusions. Supporting material for the research project, such
as interviews, questionnaires or large tables, can be added in appendixes.</p>
        <p>Authors are free to propose the structure of each chapter, except the first
one, the research proposal, which consists of the following
mandatory sections: introduction, general objective, specific objectives, justification,
schedule of activities, hardware and software requirements, and scope and
limits of the research project. The document structure has been defined as a
LaTeX template. The BibTeX file format is used to create the bibliography<sup>3</sup>. A
technical report is itself described as a techreport entry.</p>
        <p>We identify two kinds of users: internal users, who belong to the UPPuebla community
(students, teachers, managers and staff of the diffusion department), and
external users, who are members of other academic communities or visitors to the
collection.</p>
        <p><sup>1</sup> http://www.greenstone.org/</p>
        <p><sup>2</sup> http://www.jeromedl.org/</p>
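        <p>As an illustration of how a technical report can be described as a techreport entry, the following Python sketch formats such an entry in BibTeX syntax. Only the techreport entry type comes from the text above; the helper name format_techreport, the citation key and all field values are hypothetical, and the sketch is not the tooling used in CORTUPP.</p>
        <preformat>
```python
def format_techreport(key, fields):
    """Return a BibTeX @techreport entry built from a dict of fields."""
    # author, title, institution and year are the required fields of a
    # BibTeX techreport entry, so fail early if any of them is missing.
    required = ("author", "title", "institution", "year")
    missing = [name for name in required if name not in fields]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    lines = ["@techreport{" + key + ","]
    for name, value in fields.items():
        lines.append("  " + name + " = {" + str(value) + "},")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical report, used only for illustration.
example_entry = format_techreport(
    "doe2010sample",
    {
        "author": "Jane Doe",
        "title": "A sample technical report",
        "institution": "UPPuebla",
        "year": 2010,
    },
)
print(example_entry)
```
        </preformat>
        <p>Checking the required fields before formatting keeps incomplete reports from producing an entry that BibTeX would later warn about.</p>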
      </sec>
      <sec id="sec-3-2">
        <title>Keyword-based services</title>
        <p>CORTUPP has a web-based interface for accessing the documents stored in the
database. It uses hyperlinks to explore documents, download the
assessment instruments or reach relevant web pages. Technical reports are
stored as PDF files on the web server.</p>
        <p>In CORTUPP, users can carry out two types of searches: 1) keyword-based
searches and 2) authority searches (search by author or by a participant in the
assessment committee). An assessment committee consists of three teachers
who play the roles of advisor, secretary and member (vocal). This committee validates the
content of the document. Figure 2 shows the interface of CORTUPP.</p>
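        <p>The two search types described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the CORTUPP implementation: the Report class, its fields, the two function names and the sample data are all hypothetical, and matching is done by plain substring comparison.</p>
        <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """Minimal stand-in for a stored technical report (hypothetical fields)."""
    title: str
    authors: list
    committee: list  # advisor, secretary and third member
    keywords: list = field(default_factory=list)


def keyword_search(reports, query):
    """Return reports whose title or keywords contain every query word."""
    words = query.lower().split()
    hits = []
    for report in reports:
        # Plain substring matching; a real service would use an index.
        text = " ".join([report.title] + report.keywords).lower()
        if all(word in text for word in words):
            hits.append(report)
    return hits


def authority_search(reports, name):
    """Return reports in which the name is an author or committee member."""
    wanted = name.lower()
    return [
        report for report in reports
        if any(wanted in person.lower()
               for person in report.authors + report.committee)
    ]


# Invented sample data, for illustration only.
reports = [
    Report("Ontology construction", ["Ana Ruiz"],
           ["Luis Mora", "Eva Paz", "Raul Gil"], ["ontology", "clustering"]),
    Report("Web interface design", ["Luis Mora"],
           ["Eva Paz", "Raul Gil", "Ana Ruiz"], ["usability"]),
]
by_keyword = keyword_search(reports, "ontology")
by_authority = authority_search(reports, "Luis Mora")
print([r.title for r in by_keyword])
print([r.title for r in by_authority])
```
        </preformat>
        <p>In this sketch the authority search treats authors and committee members alike, so a teacher is found both for the reports they wrote and for those they assessed.</p>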
        <sec id="sec-3-2-1">
          <p><sup>3</sup> http://www.kfunigraz.ac.at/ binder/texhelp/bibtx-7.html</p>
          <p>CORTUPP uses existing legal metadata in semantically enabled libraries.
Technical reports are described with the Dublin Core (DC) elements of Table 1. These
elements are associated with the elements of the LaTeX template.</p>
          <table-wrap>
            <label>Table 1</label>
            <table>
              <thead>
                <tr><th>DC element</th><th>Description</th></tr>
              </thead>
              <tbody>
                <tr><td>Creator</td><td>Name of the first author</td></tr>
                <tr><td>Date</td><td>Delivery date</td></tr>
                <tr><td>Description</td><td>Abstract of the technical report</td></tr>
                <tr><td>Identifier</td><td>Number used to identify the technical report in the collection</td></tr>
                <tr><td>Language</td><td>Language of the content (Spanish)</td></tr>
                <tr><td>Publisher</td><td>Name of the university as the entity responsible for publishing the technical report</td></tr>
                <tr><td>Subject</td><td>Keywords of the technical report according to a research area</td></tr>
                <tr><td>Title</td><td>Title given to the technical report</td></tr>
              </tbody>
            </table>
          </table-wrap>
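          <p>To make the description concrete, the following Python sketch serializes the eight Dublin Core elements of Table 1 as XML using the standard library. The namespace URI is the standard Dublin Core one; the wrapper element name record, the function name dublin_core_record and all sample values are assumptions made for illustration and do not reflect how CORTUPP stores its metadata.</p>
          <preformat>
```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # standard Dublin Core namespace


def dublin_core_record(values):
    """Build an XML element holding one child per Dublin Core element."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in values.items():
        child = ET.SubElement(record, "{" + DC_NS + "}" + name)
        child.text = str(value)
    return record


# Invented values, one per element listed in Table 1.
sample = {
    "creator": "Jane Doe",
    "date": "2010-06-30",
    "description": "Abstract of the report.",
    "identifier": "42",
    "language": "es",
    "publisher": "UPPuebla",
    "subject": "digital libraries",
    "title": "A sample technical report",
}
record = dublin_core_record(sample)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```
          </preformat>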
          <p>CORTUPP is represented in a structure called an ontology of records, which
maintains an organization by content. It is a hierarchical structure that provides
a unique and unambiguous interpretation of the document elements, and it has
concept-term relationships that are useful for free-text search. The main
characteristics of an ontology of records are the following:
            <list list-type="order">
              <list-item><p>Technical reports are clustered by similarity</p></list-item>
              <list-item><p>Clusters at level k have labels of k terms</p></list-item>
              <list-item><p>All documents in a cluster share the terms of its label</p></list-item>
            </list>
          </p>
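          <p>The second and third characteristics can be illustrated with a small Python sketch. It is a deliberate simplification and not the construction method of the ontology of records: instead of clustering by similarity, it simply groups documents that share every term of a candidate label. The function name label_clusters, its parameters and the sample term sets are hypothetical.</p>
          <preformat>
```python
from itertools import combinations


def label_clusters(doc_terms, max_level=2, min_size=2):
    """Group documents by shared term sets, one dict of clusters per level.

    At level k every label is a tuple of k terms, and a document belongs
    to a cluster only if it contains all the terms of the label.
    """
    vocabulary = sorted({t for terms in doc_terms.values() for t in terms})
    levels = {}
    for k in range(1, max_level + 1):
        clusters = {}
        for label in combinations(vocabulary, k):
            members = sorted(
                doc for doc, terms in doc_terms.items()
                if set(label).issubset(terms)
            )
            # Keep only labels shared by enough documents.
            if len(members) >= min_size:
                clusters[label] = members
        levels[k] = clusters
    return levels


# Invented term sets for three reports.
doc_terms = {
    "r1": {"ontology", "search", "metadata"},
    "r2": {"ontology", "search"},
    "r3": {"metadata", "tagging"},
}
levels = label_clusters(doc_terms)
print(levels[1])
print(levels[2])
```
          </preformat>
          <p>Enumerating every term combination is only practical for small vocabularies; the sketch is meant to show the label and membership properties, not an efficient construction.</p>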
          <p>
            The features of the ontology of records can be found in [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ]. Semantic
information is thus represented by the metadata attached to each document and by the
ontology of records. The design of CORTUPP corresponds to the levels of knowledge
proposed by [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]:
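            <list list-type="order">
              <list-item><p>Organization of the information in databases</p></list-item>
              <list-item><p>Organization of the information in the documents</p></list-item>
              <list-item><p>Organization of the metadata</p></list-item>
              <list-item><p>Organization of the topics treated in the documents</p></list-item>
              <list-item><p>Organization of the concepts, terms and relations</p></list-item>
            </list>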
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>REC: an external service with semantic features</title>
      <p>
        Adding semantic features to digital collections is a topic of interest in research
areas such as collaborative labeling, Web 2.0 and semantic digital libraries. For
example, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] describes the potential of tagging systems to support knowledge
organization, and [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] investigates social bookmarking in digital libraries and derives
the design requirements for incorporating it.
      </p>
      <p>
        A tag is a keyword that acts like a subject or category for the associated
content [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Tags are user-added metadata; tagging establishes a
relationship between an online information resource and a user.
      </p>
      <p>In social contexts such as Flickr<sup>4</sup>, Facebook<sup>5</sup>, del.icio.us<sup>6</sup> and Soboleo<sup>7</sup>,
traditional information retrieval measures matter less than the opinions
and experience of previous users. For this reason, we decided to integrate
REC, open software that allows users of CORTUPP to add tags so that
their subjective information is taken into account.</p>
      <p>
        REC [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] uses the "induced tagging" technique, designed to improve
the quality of automatic markers. It offers collaborative labeling, in which the
resulting tags produce recommendations, and a document ranking service
in which labels support the recommendation of helpful content.
      </p>
      <p>REC allows domain experts and members of the community to assign meaningful
labels to the technical reports of CORTUPP. Using REC, users construct a
different organization of the documents. The integration of REC into CORTUPP
makes it a community information space through functionality for selection,
annotation, authoring/contribution and collaboration.</p>
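      <p>The general idea of collaborative labeling with tag-based ranking can be sketched in Python as follows. This is a generic toy model and not REC or its induced tagging technique: the TagStore class, its methods, the ranking rule (number of distinct users who applied a tag) and the sample data are all assumptions made for illustration.</p>
      <preformat>
```python
from collections import defaultdict


class TagStore:
    """Toy collaborative tag store (not the REC implementation)."""

    def __init__(self):
        # tag -> document id -> set of users who applied the tag
        self._taggers = defaultdict(lambda: defaultdict(set))

    def add_tag(self, user, doc_id, tag):
        """Record that a user labeled a document; repeats are ignored."""
        self._taggers[tag.lower()][doc_id].add(user)

    def rank(self, tag):
        """Return (doc_id, votes) pairs for a tag, most tagged first."""
        docs = self._taggers.get(tag.lower(), {})
        counts = [(doc_id, len(users)) for doc_id, users in docs.items()]
        # Sort by descending votes, then by document id for stable output.
        return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


# Invented tagging activity.
store = TagStore()
store.add_tag("ana", "r1", "Ontology")
store.add_tag("luis", "r1", "ontology")
store.add_tag("ana", "r2", "ontology")
store.add_tag("ana", "r1", "ontology")  # repeated tag by the same user
ranking = store.rank("ontology")
print(ranking)
```
      </preformat>
      <p>Counting distinct users rather than raw tag events keeps a single enthusiastic user from dominating the ranking, which is one simple way to reflect community opinion.</p>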
      <p><sup>4</sup> http://www.flickr.com/</p>
      <p><sup>5</sup> http://www.facebook.com/</p>
      <p><sup>6</sup> http://delicious.com/</p>
      <p><sup>7</sup> http://www.soboleo.com/</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>CORTUPP allows users to reuse the content of documents. The collection
integrates the research activities of students carried out under the supervision of a team of teachers.
This has the following advantages: the distribution of assessment instruments and the
extension of document descriptions with labels. However, several challenges remain
in the construction of a semantic digital library, such as providing better
usability and inference mechanisms. To date, CORTUPP can be regarded as a
result of collaborative content production at the UPPuebla.</p>
      <p>Currently, search services at CORTUPP are keyword-based. As future work,
we plan to extend these services with semantic search engines, which
can improve the quality of keyword-based search
by taking into account the meaning of words. We conclude by noting the further
possibilities for organization and recommendation that arise from the use of REC.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We thank the staff of the Programa Académico de Ingeniería en Informática at
the UPPuebla for their help and cooperation in the construction process. This
work is partially supported by PROMEP grant Biblioteca Digital Semántica de
Recursos Educativos /103-5/09/4023.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. I., H.:
          <article-title>Role of information technologies in teaching learning process: perception of the faculty</article-title>
          .
          <source>Turkish Online Journal of Distance Education - TOJDE</source>
          <volume>9</volume>
          (
          <issue>2</issue>
          ) (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Lindley</surname>
            ,
            <given-names>W.I.</given-names>
          </string-name>
          :
          <article-title>Constraints and potentials of training mid-career extension professionals in Africa, part 2</article-title>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kruk</surname>
            ,
            <given-names>S.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McDaniel</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          : Semantic Digital Libraries. Springer-Verlag, Berlin, Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Medina</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sánchez</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          :
          <article-title>Ontoair: A method to construct lightweight ontologies from document collections</article-title>
          .
          <source>In: ENC '08: Proceedings of the 2008 Mexican International Conference on Computer Science</source>
          , Washington, DC, USA, IEEE Computer Society (
          <year>2008</year>
          )
          <fpage>115</fpage>
          –
          <lpage>125</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>S.C.Y.</given-names>
          </string-name>
          :
          <article-title>Collaborative tagging applications and approaches</article-title>
          .
          <source>IEEE Multimedia</source>
          <volume>15</volume>
          (
          <year>2008</year>
          )
          <fpage>14</fpage>
          –
          <lpage>21</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Puspitasari</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lim</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goh</surname>
            ,
            <given-names>D.H.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Theng</surname>
            ,
            <given-names>Y.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chatterjea</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Social navigation in digital libraries by bookmarking</article-title>
          .
          <source>In: ICADL'07: Proceedings of the 10th international conference on Asian digital libraries</source>
          , Berlin, Heidelberg, Springer-Verlag (
          <year>2007</year>
          )
          <fpage>297</fpage>
          –
          <lpage>306</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Sánchez</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arzamendi-Pétriz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valdiviezo</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Induced tagging: promoting resource discovery and recommendation in digital libraries</article-title>
          .
          <source>In: JCDL '07: Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries</source>
          , New York, NY, USA, ACM (
          <year>2007</year>
          )
          <fpage>396</fpage>
          –
          <lpage>397</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>