=Paper= {{Paper |id=Vol-1611/paper2 |storemode=property |title=From Users to Systems: Identifying and Overcoming Barriers to Efficiently Access Archival Data |pdfUrl=https://ceur-ws.org/Vol-1611/paper2.pdf |volume=Vol-1611 |authors=Nicola Ferro,Gianmaria Silvello |dblpUrl=https://dblp.org/rec/conf/jcdl/FerroS16 }} ==From Users to Systems: Identifying and Overcoming Barriers to Efficiently Access Archival Data== https://ceur-ws.org/Vol-1611/paper2.pdf
        From Users to Systems: Identifying and Overcoming
            Barriers to Efficiently Access Archival Data

                                Nicola Ferro                                               Gianmaria Silvello
                Department of Information Engineering                              Department of Information Engineering
                        University of Padua                                                University of Padua
                            Padua, Italy                                                       Padua, Italy
                         nicola.ferro@unipd.it                                        gianmaria.silvello@unipd.it

ABSTRACT
Digital archives are one of the pillars of our cultural heritage and they are increasingly opening up to end-users by focusing on accessibility of their resources. Moreover, digital archives are complex and distributed systems where interoperability plays a central role and efficient access and exchange of resources is a challenge.
   In this paper, we investigate user and interoperability requirements in the archival realm and we discuss how next generation archival systems should operate a paradigm shift, bringing a new model of access to archival resources which better addresses these needs.
   To this end, we employ the data structures and query primitives based on the NEsted SeTs for Object hieRarchies (NESTOR) model to efficiently access archival data, overcoming the identified barriers and limitations.

Keywords
set-based data models, archival data, XPath, XML

In Proceedings of 1st International Workshop on Accessing Cultural Heritage at Scale (ACHS’16), June 22, 2016, Newark, NJ, USA. Copyright 2016 for this paper by its authors. Copying permitted for private and academic purposes.

1.    INTRODUCTION
   Archives, along with libraries and museums, are one of the main cultural institutions encompassed by Digital Libraries (DL). Archives represent the trace of the activities of a physical or legal person in the course of their business which is preserved because of their continued value over time. They are composed of unique documents interlinked with each other as well as with their production and preservation environments. The main characteristic of archives lies in the hierarchical structure used to retain the context and the full informational power of archival data.
   The hierarchical structure shaping archives is a foundational feature of traditional paper-based archival description – the so-called finding aid. This is reflected in its digital counterpart, the Encoded Archival Description (EAD) [14] eXtensible Markup Language (XML) format, which is the key brick for managing, finding and accessing archival data.
   Over the last decade, thanks to the centrality of the Web for information access and the rapid evolution of DL services, we have witnessed a major shift towards a “radical user orientation” [12] of archives, where usability and findability of resources are becoming number one priorities [20] given the “dramatic increase” [3] in the number of people accessing them. A recent user study [11] analyzing the user interaction patterns with finding aids highlighted that “[they] focus on rules for description rather than on facilitating access to and use of the materials they list and describe” and that many archive users have serious issues using finding aids [1]. Common and frequent user interaction patterns with finding aids are navigational and thus require browsing the archival hierarchy to make sense of the archival data; for instance, two common interaction patterns are [11]: top-down, where users “start at the highest level, gain background and context, and work down to the most specific level of detail”, and bottom-up, where users “start at the most detailed level seeking specific information, and then move back to the higher levels”.
   From this new point of view, digital finding aids (i.e. EAD) constrain user orientation of archives because several key operations are neither possible nor efficient, given that it is problematic to: (i) let the user access a specific item on-the-fly, whereas we have to define fixed access points to the archival hierarchy [8]; (ii) let the user reconstruct the context of an item without requiring them to browse the whole archival hierarchy [2]; and (iii) present the user with only selected items from an archive, whereas we have to give them the archive as a whole [7, 18].
   From the technological perspective, the presented limitations also affect the interoperability of archives in distributed environments, thus preventing the exchange of resources by means of standard DL technologies such as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH, http://www.openarchives.org/pmh/) [8, 15]. Indeed, a single EAD file describes a whole archive and thus it is not possible to share or exchange in a distributed environment only a subset of records; for archives, it is common to be required to exchange only the high-level descriptions (e.g., fonds and sub-fonds) or to exchange only the records open to public disclosure. This problem affects the possibility of exchanging finding aids with variable granularity by means of OAI-PMH, forcing archival institutions to share whole archives or nothing. EAD provides archivists with many degrees of freedom in tagging practice, exacerbating the differences in how XML elements are used and nested one inside the other [10]. This makes it difficult to know in advance how an institution will use the hierarchical elements and then to define general rules and paths to access EAD elements; for instance, there is no guarantee that an XML Path Language (XPath) expression returning all the series or the units in a given EAD file will
work with a different file in another collection or even in the same one.
   In this paper, we stem from the above observations about the user and interoperability needs in the archival realm to discuss how next generation archival systems should operate a paradigm shift, bringing a new model of access to archival resources which better addresses these needs. In particular, the contribution of the paper is to turn the above requirements into specific access use cases to archival resources, discussing how and why current approaches represent a barrier to their complete fulfillment, and showing how our proposed solution, called NEsted SeTs for Object hieRarchies (NESTOR) [8, 9], represents a step forward.
   Indeed, NESTOR [8] defines an alternative way to represent hierarchical data by expressing the relationships between objects through the inclusion property between sets, in contrast to the binary relation between nodes exploited by the tree, which is the typical model used to represent archival data. NESTOR has been instantiated by three data structures on which query primitives, proven to be highly efficient in a wide spectrum of cases, have been realized [9]. NESTOR represents a paradigm shift with respect to state-of-the-art solutions to access hierarchical data because it answers query primitives – e.g., descendants and children to deal with the top-down interaction pattern and ancestors and parent to deal with the bottom-up one – by exploiting basic set operations which do not require browsing and navigating the hierarchy.
   Moreover, in order to fully understand the difference between NESTOR and state-of-the-art navigational (i.e., based on XPath) approaches, we conducted a case study evaluation based on ten real-world heterogeneous EAD files representing different key challenges for the identified access use cases, where we discuss the main drawbacks of a navigation-based access approach and how they are addressed by the NESTOR set-based one. We also show how the intrinsic differences between NESTOR and traditional navigational approaches are consistently reflected in the query execution times, which are a quantitative proxy for appreciating the paradigm shift represented by NESTOR and its impact.
   The rest of the paper is organized as follows: Section 2 provides relevant background information; Section 3 discusses the examined use cases; Section 4 presents the experimental outcomes. Finally, Section 5 draws some conclusions.

2.    BACKGROUND

2.1    Digital Archives
   Archives are composed of “unique records of corporate bodies and the papers of individuals and families” [14]. The original order – i.e. the principle of provenance – of the documents within an archive is preserved because the context and the physical order in which the documents are held are as valuable as their content [6].
   According to the International Standard for Archival Description (General) (ISAD(G)), archival description (i.e. the finding aids) proceeds from general to specific as a consequence of the provenance principle and has to show, for every unit of description, its relationships and links with other units and to the general fonds, taking the form of a tree as shown in Figure 1 on the left. The digital encoding of ISAD(G) is the Encoded Archival Description (EAD) [14], shown in Figure 1 on the right, which is an XML description of a whole archive, reflects the archival structure, holds relations between entities and retains context.

Figure 1: A sample archive and its EAD representation. (a) Archival Tree, with Fonds A, Sub-fonds B and C, Series D, E and F, Units G, H, I and L, and Archival records 1 to 13; (b) EAD representation.

   EAD follows the traditional archival paradigm where experts know exactly what they are looking for and, for example, they browse EAD to know the location of physical records [12]. By contrast, in the new user-oriented paradigm enabled by digital archives “users no longer have to be dependent on the physical presence of archivists to identify, review, and retrieve materials” [23], but they need effective means for performing information seeking activities. As a matter of fact, EAD turns out to be problematic in: (i) supporting user-oriented information access; (ii) supporting flexible access control policies; (iii) enabling interoperability between digital archives working in distributed environments.

2.2    XPath: A Navigational Approach
   XPath (http://www.w3.org/TR/xpath/) is widely adopted for searching and selecting portions of EAD files. XPath is a language for addressing parts of an XML document; it provides basic facilities for manipulation of several data types and adopts a path notation for navigating through the hierarchical structure of an XML document. “Location path” is a common kind of XPath expression, which selects a set of nodes relative to a given node and as output returns the node-set containing the nodes selected by the location path. Each step of a location path can be composed of three parts: (i) an axis, which specifies the tree relationship between the nodes; (ii) a node test, which specifies the node type and expanded-name of the selected nodes; and (iii) zero or more predicates that can further refine the selected set of nodes.
   As emerges from the previous discussion, archival systems typically rely on third-party and standard libraries for XPath processing. Since the NESTOR data structures and query primitives are implemented in Java and work in-memory, we are interested in comparing to state-of-the-art
in-memory Java-based solutions. Xalan (http://xml.apache.org/xalan-j/), Jaxen (http://jaxen.codehaus.org/) and JXPath (http://commons.apache.org/proper/commons-jxpath/) are the three most used state-of-the-art Java libraries for XPath processing.

2.3    NESTOR: A Set-Based Approach
   The NESTOR model is defined by two set-based data models: the Nested Set Model (NS-M) and the Inverse Set Data Model (INS-M) [8]; they are formally defined in the context of set theory as a collection of subsets. The most intuitive way to understand how these models work is to relate them to the archival tree. In Figure 2a we can see how the archive shown in Figure 1 is mapped into an organization of nested sets based on the NS-M.

Figure 2: The archive of Figure 1 modeled with the NS-M and the INS-M. (a) Euler-Venn representation of the NS-M; (b) DocBall representation of the INS-M.

   From Figure 2a we can see that the NS-M adopts a bottom-up approach: (i) each set corresponds to an archival division; (ii) the innermost sets are the leaves of the hierarchy, e.g. the units; (iii) you create supersets as you climb up the hierarchy, e.g. the series, sub-fonds and fonds. The archival records are represented as elements belonging to the sets. With the NS-M an archive is modeled as a collection of subsets where there is a set – i.e. “fonds” – which contains all the subsets – i.e. “subfonds”, “series”, “units” – of the archive and where two subsets at the same level – e.g. two “series” – cannot have common elements, thus their intersection is empty.
   As shown in Figure 2b, the INS-M adopts a top-down approach: (i) each set corresponds to an archival division; (ii) the innermost set is the root of the hierarchy, i.e. the fonds; (iii) you create supersets as you climb down the hierarchy, e.g. sub-fonds, series and then units. As for the NS-M, in this case too the archival records are represented as elements belonging to the sets. With the INS-M an archive is modeled as a collection of sets where there exists an archival division shared by all other divisions; in our example, the “fonds” is the archival division common to all the other divisions in the archive.
   This vision overcomes EAD limitations because in NESTOR each archival record is an element belonging to a set which can be selected and managed independently from the other records; thus, we can return to the users a list of records belonging to different archival divisions at any level, allowing them to access and consult the records while hiding the complexity of the whole archival structure.
   NESTOR can be instantiated by three data structures [9]: Direct Data Structure (DDS), Inverse Data Structure (IDS) and Hybrid Data Structure (HDS). Each one of these structures is composed of three dictionaries, one containing the materialization of the sets, one containing the direct subsets of each set and the last one containing all the supersets of each set. DDS is a structure built around the constraints defined by the NS-M, IDS is a structure built around the constraints of INS-M and HDS can be seen as a mixture between DDS and IDS [9].
   When we deal with a collection of sets defined by NESTOR, we can distinguish between set-wise and element-wise primitives. The former enable us to query the structure of an archive, whereas the latter query the content of the archive (i.e., the archival records). For instance, by means of the set-wise primitives we can ask for all the series of a specific sub-fonds, whereas with the element-wise primitives we can ask for all the archival records belonging to the series of that sub-fonds.
   NESTOR primitives (i.e., Descendants, Ancestors, Children and Parent) are efficient alternative implementations of XPath primitives, as shown in [9] where we conducted an extensive evaluation on five EAD collections, Wikipedia and two synthetic XML datasets and we compared NESTOR with state-of-the-art XPath engines. In [9] we evaluated NESTOR on average performance by testing the primitives on thousands of files and then presenting mean execution times; in this paper we investigate how NESTOR primitives behave with specific digital archives and how efficiently they answer common and frequent archival operations.

3.    USE CASES
   We present three user-oriented use cases derived from common interaction patterns identified in the archival domain and four interoperability use cases based on the exchange of archival data in distributed environments.

3.1    User-oriented Use Cases

Use Case 1: identifying and selecting relevant material
   This use case is related to the “searching for known material” information seeking activity investigated by Duff and Johnson in [5]. This activity may be performed by researchers at the beginning of a project to establish a context and detect relevant information and it may be re-iterated several times to “reevaluate information that has suddenly gained new significance” [5]. Such activities can be associated with the top-down pattern of interaction identified by Freund and Toms in [11] where the users “start at the highest level [of an archival description], gain background and context, and work down to the most specific level of detail”.
   In Figure 3 we can see a graphical representation of this use case. We consider an archival system answering a user query that, starting from a given context node, requires a list of archival records to be returned. From this list the user then selects the description of, say, sub-fonds C; in this case two frequent queries to be answered are: to return the sub-divisions (series D, series E, series F, unit G, unit H, unit I and unit L) which are part of this sub-fonds – i.e., a structural query – and to return all the records (the actual records or their descriptions contained by the three series and four units which are children of sub-fonds C) associated with this sub-fonds – i.e., a content query.
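The structural and content queries of Use Case 1 can be sketched with three dictionaries in the spirit of the NESTOR data structures of Section 2.3 (materialized sets, direct subsets, supersets). What follows is a minimal Python sketch under stated assumptions, not the authors' Java implementation: the names `DIRECT_SUBSETS`, `OWN_RECORDS`, `build_index`, `children`, `parent`, `ancestors`, `descendants` and `records` are hypothetical, sub-fonds C is given the sub-divisions listed in the text, and the placement of the units under Series F as well as the record numbers attached to each division are assumed purely for illustration.

```python
# Toy archive in the spirit of the NS-M: every archival division is a set.
# ASSUMPTION: the hierarchy below and the record numbers attached to each
# division are illustrative only; they are not taken from the paper's data.
DIRECT_SUBSETS = {
    "Fonds A": ["Sub-fonds B", "Sub-fonds C"],
    "Sub-fonds B": [],
    "Sub-fonds C": ["Series D", "Series E", "Series F"],
    "Series D": [],
    "Series E": [],
    "Series F": ["Unit G", "Unit H", "Unit I", "Unit L"],
    "Unit G": [],
    "Unit H": [],
    "Unit I": [],
    "Unit L": [],
}
OWN_RECORDS = {
    "Fonds A": {1}, "Sub-fonds B": {13}, "Sub-fonds C": {2},
    "Series D": {11, 12}, "Series E": {10}, "Series F": {3},
    "Unit G": {4, 5}, "Unit H": {6}, "Unit I": {7, 8}, "Unit L": {9},
}


def build_index(direct_subsets, own_records):
    """Precompute the supersets and the materialized set of every division."""
    supersets, materialized = {}, {}

    def visit(name, ancestors):
        supersets[name] = ancestors  # nearest superset first
        materialized[name] = set(own_records.get(name, ()))
        for child in direct_subsets[name]:
            visit(child, [name] + ancestors)
            materialized[name] |= materialized[child]  # nested-set inclusion

    nested = {c for subs in direct_subsets.values() for c in subs}
    for name in direct_subsets:
        if name not in nested:  # a root division such as the fonds
            visit(name, [])
    return supersets, materialized


SUPERSETS, MATERIALIZED = build_index(DIRECT_SUBSETS, OWN_RECORDS)


def children(name):
    return list(DIRECT_SUBSETS[name])


def parent(name):
    return SUPERSETS[name][0] if SUPERSETS[name] else None


def ancestors(name):
    return list(SUPERSETS[name])


def descendants(name):
    # Membership test on the supersets dictionary; no tree walk at query time.
    return sorted(s for s, sups in SUPERSETS.items() if name in sups)


def records(name):
    return sorted(MATERIALIZED[name])


print(descendants("Sub-fonds C"))  # structural query of Use Case 1
print(records("Sub-fonds C"))      # content query of Use Case 1
```

Because the supersets and the materialized sets are computed once when the index is built, both queries reduce to dictionary lookups and set membership tests at query time, which is the intuition behind replacing hierarchy navigation with set operations.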
Use-case 1: Identifying and selecting relevant material
Structural Operation: What are the sub-divisions composing Sub-Fonds C?
Content Operation: Which records belong to Sub-Fonds C?

(a) Tree. Structural expression: /fondsA/subfondsC/descendant-or-self::*. Content expression: /fondsA/subfondsC/descendant-or-self::*/text().
(b) Nested Sets Model. Descendants structural operation: get all the subsets of sub-fonds C. Descendants content operation: get all the elements belonging to sub-fonds C.
(c) Inverse Nested Sets Model. Descendants structural operation: get all the supersets of sub-fonds C. Descendants content operation: get all the elements belonging to sub-fonds C and its supersets.

                                                                       Figure 3: Use-case 1: Identifying and selecting relevant material.
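
The set-wise primitives of panels (b) and (c) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the hierarchy is a hypothetical toy, and each division is encoded as a set of division labels (itself plus its descendants for the NS-M, itself plus its ancestors for the INS-M), so that children are subsets of their parent in the NS-M and supersets of it in the INS-M. How records are attached to the sets is left out.

```python
# Hypothetical hierarchy (child -> parent); it only loosely echoes Figure 3.
PARENT = {
    "fondsA": None,
    "subfondsB": "fondsA",
    "subfondsC": "fondsA",
    "seriesE": "subfondsC",
    "seriesF": "subfondsC",
    "unitI": "seriesF",
    "unitL": "seriesF",
}


def ancestors_or_self(node):
    """Walk the parent links from a node up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def build_nsm():
    """NS-M encoding (assumed): a division's set holds itself and its descendants."""
    sets = {n: set() for n in PARENT}
    for n in PARENT:
        for a in ancestors_or_self(n):
            sets[a].add(n)
    return {n: frozenset(s) for n, s in sets.items()}


def build_insm():
    """INS-M encoding (assumed): a division's set holds itself and its ancestors."""
    return {n: frozenset(ancestors_or_self(n)) for n in PARENT}


def subsets_of(family, name):
    """Names of all sets in the family included in the set called `name`."""
    target = family[name]
    return sorted(n for n, s in family.items() if s <= target)


def supersets_of(family, name):
    """Names of all sets in the family that include the set called `name`."""
    target = family[name]
    return sorted(n for n, s in family.items() if s >= target)


NSM, INSM = build_nsm(), build_insm()

# Descendants of sub-fonds C: subsets in the NS-M, supersets in the INS-M.
nsm_desc = subsets_of(NSM, "subfondsC")
insm_desc = supersets_of(INSM, "subfondsC")

# The same two primitives, swapped, give the ancestors of a division (here unit L).
nsm_anc = supersets_of(NSM, "unitL")
insm_anc = subsets_of(INSM, "unitL")

print(nsm_desc, insm_desc, nsm_anc, insm_anc)
```

In this sketch both models return the same divisions for a given question; they differ only in which primitive (subsets or supersets) they use to answer it. The selected division is included in each result, mirroring the descendant-or-self and ancestor-or-self axes.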


   With a navigational approach based on XPath, the structural query corresponds to the XPath expression /fondsA/subfondsC/descendant-or-self::*, and the content query corresponds to /fondsA/subfondsC/descendant-or-self::*/text(). Both expressions require navigating the archival tree to the sub-fonds C division and then visiting all of its descendants.

   In Figure 3 we see that the NS-M answers the structural query by returning all the subsets of sub-fonds C (i.e., all its descendants), whereas the INS-M answers it by returning all the supersets of the sub-fonds (i.e., its descendants in the inverse model). The content query is answered by NS-M by returning all the elements belonging to sub-fonds C, whereas INS-M has to return the union of all the elements belonging to sub-fonds C and its supersets. The NS-M and the INS-M thus answer the queries by exploiting two different primitives: the first is based on the subsets of a set, whereas the second is based on its supersets. In NS-M the descendants of an archival node, say sub-fonds C, are the subsets of the set representing sub-fonds C, whereas in INS-M the descendants are the supersets of the given set.

Use Case 2: building contextual knowledge

“Building context is the sine qua non of historical research” [5] and one of the main functions of archives. As we described above, the context of an archival record is required to disclose its full informational power; thus, reconstructing the knowledge of a record or of an archival division is one of the most common and important operations an archival system has to provide. This operation can be associated with the bottom-up pattern of interaction also identified by [11], where the users “start at the most detailed level seeking specific information, and then move back to the higher levels to make sense of the information and place it in context if necessary”.

   Figure 4 presents the operations required to “build contextual knowledge” of an archival description. To better guide the user when exploring the archive, the more accurate the contextual information returned is, the better; indeed, if we return the whole archive to the user then s/he might be disoriented by the large amount of heterogeneous information [22]. To address this aspect we need to return to the user all and only the archival divisions from the selected unit up to the root.

   If we consider the case presented in Figure 4, where we need to reconstruct the context of “Unit L”, we can see that a structural query needs to return all the archival divisions up to the root (i.e., the ancestors of unit L, which are series F, sub-fonds C and fonds A), and the content query returns all the records or descriptions contained by these divisions.

   With an XPath-based approach, the structural query (e.g., /fondsA/subfondsC/seriesF/unitL/ancestor-or-self::*) requires navigating the archival tree from the leaf “unit L” up to the root; the output of this query is a sub-tree with the same root as the original tree, but containing only those nodes on the path between “Fonds A” and the leaf “unit L”. The content query (/fondsA/subfondsC/seriesF/unitL/ancestor-or-self::*/text()) performs the same operation but selects only the data nodes, which are then returned to the user.

   As shown in Figure 4, the NS-M answers the query about the context by exploiting a set-wise primitive which returns all the supersets of the selected division, whereas the INS-M does so by returning all its subsets. This operation also has an element-wise counterpart answering the content query: in this case, NS-M returns all the elements belonging to the union of the supersets of the selected unit, whereas the INS-M simply returns the elements belonging to the set of the unit.

Use Case 3: seeking unknown archival material

This use case is related to the “becoming oriented to a new archive or collection” information seeking activities investigated in [5]. It analyses a common scenario where users do not have a clear idea of what they are looking for and may proceed systematically from one archival division to another. This use case is also related to the two previous ones because, among other operations, it may require analyzing the descendants of a given archival division or record as well as climbing up the hierarchy. Indeed, we can see this use case as a combination of the top-down and the bottom-up patterns, and it can be associated with the “systematic interrogation” interaction [11], where the users “develop hypotheses
[Figure 4 content] Use-case 2: Building contextual knowledge
Structural Operation: What is the context of unit L?
Content Operation: Which records are related to record 9?

Tree. Structural expression: /fondsA/subfondsC/seriesF/unitL/ancestor-or-self::*. Content expression: /fondsA/subfondsC/seriesF/unitL/ancestor-or-self::*/text().
Nested Sets Model. Ancestors structural operation: get all the supersets of unit L. Ancestors content operation: get all the elements belonging to unit L and its supersets.
Inverse Nested Sets Model. Ancestors structural operation: get all the subsets of unit L. Ancestors content operation: get all the elements belonging to unit L.
                                      Archival record 5
                                                                     Archival record 9


                                                                       (a) Tree                                                                                               (b) Nested Sets Model                                                                     (c) Inverse Nested Sets Model



                                                                                     Figure 4: Use-case 2: Building Contextual Knowledge.
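The ancestor operations summarized in Figure 4 can be illustrated with a minimal Python sketch. It is not the NESTOR implementation (DDS, IDS or HDS): the two dictionaries, the function names and the assignment of records to divisions are assumptions made for illustration, with unit G added as a sibling of unit L. For the NS-M content operation the sketch reads "belonging to" as "directly attached to", which is also an assumption.

```python
# Toy families of sets. The record-to-division assignment is an assumption
# read off Figure 4, with unit G added as a sibling of unit L.
NS_M = {  # each division holds its own records and those of its descendants
    "fondsA": {1, 2, 3, 4, 5, 6, 9},
    "subfondsC": {2, 3, 4, 5, 6, 9},
    "seriesF": {4, 5, 6, 9},
    "unitG": {6},
    "unitL": {9},
}
INS_M = {  # each division holds its own records and those of its ancestors
    "fondsA": {1},
    "subfondsC": {1, 2, 3},
    "seriesF": {1, 2, 3, 4, 5},
    "unitG": {1, 2, 3, 4, 5, 6},
    "unitL": {1, 2, 3, 4, 5, 9},
}


def supersets(family, name):
    """Names of all sets that properly include the set called `name`."""
    return {other for other, members in family.items() if members > family[name]}


def subsets(family, name):
    """Names of all sets properly included in the set called `name`."""
    return {other for other, members in family.items() if members < family[name]}


def directly_attached(family, name):
    """Elements of `name` that appear in none of its proper subsets."""
    members = set(family[name])
    for other in subsets(family, name):
        members -= family[other]
    return members


def ancestors_structural(family, name, inverse=False):
    """NS-M: supersets of the division; INS-M (inverse=True): its subsets."""
    return subsets(family, name) if inverse else supersets(family, name)


def ancestors_content(family, name, inverse=False):
    """Records of the division and of its ancestors."""
    if inverse:
        # INS-M: the set of the division already includes its ancestors' records.
        return set(family[name])
    # NS-M: collect the records directly attached to the division and its supersets.
    result = directly_attached(family, name)
    for other in supersets(family, name):
        result |= directly_attached(family, other)
    return result
```

In this toy encoding both models return fonds A, sub-fonds C and series F as the ancestors of unit L, and neither returns the record of the sibling unit G. The scans over the whole family are only for brevity; they say nothing about the cost of the actual NESTOR data structures.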

[Figure 5 panels: (a) Tree; (b) Nested Sets Model; (c) Inverse Nested Sets Model.
 Use-case 3: Seeking unknown archival material.
 Structural Operation: Which divisions are related to unit L?
 Content Operation: Which records are related to unit L?
 (a) Structural expressions: /fondsA/subfondsC/seriesF/unitG/parent::*
                             /fondsA/subfondsC/seriesF/child::*
     Content expressions:    /fondsA/subfondsC/seriesF/unitG/parent::*/text()
                             /fondsA/subfondsC/seriesF/child::*/text()
 (b) Parent and Children operations, set-wise: get all the subsets of the superset of unit L;
     element-wise: get all the elements of the subsets of the superset of unit L.
 (c) Parent and Children operations, set-wise: get all the supersets of the subsets of unit L;
     element-wise: get all the elements of the supersets of the subset of unit L.]

Figure 5: Use-case 3: Seeking unknown archival material.
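The parent-and-children operations of Figure 5 can be sketched in Python as follows. This is a hedged illustration rather than the NESTOR implementation: the toy families, the helper names and the restriction to the sub-fonds C branch are assumptions. Only the set-wise primitive is shown for the INS-M.

```python
# Toy families restricted to the sub-fonds C branch of Figure 5; the
# record-to-division assignment is an assumption made for illustration.
NS_M = {  # each division holds its own records and those of its descendants
    "fondsA": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
    "subfondsC": {2, 3, 4, 5, 6, 7, 8, 9, 10},
    "seriesE": {10},
    "seriesF": {4, 5, 6, 7, 8, 9},
    "unitG": {6}, "unitH": {7}, "unitI": {8}, "unitL": {9},
}
INS_M = {  # each division holds its own records and those of its ancestors
    "fondsA": {1},
    "subfondsC": {1, 2, 3},
    "seriesE": {1, 2, 3, 10},
    "seriesF": {1, 2, 3, 4, 5},
    "unitG": {1, 2, 3, 4, 5, 6}, "unitH": {1, 2, 3, 4, 5, 7},
    "unitI": {1, 2, 3, 4, 5, 8}, "unitL": {1, 2, 3, 4, 5, 9},
}


def direct_supersets(family, name):
    """Proper supersets of `name` with no other proper superset in between."""
    candidates = {n for n, s in family.items() if s > family[name]}
    return {n for n in candidates
            if not any(family[m] < family[n] for m in candidates)}


def direct_subsets(family, name):
    """Proper subsets of `name` with no other proper subset in between."""
    candidates = {n for n, s in family.items() if s < family[name]}
    return {n for n in candidates
            if not any(family[m] > family[n] for m in candidates)}


def parent_and_children(family, name, inverse=False):
    """Set-wise primitive: the children of the parent of `name`.

    NS-M: direct subsets of the direct superset.
    INS-M (inverse=True): direct supersets of the direct subset.
    """
    up, down = ((direct_subsets, direct_supersets) if inverse
                else (direct_supersets, direct_subsets))
    result = set()
    for parent in up(family, name):
        result |= down(family, parent)
    return result


def elements_of(family, names):
    """Element-wise primitive: all elements belonging to the given sets."""
    result = set()
    for n in names:
        result |= family[n]
    return result
```

For unit L, both models return units G, H, I and L, and the NS-M element-wise primitive returns records 6 to 9. No child node is visited one by one; the answer comes from comparing sets.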


as to where in the finding aids structure the information is most likely to be and check each
one in turn”.
   In Figure 5 we show this use case, where the user selects an archival division or a record and
then asks for all the archival divisions (structural or set-wise) or all the records (content or
element-wise) at the same level as the selected element (e.g., the siblings of this element). For
instance, if the user selects one of the record descriptions represented by “Unit L” in the
figure, this operation allows her/him to retrieve all the other descriptions connected to it
(e.g., all the sibling units of “Unit L” or the elements belonging to them).
   To answer this query, from both the structural and the content viewpoints, the navigational
approach requires two XPath expressions: the first returns the parent node of the given node and
the second, starting from that parent, returns all of its children. Note that to do this,
navigational approaches need to visit each child node; thus, the higher the number of children,
the higher the cost of this operation.
   The NS-M answers the query with a set-wise primitive by returning all the direct subsets (i.e.,
the children) of the superset (i.e., the parent) to which the selected unit belongs; as usual, the
INS-M reverses this logic and answers by returning all the direct supersets of the subset to which
the selected unit belongs. The element-wise primitive takes the sets output by the set-wise one
and then returns all the elements belonging to them.

3.2  Interoperability-oriented Use Cases
   As described above and reported in [15], digital finding aids encoded with the EAD standard
represent a barrier to the very interoperability this standard aims to enable. Indeed, as we see
below, with EAD there are several OAI-PMH functions which cannot be used by archival systems. On
the other hand, NESTOR set-based operations can be straightforwardly employed by archival systems
to use all OAI-PMH functionalities with digital finding aids [8].
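As a rough illustration of how set-based operations could back OAI-PMH-style requests, the following Python sketch expresses three of the requests discussed in this section (all records, a sub-hierarchy, the set structure) with the NS-M descendant operations. It covers the NS-M only. The toy family and the function names `get_records`, `get_sub_hierarchy` and `list_sets` are hypothetical and are not part of OAI-PMH or of NESTOR.

```python
# Toy NS-M family: each division holds its own records and those of its
# descendants. Divisions and records are assumptions made for illustration.
NS_M = {
    "fondsA": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
    "subfondsC": {2, 3, 4, 5, 6, 7, 8, 9, 10},
    "seriesE": {10},
    "seriesF": {4, 5, 6, 7, 8, 9},
    "unitG": {6}, "unitH": {7}, "unitI": {8}, "unitL": {9},
}


def descendants_structural(family, name):
    """NS-M descendant structural operation: all proper subsets of `name`."""
    return {other for other, members in family.items() if members < family[name]}


def descendants_content(family, name):
    """NS-M descendant content operation: the elements of the set itself."""
    return set(family[name])


def get_records(family, root="fondsA"):
    """All records of the archive: the content of the root set."""
    return descendants_content(family, root)


def get_sub_hierarchy(family, division):
    """Records of the sub-hierarchy rooted in `division`."""
    return descendants_content(family, division)


def list_sets(family, root="fondsA"):
    """Structure only: the divisions included in the root, without records."""
    return descendants_structural(family, root)
```

The point of the sketch is that the structural request returns division names without touching any record, while the content requests are a single set lookup.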
Use Case 4: Get Records

This use case is based on a common OAI-PMH request where a service provider requests all the
records belonging to an archive. This use case can also be addressed by navigational approaches,
simply by exchanging the whole EAD file via OAI-PMH.
   NESTOR addresses this case by relying on the descendant content operation shown in Figure 3
with a slight variation; indeed, in the figure we ask for all the descendants of sub-fonds C,
whereas in this case we are asking the NS-M to return the set representing “Fonds A”, which
contains all the records in the archive, and the INS-M to return the union of all records
belonging to the set “Fonds A” and its supersets.

Use Case 5: Get Sub-hierarchy

This use case is a specialization of the previous one where the service provider requests only
those records belonging to the sub-hierarchy rooted in a given archival division. Navigational
approaches do not apply to this case, whereas NESTOR can address it by means of the descendant
content operation as shown in Figure 3.

Use Case 6: Get Context

In this case the service provider requests all the records belonging to a specific division, say
“Unit L”, and to all the related divisions up to the root as shown in Figure 4.
   As in the previous case, navigational approaches do not apply to this case, whereas NESTOR
addresses it by employing the ancestor content primitive, which for the NS-M returns the union of
all the records belonging to “Unit L” and its supersets and for the INS-M returns all the elements
belonging to “Unit L”.

Use Case 7: List Sets

This use case is related to the “listSets” OAI-PMH verb “used to retrieve the set structure of a
repository” and allows the service provider to know the structure of a local repository in
advance.
   This request cannot be answered by an XPath expression because it is not possible to extract
only structural information while filtering out all data nodes; moreover, the OAI-PMH set-based
organization of metadata does not apply to EAD. On the other hand, answering the “listSets” verb
is natural for NESTOR because it retains the structure by exploiting inclusion relationships
between sets. Therefore, it answers this request by employing the descendant structure operation
as shown in Figure 3.

4.     VALIDATION
   We proposed three different instantiations of NESTOR according to three alternative data
structures, namely DDS, IDS and HDS. In order to compare the query operations defined on these
data structures with currently adopted solutions for operating on digital archives, we selected
two EAD collections that provide us with real-world archival data: the National Archives of the
Netherlands⁶ and the Library of Congress finding aids.
   We selected ten EAD files taken from these collections, representing a wide variety of
archives with different characteristics that pose key challenges for archival systems. The
statistics about these files are reported in Table 1.

Table 1: Statistics of ten selected EAD files.

          Size (MB)   # nodes   depth   max fan-out   average fan-out
EAD-01        0.368     7,316      10           823              4.33
EAD-02        1.853    21,355      10         1,610              1.62
EAD-03        3.131    42,123      13         2,453              1.49
EAD-04        3.866    75,094       9        10,271              1.73
EAD-05        4.043    51,946      12         1,320              1.80
EAD-06        5.310    73,372      12         3,663              1.87
EAD-07        6.017    57,362      14           565              1.91
EAD-08        9.242   103,703      18           340              1.62
EAD-09        9.746   160,031      14         8,930              2.01
EAD-10       15.512   188,862      17           696              1.62

   DDS, IDS and HDS are compared to widely adopted, ready-to-use solutions based on XPath for
operating on the structure and the content of EAD files: Xalan, Jaxen and JXPath, which represent
the state-of-the-art solutions for dealing with EAD files⁷.
   The main characteristic of EAD files that challenges XPath libraries is the number of nodes in
each file; the selected files are of increasing size to show that the performance of
navigational solutions depends on the number of nodes and the overall size of the EAD files,
whereas this does not apply to the set-based operations implemented by NESTOR. Indeed, in
Figure 6 we can see that all the XPath libraries answer in linear time with respect to the size of
the EAD file because they need to navigate big hierarchies by visiting a great number of nodes. On
the other hand, we can see that IDS answers the descendant structural operation in constant time
for all the EAD files and it is five orders of magnitude faster than XPath-based solutions. DDS
and HDS show some dependence on the size of the EAD file; indeed, they need to perform some set
operations (more nodes mean more operations) that require some time, even though for the
descendant content operation they are several orders of magnitude more efficient than navigating
the archival hierarchy. Overall, IDS is the best solution for addressing use cases 1 and 7,
whereas DDS is the best for use cases 1, 4 and 5.
   It is interesting to note that for addressing use cases 1, 4 and 5, XPath-based libraries are
slower for the EAD-04 file, which is the one with the highest number of children (i.e., 10,271),
followed by EAD-09, which also has a high number of children (i.e., 8,930). These two files are
challenging for all the use cases requiring the descendants or the children of a node, such as
use cases 1, 3 and 5. Navigational solutions are particularly challenged by this case, as we can
see in Figure 6 for the content operation and in Figure 8. On the other hand, we can see that IDS
and HDS are not affected by the high max fan-out of these files, given that they can answer
without visiting the high number of child nodes, just by returning a set or by performing basic
set operations. DDS requires more set operations than the other two set-based solutions; even
though in most cases it is more efficient than navigational solutions, it is still less efficient
than IDS and HDS, which are extremely efficient for these cases. The overall performance reported
in Figure 8, with a particular focus on EAD-04 and EAD-09, shows that set-based solutions are
particularly well-suited to address the operation employed by use case 3.

⁶ http://www.nationaalarchief.nl/
⁷ We ensure a fair comparison because all the tested solutions are implemented in Java, work in
  central memory and are tested on the same machine.
   Lastly, use case 2 requires climbing up the archival hierarchy from a given entry point. We
considered EAD files with variable depth (from 9 to 17) and we validated the ancestor operations
using the deepest node in each hierarchy as entry point, which represents the worst-case scenario
for any archival system. From a performance viewpoint, in Figure 7 we can appreciate the
difference between the NESTOR set-based approaches and the XPath navigational approaches. Indeed,
NESTOR-based solutions behave consistently for all the tested EAD files and do not depend on the
depth and size of EAD files. On the other hand, the XPath libraries behave differently from file
to file, showing a dependence on the number of nodes, fan-out and depth of the files; for
instance, JXPath behaves less efficiently when EAD files have a high max fan-out (EAD-04 and
EAD-09), whereas Xalan performance worsens as the number of nodes increases.

5.   CONCLUSIONS
   In this paper we identified and described the barriers preventing efficient access to archival
data. We described the main drawbacks of EAD and we showed how it impairs smooth and efficient
access to archival descriptions and fails to satisfy several interoperability requirements.
   We analyzed the role of the NESTOR model in the context of digital archives and described its
main advantages with respect to state-of-the-art navigational solutions. We have seen that the
NESTOR set-based approach represents a paradigm shift in the access to XML files which is
well-suited to enable interaction and interoperability functionalities in the archival context.
   We identified and described seven use cases highlighting the key challenges archival systems
have to address in order to deal with common user interaction patterns and to satisfy
interoperability requirements. In this frame of reference, we compared and discussed strengths and
limitations of navigational solutions with respect to NESTOR set-based ones.
   We have seen that NESTOR is a model of access to archival resources that allows us to better
address the identified needs from both the user and the interoperability viewpoints. From a
quantitative standpoint, the experimental validation confirms that NESTOR-based solutions
consistently outperform state-of-the-art solutions; moreover, we have seen that NESTOR-based
solutions are less dependent, or not dependent at all, on the hierarchical structure of archives
than navigational ones.

References

 [1] J. C. Chapman. Observing Users: An Empirical Analysis of User Interaction with Online Finding
     Aids. J. of Arch. Org., 8(1):4–30, 2010.
 [2] J. G. Daines and C. L. Nimer. Re-Imagining Archival Display: Creating User-Friendly Finding
     Aids. J. of Arch. Org., 9(1):4–31, 2011.
 [3] M. G. Daniels and E. Yakel. Seek and You May Find: Successful Search in Online Finding Aid
     Systems. American Archivist, 73:535–568, 2010.
 [4] E. Discovery, S. Shaw, and P. Reynolds. Creating the Next Generation of Archival Finding
     Aids. D-Lib Mag., 13(5/6), 2007.
 [5] M. W. Duff and C. A. Johnson. Accidentally Found on Purpose: Information-Seeking Behavior of
     Historians in Archives. The Library Quarterly, 72(4):472–496, 2002.
 [6] L. Duranti. Diplomatics: New Uses for an Old Science. Society of Amer. Arch. and Association
     of Canadian Arch., 1998.
 [7] M. Y. Eidson. Describing Anything That Walks: The Problem Behind the Problem of EAD. Journal
     of Archival Organization, 1(4):5–28, 2002.
 [8] N. Ferro and G. Silvello. NESTOR: A Formal Model for Digital Archives. Inf. Proc. Manage.,
     49(6):1206–1240, 2013.
 [9] N. Ferro and G. Silvello. Descendants, Ancestors, Children and Parent: A Set-Based Approach
     to Efficiently Address XPath Primitives. Inf. Proc. Manage., 52(3):399–429, 2016.
[10] L. Francisco-Revilla, C. B. Trace, H. Li, and S. A. Buchanan. Encoded Archival Description:
     Data Quality and Analysis. Proc. American Society for Inf. Science and Tech., 51(1):1–10,
     2014.
[11] L. Freund and E. G. Toms. Interacting with Archival Finding Aids. JASIST, 67(4):994–1008,
     2015.
[12] I. Huvila. Participatory archive: towards decentralised curation, radical user orientation,
     and broader contextualisation of records management. Archival Science, 8(1):15–36, 2008.
[13] N. A. Khan. Emerging Trends in OAI-PMH Application. In Design, Development, and Management
     of Resources for Digital Library Services, pages 147–159, 2013.
[14] D. V. Pitti. Encoded Archival Description. An Introduction and Overview. D-Lib Mag., 5(11),
     1999.
[15] C. J. Prom. Does EAD Play Well with Other Metadata Standards? Searching and Retrieving EAD
     Using the OAI Protocols. J. of Arch. Org., 1(3):51–72, 2002.
[16] C. J. Prom. User Interactions with Electronic Finding Aids in a Controlled Setting. The
     American Archivist, 67(2):234–268, 2004.
[17] C. J. Prom and T. G. Habing. Using the Open Archives Initiative Protocols with EAD. In Proc.
     2nd Joint Conference on Digital Libraries, pages 171–180. ACM Press, 2002.
[18] J. Roth. Serving Up EAD: An Exploratory Study on the Deployment and Utilization of Encoded
     Archival Description Finding Aids. The Amer. Arch., 64(2):214–237, 2001.
[19] W. Scheir. First Entry: Report on a Qualitative Exploratory Study of Novice User Experience
     with Online Finding Aids. J. of Arch. Org., 3(4):49–85, 2006.
[20] A. Sexton, C. Turner, G. Yeo, and S. Hockey. Understanding users: a prerequisite for
     developing new technologies. Journal of the Society of Archivists, 25(1):33–49, 2004.
[21] S. L. Shreeves, T. G. Habing, K. Hagedorn, and J. A. Young. Current Developments and Future
     Trends for the OAI Protocol for Metadata Harvesting. Library Trends, 53(4):576–589, Spring
     2005.
[22] S. Yako. It’s Complicated: Barriers to EAD Implementation. American Archivist,
     71(2):456–475, 2008.
[23] J. Zhang. Archival Representation in the Digital Age. J. of Arch. Org., 10(1):45–68, 2012.
[24] X. Zhou. Examining Search Functions of EAD Finding Aids Web Sites. J. of Arch. Org.,
     4(3/4):99–118, 2008.
[Plots: two panels, each showing execution times (msec, log scale, from 10^-5 to 10^5) over the EAD files EAD01 to EAD10 for DDS, IDS, HDS, Xalan, Jaxen, and JXpath. Left panel: Descendant Structural Operation (use-cases 1 and 7). Right panel: Descendant Content Operation (use-cases 1, 4 and 5).]

Figure 6: Execution times of the descendant structural and content operations.

[Plots: two panels, each showing execution times (msec, log scale, from 10^-5 to 10^5) over the EAD files EAD01 to EAD10 for DDS, IDS, HDS, Xalan, Jaxen, and JXpath. Left panel: Ancestor Structural Operation (use-case 2). Right panel: Ancestor Content Operation (use-cases 2 and 6).]

 Figure 7: Execution times of the ancestor structural and content operations.

[Plots for use-case 3: four panels, each showing execution times (msec, log scale, from 10^-5 to 10^5) over the EAD files EAD01 to EAD10 for DDS, IDS, HDS, Xalan, Jaxen, and JXpath. Top row: Parent Structural Operation (left) and Parent Content Operation (right). Bottom row: Children Structural Operation (left) and Children Content Operation (right).]

 Figure 8: Execution times of the parent and children structural operations.