An HTTP-Based Versioning Mechanism for Linked Data

Herbert Van de Sompel, Los Alamos National Laboratory, NM, USA, herbertv@lanl.gov
Robert Sanderson, Los Alamos National Laboratory, NM, USA, rsanderson@lanl.gov
Michael L. Nelson, Old Dominion University, Norfolk, VA, USA, mln@cs.odu.edu
Lyudmila L. Balakireva, Los Alamos National Laboratory, NM, USA, ludab@lanl.gov
Harihar Shankar, Los Alamos National Laboratory, NM, USA, harihar@lanl.gov
Scott Ainsworth, Old Dominion University, Norfolk, VA, USA, sainswor@cs.odu.edu

ABSTRACT
Dereferencing a URI returns a representation of the current state of the resource identified by that URI. But on the Web, representations of prior states of a resource are also available, for example, as resource versions in Content Management Systems or archival resources in Web Archives such as the Internet Archive. This paper introduces a resource versioning mechanism that is fully based on HTTP and uses datetime as a global version indicator. The approach allows “follow your nose” style navigation both from the current time-generic resource to associated time-specific version resources as well as among version resources. The proposed versioning mechanism is congruent with the Architecture of the World Wide Web, and is based on the Memento framework that extends HTTP with transparent content negotiation in the datetime dimension. The paper shows how the versioning approach applies to Linked Data, and by means of a demonstrator built for DBpedia, it also illustrates how it can be used to conduct a time-series analysis across versions of Linked Data descriptions.

Categories and Subject Descriptors
H.3.5 [Information Storage and Retrieval]: Online Information Services

General Terms
Design, Experimentation, Standardization

Keywords
Web Architecture, HTTP, Linked Data, Resource Versioning, Web Archiving, Temporal Applications

Copyright is held by the author/owner(s).
LDOW2010, April 27, 2010, Raleigh, USA.

1. INTRODUCTION
The Architecture of the World Wide Web [11] states that dereferencing a URI yields a representation of the (current) state of the resource identified by that URI, and highlights the impracticality of keeping prior states accessible at their own distinct URIs:

    Resource state may evolve over time. Requiring a URI owner to publish a new URI for each change in resource state would lead to a significant number of broken references. For robustness, Web architecture promotes independence between an identifier and the state of the identified resource.

Nevertheless, use cases abound that require the availability of (representations of) distinct prior states of resources. Resource versioning is of crucial importance in areas as diverse as community-driven content creation, open government, and scientific communication. Also, as more data becomes available in the Linked Data cloud, the need to version it will increase, if only to allow efficient update of stores that leverage the data, and to trace its provenance. Web archives and content management systems possess significant amounts of prior versions of resources, but these prior versions are largely disconnected from current versions and discoverable only in an ad-hoc manner. Given this state of Web resource versioning, we consider these challenges:

1. Given the current version of a resource, how can “follow your nose” style navigation to prior versions of the resource be achieved?

2. Given any version of a resource and a particular timestamp, how can “follow your nose” style navigation towards another version that matches the timestamp be achieved?

This paper is concerned with versioning mechanisms that are machine-actionable, have global scope, and are independent of media-type. Hence, approaches that are mainly beneficial to human users, such as untyped hyperlinks in HTML with anchor text that provides navigational guidance (e.g. “previous/next version”), or version semantics expressed in metadata-carrying URIs [13], are not considered. Also, mechanisms that have version indicators specific to a certain server, such as the deprecated “Content-Version” header field from RFC 2068 [6], are not considered. Similarly, versioning mechanisms that are specific to media-types, such as the link element combined with the “prev” and “next” relationships as used in HTML [17] or Atom [15], are not considered.

Our contributions are a resource versioning mechanism based on the global notion of time and an HTTP-based mechanism to navigate across versions. Furthermore, we demonstrate how these contributions can be applied for time-series analysis across resource versions, and illustrate this using a Linked Data example.

The remainder of this paper is structured as follows: Section 2 discusses and illustrates characteristics of resource versioning approaches; Section 3 provides an introduction to the Memento framework that extends HTTP with transparent datetime content negotiation capabilities; Section 4 shows how the Memento framework suggests an elegant resource versioning approach that is fully based on HTTP; Section 5 shows how the Memento versioning ideas apply to Linked Data; Section 6 describes the demonstrator we built for the DBpedia environment to illustrate how the proposed versioning mechanism can be used to access prior descriptions of DBpedia concepts via their existing DBpedia URIs. In the same section, we also show how this mechanism was used to conduct a time-series analysis of Gross Domestic Product values for several countries across DBpedia versions. Section 7 reviews some related work, and Section 8 holds our conclusions.

2. RESOURCE VERSIONING
This Section introduces core characteristics of versioning approaches, discusses these characteristics for a typical resource versioning approach, and evaluates how that versioning approach can meet the challenges (1) and (2) from the Introduction.

Figure 1: A resource versioning strategy, and its expression in RDF.

2.1 Versioning Characteristics
The following four core characteristics of versioning approaches are considered:

1. Identification: By which means are different versions identified?

2. Versioning Strategy: What is the approach used to assign identifiers to versions, e.g. do new versions receive a new identifier, do they inherit a prior identifier, etc.?

3. Version Relationships: How are version relationships between resources expressed?

4. Version Timestamping: How is the datetime associated with versions conveyed?

2.2 A Typical Resource Versioning Approach
The quote from the Architecture of the World Wide Web implicitly suggests a versioning approach as depicted at the top of Figure 1. This approach is described in terms of the aforementioned core characteristics:

1. Identification: HTTP URIs are used to identify versions of resources; each version has its own URI.

2. Versioning Strategy: A new URI is minted for each new version. When a resource URI-R0 that started its existence at t0 changes state at time t1 in such a way that a use case requires a distinct identity, a new resource with URI-R1 is minted. And, if subsequently at time t2 a change in state of URI-R1 again requires a new identity, a resource URI-R2 is created (top of Figure 1). URI-R0, URI-R1, and URI-R2 co-exist and, in terms of [2] and its associated ontology¹, are time-specific resources. They represent the evolving state of a not explicitly defined, abstract resource and are interlinked by the http://www.w3.org/2006/gen/ont#sameWorkAs property.

3. Version Relationships: Can be made available as RDF metadata published about and linked from the related resources. The common Dublin Core Terms² hasVersion and isVersionOf predicates (bottom of Figure 1) can be used. Alternatively, the machine-processable and media-independent HTTP Link header [14] can be used in combination with the registered prev and next relationships to express version-relationship semantics. In both cases, it should be noted that the semantics of the relationships are not restricted to time-based version relationships: hasVersion is used in cases where “a related resource is a version, edition, or adaptation of the described resource”, whereas next refers to “the next resource in an ordered series of resources”.

4. Version Timestamping: In case of the aforementioned RDF approach, additional triples can be introduced to express version timestamps. For example, at the bottom of Figure 1, the Dublin Core Terms predicates created and modified are used in accordance with the description for time-specific resource given in the aforementioned ontology as being a resource for which “the dates of creation and of last modification are the same”. It is unclear how a version timestamp can appropriately be expressed when using the HTTP Link header: conveying such information is not specified for the prev and next relationships, the HTTP Last-Modified header does not provide reliable version semantics, and use of metadata embedded in the linked resource yields an approach that is dependent on media-type.

¹ Ontology for Relating Generic and Specific Information Resources http://www.w3.org/2006/gen/ont
² http://dublincore.org/documents/dcmi-terms/

The above characterization reveals the technological substrates that are used in the considered versioning approaches: for the RDF approach, URIs, HTTP, RDF, and an appropriate RDF vocabulary; for the HTTP Link approach, URIs, HTTP, HTTP Link, and registered link relationships.
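
As a rough illustration of the RDF approach characterized above, the following Python sketch models URI-R0, URI-R1 and URI-R2 as records that carry a creation timestamp (standing in for the Dublin Core created predicate) and a link to the prior version. The example.org URIs, the dates, and the find_version helper are hypothetical and only serve to show that, under this strategy, locating the version for a given datetime means walking version links one by one.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional, Tuple


@dataclass
class Version:
    uri: str
    created: datetime        # stands in for a dcterms:created triple
    prev: Optional[str]      # stands in for a link to the prior version


# Hypothetical version chain: URI-R0 <- URI-R1 <- URI-R2 (current).
VERSIONS: Dict[str, Version] = {
    "http://example.org/r0": Version("http://example.org/r0", datetime(2009, 1, 1), None),
    "http://example.org/r1": Version("http://example.org/r1", datetime(2009, 6, 1),
                                     "http://example.org/r0"),
    "http://example.org/r2": Version("http://example.org/r2", datetime(2010, 1, 1),
                                     "http://example.org/r1"),
}


def find_version(start_uri: str, wanted: datetime) -> Tuple[Optional[str], int]:
    """Walk 'prev' links from start_uri until a version created at or before
    'wanted' is found. Returns (uri or None, number of links followed)."""
    uri: Optional[str] = start_uri
    hops = 0
    while uri is not None:
        version = VERSIONS[uri]
        if version.created <= wanted:
            return version.uri, hops
        uri = version.prev
        hops += 1
    return None, hops


if __name__ == "__main__":
    print(find_version("http://example.org/r2", datetime(2009, 7, 15)))
```

In a real deployment each step of the loop would be a separate dereference and RDF parse, so the cost grows with the distance between the starting version and the wanted one.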
For both approaches, applications such as browser plug-ins can be created to support the navigation described in question (1) above, whereby the starting point would be the current resource URI-R2. Also (2) can be achieved for the RDF approach, although a processor would need to traverse versions until a matching datetime is found. Lacking appropriate version datetime information, (2) cannot reliably be achieved in case of the HTTP Link header. To summarize, both (1) and (2) can be achieved for the RDF approach, however: (a) two technological substrates, HTTP and RDF, must be combined; (b) version datetime cannot be used as a primary entry point; rather, resource versions must be traversed until a version with a matching datetime is found; (c) the common predicates used to express version relationships do not necessarily imply time-based version relations.

3. THE MEMENTO FRAMEWORK
The basic motivation for the Memento³ work [19] is achieving a tighter integration between the current and the past Web. Remnants of the past Web exist both in version-aware servers such as Content Management Systems (CMS, e.g. Wikipedia) and Version Control Systems, and in special-purpose Web Archives such as the Internet Archive⁴ and the on-demand WebCite⁵ archive. Whereas a current representation of a resource is available from its URI-R, prior representations, if they exist, are available from distinct resources URI-Mi (i=1..n) that encapsulate the state URI-R had at times ti, with ti prior to the current time. In the Memento framework, the resource that provides the current representation is named the Original Resource, whereas resources that provide prior representations are named Mementos. More formally, a Memento for a resource URI-R (as it existed) at time ti is a resource URI-Mi[URI-R@ti] for which a representation at any moment past its creation time tc is the same as a representation that was available from URI-R at time ti, with tc ≥ ti. Implicit in this definition is the notion that, once created, a Memento always keeps the same representation.

From an HTTP perspective, URI-R and URI-Mi are disconnected in that HTTP provides no means to navigate towards a URI-Mi via its original URI-R. The Memento framework introduces this missing capability as follows (Figure 2):

Figure 2: The Memento Framework.

• Inspired by Transparent Content Negotiation for HTTP (conneg from now on) specified in RFC 2295 [10] that allows HTTP clients to negotiate with HTTP servers in four dimensions (media type, language, character set, compression), Memento introduces conneg in a fifth dimension: datetime. RFC 2295 introduces the notion of a transparently negotiable resource as the resource that is the target of conneg, and variant resources that vary according to the aforementioned negotiable dimensions. Similarly, Memento introduces the notion of a TimeGate URI-G as a resource that supports conneg in the datetime dimension, and Mementos URI-Mi[URI-R@ti] as the resources that vary according to the datetime dimension. In a manner symmetrical to the way RFC 2295 introduces the Accept-Language request header to express the client’s language preferences, and the Content-Language response header to express the language returned by the server, Memento introduces the Accept-Datetime and Content-Datetime headers to express the client’s preferred datetime for a Memento, and the datetime of the Memento returned by its hosting server, respectively. It can be noted that, although RFC 2295 did not specify datetime conneg, its desirability is at least suggested by Tim Berners-Lee’s Generic Resources Statement [2], as all other dimensions of genericity described in it (language, media-type, target-medium) are covered by RFC 2295.

• In order to support discovery of a TimeGate URI-G for a resource URI-R, a relationship type of timegate is introduced for the HTTP Link response header [14]. In case of servers that have internal versioning/archiving support (such as CMS), a TimeGate URI-G for URI-R can typically be exposed by the server of URI-R itself. In cases where servers rely on third parties for their versioning/archiving (for example by being recurrently crawled by the Internet Archive), URI-R and URI-G will reside on different servers. In addition, in order to allow discovering the Original Resource associated with a Memento, another special-purpose HTTP Link header, this time with a relationship type of original, is introduced.

• Memento also introduces the notion of a TimeBundle resource via which an overview is available of all Mementos that a server hosts for a given (internal or external) URI-R. A TimeBundle is a non-information resource [18] modeled as an ORE Aggregation [12] in which all Aggregated Resources share a temporal relationship with the Original Resource. A TimeBundle is described by a TimeMap, which is a specialization of an ORE Resource Map. A TimeMap lists all URI-Mi for a given URI-R as well as their associated metadata including timestamp. It also lists the Original Resource URI-R and its TimeGate URI-G. Appendix A shows an example RDF/XML TimeMap; other serializations such as Turtle and Atom are possible. Discovery of TimeBundles is supported by the rel value timebundle in the HTTP Link response header.

³ http://mementoweb.org/
⁴ http://archive.org/
⁵ http://webcitation.org/

Three aspects of the Memento architecture ensure that the globally deployed HTTP caching infrastructure can be leveraged. First, the Original Resource URI-R and its TimeGate URI-G are always separate resources: URI-R is a conventional resource and URI-G is dedicated to datetime conneg. This eliminates caching problems that would be caused by transitioning URI-R between non-negotiable and negotiable if URI-R and URI-G were to coincide. Second, the initial Memento architecture [19] required the Original Resource URI-R to 302 redirect to its TimeGate URI-G; as a result, cached versions of URI-R could not be leveraged. By using the timegate HTTP Link for discovery of URI-G, Memento clients work with caches instead of against them. Third, URI-G and URI-M are never the same resource, so the Mementos (URI-M) can be cached as well.

A detailed overview of HTTP request/response scenarios is available in the Memento HTTP Transactions Guide⁶. Here, we highlight certain aspects related to HTTP interactions with the TimeGate URI-G. A choice was made to handle cases in which URI-G is dereferenced without the Accept-Datetime header by issuing a “302 Found” redirect to the most recent Memento, as opposed to offering a list of choices to the client. While a list would be feasible for a top-level resource (say, an HTML page), it would be cumbersome for the potentially many embedded resources (say, the images in the HTML page). URI-G will only return HTTP response code “300 Multiple Choices” if explicitly requested with a “Negotiate: 1.0” request header or when there are multiple Mementos with the same Content-Datetime⁷. URI-G will return HTTP response code “406 Not Acceptable” when the Accept-Datetime is outside of the datetime range of known Mementos. For further technical details about the Memento framework, we refer to the original paper [19], and the more recent overview of the evolved solution [20] that has resulted from feedback to the original ideas provided by both the Linked Data and Web Archiving communities.

Since its publication, Memento has received significant attention. Major Web Archives have started implementing support⁸, and work is ongoing to develop support for common CMS platforms such as MediaWiki and Drupal. Also, the establishment of a Memento-track at the JISC Developer Days (Dev8D)⁹ organized by the UK’s Joint Information Systems Committee is an early indication of interest by both funders and implementers.

As an illustration, Figure 3 shows a Memento HTTP flow whereby a client requests a November 8 2009 version of the Wikipedia page for DJ Shadow, by interacting with its current URI http://en.wikipedia.org/wiki/DJ_Shadow; the client is pointed by http://en.wikipedia.org/wiki/DJ_Shadow to a TimeGate at Wikipedia; via that TimeGate the client successfully retrieves a Memento that meets its datetime preferences (only headers crucial to convey an understanding of datetime conneg are shown). We should point out that Wikipedia has not (yet) implemented such Memento HTTP flows, but a MediaWiki plug-in that adds Memento support is available¹⁰.

Figure 3: Memento HTTP Request/Response Cycle.

In Figure 3, note the use of the HTTP Link header to express the very first and most recent Mementos available from Wikipedia (rel=“first-memento” and rel=“last-memento”, respectively) as well as the Mementos that are closest in time (rel=“prev-memento” and rel=“next-memento”) to the one that is returned. Note also the use of an HTTP Link header to point back to the Original Resource (rel=“original”).

⁶ http://www.mementoweb.org/guide/http/
⁷ This may occur as HTTP only supports second-level time granularity
⁸ See Agenda of First Memento Implementation Meeting at http://mementoweb.org/events/IA201002/
⁹ http://wiki.2010.dev8d.org/w/Talk_6
¹⁰ Memento MediaWiki plug-in http://www.mediawiki.org/wiki/Extension:Memento

4. MEMENTO RESOURCE VERSIONING
The Memento framework suggests a versioning mechanism that is fully based on HTTP (see Figure 4). These are its core characteristics:

• Identification: HTTP URIs are used to identify versions of resources.

• Versioning Strategy: The top of Figure 4 shows URI-R as the resource from which at any point in time the current representation is served, and URI-Mi as resources that provide access to representations that were previously available from URI-R. In terms of [2] and its associated ontology¹¹, URI-R is a time-generic resource, whereas all URI-Mi are time-specific resources. This strategy is different from the one shown in Figure 1, yet aligned with the stable URI principle of Cool URIs [3, 18]: instead of minting a new URI for every new version, keep the URI stable and mint new URIs for old versions. This approach has become rather widespread; for example, http://cnn.com and http://en.wikipedia.org/wiki/DJ_Shadow are such URI-R, whereas http://web.archive.org/web/20010911203610/http://www.cnn.com and http://en.wikipedia.org/w/index.php?title=DJ_Shadow&oldid=337446696 are examples of respective URI-Mi.

• Version Relationships: A timegate HTTP Link header provided in response to GET/HEAD requests issued against the stable URI-R points at a TimeGate. And, an original HTTP Link header provided in response to GET/HEAD requests issued at Mementos URI-Mi points at URI-R of the Original Resource. As described, a TimeGate supports datetime conneg based on the content of the Accept-Datetime header. It is a time-travel resource that acts as a gateway between the time-generic URI-R and its associated time-specific Mementos URI-Mi. The result of the datetime conneg is a Memento that meets the expressed datetime preference. Also, the prev-memento and next-memento relationships may be used in the HTTP Link header to point at Mementos that are adjacent in time to the returned one.

• Version Timestamping: Versioning of URI-R is not required as it always is the current version. Mementos URI-Mi are timestamped by means of the Content-Datetime response header.

¹¹ http://www.w3.org/2006/gen/ont

The technology substrate used by the Memento versioning approach is fully centered on HTTP: URI, HTTP, HTTP Link with to-be-registered link relationships, HTTP datetime conneg. The challenges formulated in the Introduction can be addressed as follows (bottom of Figure 4):

Figure 4: Memento Resource Versioning.

1. Given the current version of a resource, how can “follow your nose” style navigation to prior versions of the resource be achieved? The current version is the stable URI-R. A client application can follow its nose to a TimeGate for URI-R by using the URI that is expressed in the timegate HTTP Link header returned by URI-R. The TimeGate supports datetime conneg, allowing the client to obtain various versions (Mementos) by varying the content of its Accept-Datetime request header. In addition, in cases where the prev-memento and next-memento relationships are available in the Alternates header provided in TimeGate responses, a client can engage in version-to-version navigation with a certainty that the version-relationships are time-based.

2. Given any version of a resource and a particular timestamp, how can “follow your nose” style navigation towards another version that matches the timestamp be achieved? A version URI-Mi provides an original HTTP Link header pointing at the stable URI-R. From there on, this scenario is the same as described in the previous point; the timestamp is used as the content of the Accept-Datetime request header. The Content-Datetime provides the earliest datetime at which the returned version became available; that version was still the then-current one at the datetime that was expressed in the Accept-Datetime header.

5. MEMENTO RESOURCE VERSIONING AND LINKED DATA
Figure 5 shows how Memento integrates in the Linked Data environment. In this case, URI-R is a cool URI for a non-information resource [18], and the current description of URI-R is available at URI-S. A TimeGate URI-G is introduced for URI-R, and to support its discovery a timegate HTTP Link header is provided in responses to GET/HEAD requests to URI-R. When a Linked Data client is in need of prior descriptions of URI-R, it follows its nose to URI-G, where it can use datetime conneg to arrive at a description of URI-R as it existed at some time in the past. Note that the conneg with URI-G can include dimensions other than datetime. The media-type dimension that is commonly used in Linked Data to allow a choice of descriptions expressed in RDF serializations or HTML can also be supported. Similarly, negotiation on language can be supported.

Figure 5: Memento Resource Versioning and Linked Data.

6. THE DBPEDIA DEMONSTRATOR

6.1 Demonstrator Set-Up
We have implemented the architecture depicted in Figure 5 in the DBpedia context. We first downloaded the five prior English-language versions of DBpedia (2.0 through 3.3)¹² in NT format. Using a Python script, the approximately 600 million triples were loaded into a MySQL table (Table 1). Loading took approximately 15 hours, and resulted in a MyISAM table of 81 GB.

Table 1: DBpedia Demonstrator Database Table.

    column     type
    id         integer, auto increment, PK
    subject    varchar(256), not null
    start      datetime
    end        datetime
    triples    blob

For each DBpedia subject URI-R, we exposed a TimeGate to support content negotiation in the datetime and media-type dimensions. For example, our TimeGate for DBpedia’s France resource http://dbpedia.org/resource/France is http://mementoarchive.lanl.gov/dbpedia/timegate/http://dbpedia.org/resource/France. The datetime functionality was implemented by retrieving all distinct start/end combinations for the requested subject URI-R. This approach readily supports providing the first-memento, last-memento, next-memento and prev-memento relationships in the HTTP Link header provided in responses. The returned Memento is the one with the start/end interval that covers the datetime requested via conneg. For conneg requests with a datetime value that is in the range of the current DBpedia version, the TimeGate issues an HTTP 302 redirect to the Original Resource URI-R at dbpedia.org. Mementos are available both in HTML and RDF/XML. For example, the Memento for DBpedia 3.3’s France resource in HTML is http://mementoarchive.lanl.gov/dbpedia/memento/20090701/http://dbpedia.org/page/France.

Colleagues at DBpedia kindly implemented the timegate HTTP Link header pointing at our TimeGates. This required approximately one hour and consisted of adding a stored procedure in the OpenLink Virtuoso engine to add the appropriate HTTP Link header.

Figure 6 shows a Memento HTTP flow whereby a client requests the description of France that was available from DBpedia on March 20, 2008. Only headers crucial to convey an understanding of the conneg are shown.

Figure 6: Memento DBpedia HTTP Request/Response Cycle.

To illustrate full Memento compliance, we also implemented TimeBundle/TimeGate support for our DBpedia version archive. This functionality was not used to achieve the time-series analysis described below. The Appendix shows our TimeMap for the DBpedia resource http://dbpedia.org/resource/France; the content should be self-explanatory.

Although DBpedia currently operates under a regime of recurrent discrete updates, both the proposed Memento approach and our specific database design support a possible future regime in which DBpedia is updated on an ongoing basis. In this case, an archiving mechanism would need to be added to ensure that versions of distinct DBpedia descriptions are pushed/pulled into the version archive as they change.

¹² http://wiki.dbpedia.org/Downloads34

6.2 Time-Series Analysis using Memento Resource Versioning
To illustrate the power of the proposed approach, we implemented a simple time-series analysis using both past and current DBpedia data. We set out to trace the evolution over time of the Gross Domestic Product Per Capita for various countries, leveraging the http://dbpedia.org/property/gdpPppPerCapita property.

The straightforward data-time-traveling algorithm used to construct the time-series data-set is described by the pseudo code below. It must be noted that the actual script must rely on some ad-hoc heuristics to deal with diverging data formats used for GDP values.

    resources := [list of country description TimeGate URIs]
    times := [list of date times, one per version including current]
    prop := "http://dbpedia.org/property/gdpPppPerCapita"
    values := {}

    foreach r in resources:
        values[r] := []
        foreach t in times:
            # datetime and media-type conneg against the TimeGate
            data := fetch(URI-TG/r, Accept-Datetime: t,
                          Accept: "application/rdf+xml")
            graph := parse(data)
            value := graph.sparql(SELECT ?val WHERE { r prop ?val . })
            # ad-hoc clean-up of diverging GDP value formats
            value := normalize(value)
            values[r].push(value)

The collected data were then turned into a chart (Figure 7) using the Google Chart API¹³.

¹³ http://code.google.com/apis/charttools/index.html

Figure 7: A time-series analysis conducted across DBpedia versions using Memento’s HTTP-based versioning approach.

7. RELATED WORK
Little research has explored a protocol-based solution to augment the Web with time travel capabilities. TTApache [5] introduced an ad-hoc RPC-style mechanism to access archived representations given the URI of their original, e.g. “page.html?02-Nov-2009”. This approach reveals the local scope of the problem addressed by TTApache, as opposed to the global perspective taken by the Memento datetime conneg framework. Indeed, the query components are issued against a specific server, and are not maintained when a client moves to another server, as is the case with the Accept-Datetime header of datetime conneg. TTApache also allowed addressing archived representations using version numbers in query components rather than datetimes. This capability is similar to the deprecated “Content-Version” header field from RFC 2068 [6] and other, similar expired proposals (e.g., [16]). Such versioning features have not found wide-spread adoption, presumably because their address space is tied to a specific resource or server, and not universal like datetime. TTApache also provided support for reserved terms as query components such as “page.html?now”. This capability is similar to link relationship types such as “latest-version”, “predecessor-version”, “successor-version”, and “working-copy-of” proposed in [4] to allow simple version navigation between Web resources. The focus of this proposal that emerged from the AtomPub [7] context, however, is clearly on editorial version control (cf. WebDAV, Java Content Repository). Also, it provides no means to navigate versions based on datetime information.

There is a relationship between the described work and efforts that research the problem of provenance of Linked Data, specifically those provenance aspects concerned with the time intervals in which specific data is valid. For example, [8] is concerned with provenance graphs that allow expressing such validity information, whereas [9] focuses on applications to support preserving link integrity over time. Our proposal introduces a native HTTP approach that allows leveraging the results of these efforts at Web scale.

8. CONCLUSIONS
URIs like http://weather.example.com/oaxaca used in [11] have gained significant functionality in the Linked Data context as they start providing access not only to HTML intended for human consumption, but also to data expressed in some RDF serialization intended for machine processing. When data is published in accordance with Memento’s HTTP-based versioning mechanism proposed in this paper, their value further increases as they become entry points to both current and past versions of data. The time-series analysis described in Section 6.2 is an admittedly simple demonstration of a subtle and powerful change in the utility of Linked Data URIs.

The URI http://weather.example.com/oaxaca can now be leveraged to obtain an overview of Oaxaca’s weather in the past months, merely by issuing HTTP GET requests with varying datetime preferences. Similarly, time-traveling a Dow Jones data URI can result in an overview of the stock market’s evolution at any desired granularity. Tracing the evolving state of traffic congestion, implemented in Zoetrope [1] by high-frequency crawls and scraping of a traffic web site, could be achieved by dereferencing a single data URI with varying timestamps instead.

While this paper has focused on Linked Data, it should be clear that the proposed versioning mechanism can be applied to Web resources in general. It could, for example, be leveraged to facilitate navigating across issues of Web-based newspapers and magazines, and it can play an important role in better integrating the data-intensive eScience and eHumanities efforts into the Web. Hence, the addition of a time dimension to the Web is not something only digital archaeologists should care about. It is an enabler for a global HTTP-based versioning mechanism that can support a new range of temporal applications for both the document and the data Web. This paper has merely scratched the surface of a new world of possibilities.

9. ACKNOWLEDGMENTS
The Memento research is partly funded by the Library of Congress. Many thanks to Chris Bizer, Kingsley Idehen, and Mitko Iliev for implementing the timegate HTTP Link header in DBpedia.

10. REFERENCES
[1] E. Adar, M. Dontcheva, J. Fogarty, and D. S. Weld. Zoetrope: interacting with the ephemeral web. In UIST ’08: Proceedings of the 21st annual ACM symposium on User interface software and technology, pages 239–248, 2008.
[2] T. Berners-Lee. Web architecture: Generic resources. http://www.w3.org/DesignIssues/Generic.html, 1996.
[3] T. Berners-Lee. Cool URIs don’t change. http://www.w3.org/Provider/Style/URI.html, 1998.
[4] A. Brown, G. Clemm, and J. Reschke. Link relation types for simple version navigation between web resources, Internet Draft, 2010.
[5] C. E. Dyreson, H. Lin, and Y. Wang. Managing versions of web documents in a transaction-time web server. In WWW ’04: Proceedings of the 13th international conference on World Wide Web, pages 422–432, 2004.
[6] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, and T. Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1, Internet RFC-2068, 1997.
[7] J. Gregorio and B. de hOra. The Atom publishing protocol, Internet RFC-5023, December 2007.
[8] O. Hartig. Provenance information in the web of data. In Proceedings of the 2nd Workshop on Linked Data on the Web (LDOW2009), 2009.
[9] B. Haslhofer and N. Popitsch. DSNotify - Detecting and Fixing Broken Links in Linked Data Sets. In Proceedings of 8th International Workshop on Web Semantics, 2009.
[10] K. Holtman and A. Mutz. Transparent Content Negotiation in HTTP, Internet RFC-2295, 1998.
[11] I. Jacobs and N. Walsh. Architecture of the world wide web, volume one. Technical Report W3C Recommendation 15 December 2004, W3C, 2004.
[12] C. Lagoze, H. Van de Sompel, P. Johnston, M. L. Nelson, R. Sanderson, and S. Warner. Adding eScience Assets to the Data Web. In Proceedings of the Linked Data on the Web Workshop (LDOW 2009), 2009.
[13] N. Mendelsohn and S. Williams. The use of Metadata in URIs, TAG Finding 2 January 2007, 2007.
[14] M. Nottingham. Web linking, Internet Draft, 2010.
[15] M. Nottingham and R. Sayre. The Atom Syndication Format, Internet RFC-4287, 2005.
[16] K. Ota, K. Takahashi, and K. Sekiya. Version management with meta-level links via HTTP/1.1, Internet Draft draft-ntt-http-version-00, 1996.
[17] D. Raggett, A. L. Hors, and I. Jacobs. HTML 4.01 Specification. Technical Report W3C Recommendation 24 December 1999, 1999.
[18] L. Sauermann and R. Cyganiak. Cool URIs for the semantic web. Technical Report W3C Interest Group Note 31 March 2008, W3C, 2008.
[19] H. Van de Sompel, M. L. Nelson, R. Sanderson, L. L. Balakireva, S. Ainsworth, and H. Shankar. Memento: Time Travel for the Web. Technical Report arXiv:0911.1112, 2009.
[20] H. Van de Sompel, M. L. Nelson, R. Sanderson, L. L. Balakireva, S. Ainsworth, and H. Shankar. Memento: Updated Technical Details (February 2010). http://www.slideshare.net/hvdsomp/memento-updated-technical-details-february-2010, 2010.

APPENDIX

A. AN RDF/XML TIMEMAP
The following is an RDF/XML TimeMap for http://dbpedia.org/resource/France.

[RDF/XML markup not preserved; the surviving literal values are listed in their original order.]

    2010-02-17T05:26:27Z
    2010-02-17T05:26:27Z
    application/rdf+xml
    foresite@googlegroups.com
    Foresite Toolkit (Python)
    Memento Time Bundle for http://dbpedia.org/resource/France
    Aggregation
    ResourceMap
    2009-11-01T00:00:00+00:00
    2007-09-01T00:00:00+00:00
    2007-09-01T00:00:00+00:00
    2008-02-01T00:00:00+00:00
    2008-02-01T00:00:00+00:00
    2008-08-01T00:00:00+00:00
    2008-08-01T00:00:00+00:00
    2009-11-01T00:00:00+00:00
    2008-11-01T00:00:00+00:00
    2009-07-01T00:00:00+00:00
    2009-07-01T00:00:00+00:00
    2009-11-01T00:00:00+00:00
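
The TimeGate behaviour described in Sections 3 and 6.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the demonstrator's code: the example.org URIs, the interval dates, and the timegate and memento_uri helpers are hypothetical; the Accept-Datetime value is assumed to use the usual HTTP-date syntax; and a successful negotiation is assumed to be answered with a 302 redirect to the selected Memento, which the text does not spell out. The sketch picks the Memento whose start/end interval covers the requested datetime, answers 406 for datetimes before the earliest known version, and redirects to the Original Resource for datetimes in the range of the current version.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

ORIGINAL = "http://example.org/resource/France"    # hypothetical URI-R
ARCHIVE = "http://archive.example.org/memento/"    # hypothetical Memento prefix


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# (start, end) validity intervals of archived versions, oldest first.
# Illustrative dates; the demonstrator reads these from its database table.
INTERVALS = [
    (utc(2007, 9, 1), utc(2008, 2, 1)),
    (utc(2008, 2, 1), utc(2008, 8, 1)),
    (utc(2008, 8, 1), utc(2009, 7, 1)),
]


def memento_uri(start):
    """Build a hypothetical URI-M from the start date of a version."""
    return f"{ARCHIVE}{start:%Y%m%d}/{ORIGINAL}"


def timegate(accept_datetime=None):
    """Return (status, headers) for a request against the TimeGate."""
    if accept_datetime is None:
        # No datetime preference: redirect to the most recent Memento.
        return 302, {"Location": memento_uri(INTERVALS[-1][0])}
    wanted = parsedate_to_datetime(accept_datetime)
    if wanted < INTERVALS[0][0]:
        # Before the earliest known Memento.
        return 406, {}
    if wanted >= INTERVALS[-1][1]:
        # In the range of the current version: send the client to URI-R.
        return 302, {"Location": ORIGINAL}
    for i, (start, end) in enumerate(INTERVALS):
        if start <= wanted < end:
            links = [
                f'<{ORIGINAL}>;rel="original"',
                f'<{memento_uri(INTERVALS[0][0])}>;rel="first-memento"',
                f'<{memento_uri(INTERVALS[-1][0])}>;rel="last-memento"',
            ]
            if i > 0:
                links.append(f'<{memento_uri(INTERVALS[i - 1][0])}>;rel="prev-memento"')
            if i < len(INTERVALS) - 1:
                links.append(f'<{memento_uri(INTERVALS[i + 1][0])}>;rel="next-memento"')
            return 302, {"Location": memento_uri(start), "Link": ", ".join(links)}
    return 406, {}  # only reached if the intervals have gaps


if __name__ == "__main__":
    print(timegate("Thu, 20 Mar 2008 00:00:00 GMT"))
```

Because the datetime is the only lookup key, a client reaches the matching version in a single negotiation instead of traversing version links.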