=Paper=
{{Paper
|id=None
|storemode=property
|title=Wiki Authoring and Semantics of Mathematical Document Structure
|pdfUrl=https://ceur-ws.org/Vol-767/paper-06.pdf
|volume=Vol-767
|dblpUrl=https://dblp.org/rec/conf/itp/KurodaN11
}}
==Wiki Authoring and Semantics of Mathematical Document Structure==
Hiraku Kuroda∗ and Takao Namiki
Department of Mathematics, Hokkaido University,
060-0810 Sapporo, Japan
Abstract
We are developing a CMS with a wiki-based document authoring feature for publishing
structured mathematical documents on the Web. Using this system, users can write documents
that include mathematical expressions in LaTeX notation and explicitly marked structures
characteristic of mathematical articles, such as definitions, theorems, and proofs. Documents
entered into the system are published on the Web not only as XHTML files for browsing but also
as XML files complying with NLM-DTD, which is used to exchange articles electronically. Beyond
single wiki pages, users can build a document that consists of more than one page and whose
structure is described semantically by the system. To this end, we also propose an application
of OAI-ORE and RDF vocabularies to describe the structure of documents consisting of several
resources.
1 Introduction
Today, many documents are published on the Web. The term “documents” here includes news
articles, blog posts, wiki pages, journal articles, and any other web pages. These documents are
published as HTML or XHTML to be browsed, or as PDF or PS files to be printed. Sometimes one
document is published in several formats.
Some documents consist of several resources. One example is a document that includes a graphic
image. Suppose that the body text of the document is written in an HTML file and the image is a
JPG file. On the Web, each of them is an independent resource with its own unique URI. When the
URI of the image is given in the src attribute of an img element in the HTML file, we should treat
the document not as merely referencing the image but as including or embedding it. In this case,
the document is an aggregation of two resources: the HTML file of the body text and the JPG file
of the image.
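The HTML-plus-image case can be sketched in a few lines of Python. This is only an illustration of the idea of an aggregation, not part of Matherial; the class name, the function name, and the example.org URIs are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class EmbeddedImageCollector(HTMLParser):
    """Collect absolute URIs of images embedded through <img src=...>."""

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri
        self.image_uris = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative references against the document URI.
                self.image_uris.append(urljoin(self.base_uri, src))


def aggregated_resources(document_uri, html_text):
    """Return the document URI followed by every image URI it embeds."""
    collector = EmbeddedImageCollector(document_uri)
    collector.feed(html_text)
    return [document_uri] + collector.image_uris


# Hypothetical document with one embedded JPG image.
html = '<html><body><p>An annulus.</p><img src="annulus.jpg" alt=""/></body></html>'
resources = aggregated_resources("http://example.org/docs/laurent.html", html)
```

Here `resources` holds two URIs, one for the body text and one for the image, which together form the aggregation described above.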
Sometimes documents include not only graphic images but also entire other (smaller) documents.
In general, this is called transclusion. HTML itself has no transclusion feature, but MediaWiki,
for example, has a template feature that inserts the content of other wiki pages into a page [3].
In this case, the document is an aggregation of the body text written in wiki markup and the
other documents that it marks for inclusion.
Furthermore, we sometimes build a document by integrating several documents. In this case, a
large document is not merely split into several pieces: each constituent document is independent,
has its own URI, and can be referenced directly. In general, the parts, chapters, or sections of a
document can be independent documents. The MathML specification by W3C [8] is an example. It
consists of one overview, eight sections, and eleven appendices. These divisions are independent
web pages with their own URIs. Finer divisions may also be independent documents, depending on
the structure or characteristics of a document. For example, definitions, theorems, proofs, and
expressions in mathematical documents may be independent documents.
Open Archives Initiative Object Reuse and Exchange (OAI-ORE) is a set of standards for describing
and exchanging aggregations of web resources [11]. In the User Guide of OAI-ORE [13], journal
articles are described as
∗ hiraku@math.sci.hokudai.ac.jp
Wiki Authoring and Semantics Kuroda and Namiki
Listing 1: A document including a theorem written in Wiki
At first, we give Taylor's theorem and proof of it.
[[theorem id="taylor_theorem" title="Taylor's theorem"|
Let $f$ be a function which is defined on the interval $(a,b)$ and suppose the $n$th
derivative $f^{(n)}$ exists on $(a,b)$. Then for all $x$ and $x_0$ in $(a,b)$,
$$R_n(x)=\frac{f^{(n)}(y)}{n!}(x-x_0)^n$$
with $y$ strictly between $x$ and $x_0$ ($y$ depends on the choice of $x$). $R_n(x)$ is the
$n$th remainder of the Taylor series for $f(x)$.
]]
(Original text of the theorem is http://planetmath.org/encyclopedia/TaylorsTheorem.html,
retrieved at 2011.05.08)
aggregations of representation files such as PDF or PS. In this article, on the other hand, we
propose describing documents as aggregations of their constituent resources and relating the
documents to their representations separately from the description of the aggregations.
Against this background, we are developing Matherial, a content management system that manages
and publishes mathematical documents and other resources. One of its purposes is to assist in
writing documents consisting of several resources and to publish them both as web pages with
appropriate metadata describing their structure and as XML files complying with NLM-DTD for
further reuse.
Matherial provides an authoring assistance feature based on a wiki engine. Users can write a
single wiki page containing chapters, sections, mathematical statements, and expressions, or they
can write some of these as independent wiki pages, integrate them into one document, and publish
it. The relationships between documents and the resources they include, and between documents
and the wiki pages representing them, are modeled as OAI-ORE aggregations and are described with
RDFa in the XHTML representation of the documents. Matherial can output documents not only as
one or more XHTML pages but also as XML files complying with NLM-DTD. Therefore, other systems
supporting NLM-DTD can reuse documents produced by Matherial.
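As a rough illustration of what modeling a document as an OAI-ORE aggregation means, the following Python sketch lists the triples for a document that aggregates three pages. The `aggregates` property and the `Aggregation` class belong to the OAI-ORE vocabulary; the function names and all example URIs are assumptions made for illustration, and the sketch is not the vocabulary application proposed in section 3.

```python
# Namespace of the OAI-ORE terms vocabulary and the rdf:type property.
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def aggregation_triples(aggregation_uri, resource_uris):
    """Return (subject, predicate, object) triples for one aggregation."""
    triples = [(aggregation_uri, RDF_TYPE, ORE + "Aggregation")]
    for uri in resource_uris:
        triples.append((aggregation_uri, ORE + "aggregates", uri))
    return triples


def to_ntriples(triples):
    """Serialize triples whose terms are all URIs as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical URIs: a document and three pages it aggregates.
document = "http://example.org/wiki/LaurentSeries"
parts = [
    "http://example.org/wiki/TaylorTheorem",
    "http://example.org/wiki/ProofOfTaylorTheorem",
    "http://example.org/wiki/TaylorExpansion",
]
triples = aggregation_triples(document, parts)
print(to_ntriples(triples))
```

The same triples could be embedded in an XHTML page with RDFa attributes; plain tuples are used here only to keep the sketch free of third-party dependencies.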
The paper is organized as follows. In section 2, we present an example of structured mathematical
documents on Matherial. In section 3, we propose an application of OAI-ORE and RDF vocabularies
to describe structured documents on Matherial and, more generally, on the Web. Finally, section 4
concludes the paper.
2 Mathematical Contents Management System
2.1 Wiki-based Authoring
One of the major features of Matherial, developed in this study, is assistance in authoring
mathematical documents. It is based on a so-called wiki engine, so users can write documents in a
simple markup notation and publish them on the Web. They can put mathematical expressions
written in LaTeX notation into the text, and our own MathML library [7] converts them to
MathML [8].
The simplest type of document created on Matherial consists of a single wiki page. When users
need to write mathematical text structures such as definitions, theorems, and proofs, they can
use dedicated markup to state explicitly that a segment has such a role.
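To make the idea of such markup concrete, the following Python sketch extracts marked statements from wiki text. It assumes the block syntax that appears in Listing 1 (`[[theorem id="..." title="..."| ... ]]`); the regular expressions and the function name are illustrative guesses and do not reproduce Matherial's actual parser.

```python
import re

# A block looks like [[kind attr="value" ...| body ]] (syntax assumed from Listing 1).
BLOCK = re.compile(
    r"\[\[(?P<kind>[a-z]+)(?P<attrs>[^|\]]*)\|(?P<body>.*?)\]\]", re.DOTALL
)
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def parse_statements(wiki_text):
    """Return one dict per marked statement found in the wiki text."""
    statements = []
    for match in BLOCK.finditer(wiki_text):
        attrs = dict(ATTRIBUTE.findall(match.group("attrs")))
        statements.append(
            {
                "kind": match.group("kind"),
                "id": attrs.get("id"),
                "title": attrs.get("title"),
                "body": match.group("body").strip(),
            }
        )
    return statements


sample = '''At first, we give Taylor's theorem.
[[theorem id="taylor_theorem" title="Taylor's theorem"|
Let $f$ be a function defined on the interval $(a,b)$.
]]'''
statements = parse_statements(sample)
```

The `kind` field records the role of the segment (here a theorem), and the `id` is what a hash URI for the statement would be built from.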
For example, Listing 1 is a document written in wiki markup that includes a theorem. When users
write theorems directly in their document like this, the URIs of the theorems are hash URIs,
formed by appending
Listing 2: A document importing other resources
We begin with Taylor's theorem and its proof.
[[import wiki/TaylorTheorem]]
[[import wiki/ProofOfTaylorTheorem]]
For a function $f(x)$, $f$ is Taylor expandable when $\lim_{n\to\infty}R_n(x)=0$ where $R$ is
remainder term of [[[wiki/TaylorTheorem|the theorem]]], and we have below.
[[import wiki/TaylorExpansion]]
Even if complex function $f(z)$ is not holomorphic at a point $c$, if $f$ is holomorphic in an
annulus around $c$, we get Laurent series below,
$$f(z)=\sum_{n=-\infty}^{\infty}a_n(z-c)^n$$
where
$$a_n=\frac1{2\pi i}\oint_\gamma\frac{f(z)dz}{(z-c)^{n+1}}$$
and $\gamma$ is a closed curve in the annulus (fig. [[ref annulus]]).
[[figure file/AnnulusOfLaurent id=annulus]]
This is extension of [[[TaylorExpansion]]] for functions which are not holomorphic.
their IDs as fragments to the URI of the document. In this case, assuming the URI of the document
is http://mw2011.matherial.org/wiki/Taylor, the URI of the theorem itself is
http://mw2011.matherial.org/wiki/Taylor#taylor_theorem.
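The construction of such a hash URI can be written down directly. The short Python sketch below uses only the standard library; the function name `statement_uri` is an assumption, while the URI and the ID are the ones given in the text.

```python
from urllib.parse import quote, urldefrag


def statement_uri(document_uri, statement_id):
    """Build a hash URI by appending the statement ID as a fragment."""
    # Drop any fragment already present on the document URI.
    base, _old_fragment = urldefrag(document_uri)
    # Percent-encode characters that are not safe inside a fragment.
    return base + "#" + quote(statement_id, safe="")


uri = statement_uri("http://mw2011.matherial.org/wiki/Taylor", "taylor_theorem")
document_uri, fragment = urldefrag(uri)
```

Splitting `uri` again with `urldefrag` recovers the document URI and the theorem ID, which shows that a statement written inline never has an address independent of its containing page.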
Important definitions, theorems, and proofs are likely to be discussed or reused from other
documents, so they should be independent documents referenced by their own URIs. On the
Matherial wiki, users can set the type of a page. For example, when a page is set to be a
theorem, the system treats the document represented by that wiki page as a theorem. In this case,
the URI of the theorem is the URI of the document represented by the wiki page. Details about the
URIs and the relationships between documents and wiki pages in Matherial are given in section 3.
With Matherial, users can write documents that import and embed mathematical statements created
as independent documents. Users can also put images managed in Matherial into documents in the
same way, and they can reuse the descriptions entered when the images were uploaded instead of
writing new descriptions in the page. Listing 2 is a document that imports statements that are
already published and goes on to describe a statement that follows from them. In that example,
the image referenced in the text is imported together with its description.
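Importing can be pictured as a recursive text substitution. The Python sketch below assumes the `[[import NAME]]` directive seen in Listing 2 and an in-memory dictionary of pages; the function names and the page store are hypothetical, and the real system presumably works on stored wiki pages and rendered output rather than raw strings.

```python
import re

# Matches directives such as [[import wiki/TaylorTheorem]] (syntax assumed from Listing 2).
IMPORT_PATTERN = re.compile(r"\[\[\s*import\s+([^\]\s]+)\s*\]\]")


def imported_resources(text):
    """List the names of the pages that a text imports directly."""
    return IMPORT_PATTERN.findall(text)


def expand_imports(text, pages, _seen=()):
    """Replace each import directive with the (recursively expanded) page text."""

    def replace(match):
        name = match.group(1)
        if name in _seen:
            raise ValueError(f"circular import of {name}")
        if name not in pages:
            return match.group(0)  # leave unknown imports untouched
        return expand_imports(pages[name], pages, _seen + (name,))

    return IMPORT_PATTERN.sub(replace, text)


# Hypothetical page store keyed by page name.
pages = {
    "wiki/TaylorTheorem": "Theorem text.",
    "wiki/ProofOfTaylorTheorem": "Proof text.",
}
source = "We begin.\n[[import wiki/TaylorTheorem]]\n[[import wiki/ProofOfTaylorTheorem]]"
expanded = expand_imports(source, pages)
```

The list returned by `imported_resources` is exactly the set of resources that the importing document aggregates, which is the information a structural description of the document needs.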
Moreover, users can build a new document by aggregating these documents as sections, chapters, or
parts. In Matherial, users build the document by entering into a form the list of sub-documents
together with metadata of the document, such as the title and author information. Details of the
semantic structure of documents consisting of several resources are described in section 3.
2.2 Output Documents
Matherial outputs documents as XML files. The schemas of the output files are XHTML [18], to be
read directly by web browsers, and NLM-DTD [9], to exchange articles electronically.
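As an illustration of the NLM-DTD side of the output, the following Python sketch builds a small front-matter fragment with the standard library. The element names are taken from the NLM journal article tag set, and the values match those visible in Listing 3; the function name is an assumption, and the fragment omits other elements that a complete, valid article would require.

```python
import xml.etree.ElementTree as ET


def build_front(title, surname, given_names, day, month, year):
    """Build a <front> element holding title, one author, and a date."""
    front = ET.Element("front")
    meta = ET.SubElement(front, "article-meta")

    title_group = ET.SubElement(meta, "title-group")
    ET.SubElement(title_group, "article-title").text = title

    contrib_group = ET.SubElement(meta, "contrib-group")
    contrib = ET.SubElement(contrib_group, "contrib", {"contrib-type": "author"})
    name = ET.SubElement(contrib, "name")
    ET.SubElement(name, "surname").text = surname
    ET.SubElement(name, "given-names").text = given_names

    pub_date = ET.SubElement(meta, "pub-date")
    ET.SubElement(pub_date, "day").text = str(day)
    ET.SubElement(pub_date, "month").text = str(month)
    ET.SubElement(pub_date, "year").text = str(year)
    return front


front = build_front("Laurent Series", "Kuroda", "Hiraku", 29, 5, 2011)
xml_text = ET.tostring(front, encoding="unicode")
```

Because the same metadata also feed the RDFa in the XHTML output, keeping them in one structured form and serializing twice avoids the two representations drifting apart.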
For XHTML files, metadata are described as an RDF graph [14] and embedded with RDFa [15]. The
metadata written into XHTML comprise metadata of the document itself, such as the title, author
information, and the date and time when the document was created and updated, and structural
information about the relationships between the document and other resources. For example, the
relationships between the resources of the document about the Laurent series shown previously are
illustrated in Fig. 1. The web page of this document as browsed is
Listing 3: A part of an NLM-DTD XML version of a document
<article-meta>
<title-group><article-title>Laurent Series</article-title></title-group>
<contrib-group>
<contrib><name><surname>Kuroda</surname><given-names>Hiraku</given-names></name>
</contrib>
</contrib-group>
<pub-date><day>29</day><month>5</month><year>2011</year></pub-date>
</article-meta>
</front>
<p>We begin with Taylor's theorem and its proof.</p>
<title>Taylor Theorem</title>
Let