<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">CellStore - the Vision of Pure Object Database</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Jan</forename><surname>Vraný</surname></persName>
							<email>vranyj1@fel.cvut.cz</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Department of Computer Science</orgName>
								<orgName type="department" key="dep2">FEE</orgName>
								<orgName type="institution">Czech Technical University in Prague</orgName>
								<address>
									<addrLine>Karlovo náměstí 13</addrLine>
									<postCode>120 00</postCode>
									<settlement>Praha</settlement>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">CellStore - the Vision of Pure Object Database</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">84A2C258A005630ADCDA4FA71959DC08</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T14:34+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper describes a vision of CellStore, a universal database system capable of storing and operating on several different data models: object, network, hierarchical and even relational. It describes the features of CellStore as well as its underlying storage model and database architecture.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Motivation</head><p>The mainstream programming paradigm for robust, large-scale, mission-critical applications is object-oriented programming (OOP). Many such applications need a database to maintain their data, and it is usually taken for granted that this database should be relational or object-relational. The semantic gap between these two very different paradigms causes problems that have to be solved. Basically, there are three possible solutions:</p><p>- The application operates on data in a "relational way", i.e. the programmer uses SQL queries to access data directly. In this case, objects are used only through OO libraries for the GUI and similar purposes.</p><p>- An object-relational mapper is used (GLORP <ref type="bibr" target="#b4">[5]</ref> and Hibernate <ref type="bibr" target="#b5">[6]</ref> are examples of such O-R mappers). This allows programmers to manipulate data in an "object" way, but the architecture and capabilities of the O-R mapper limit the design of the application and of the underlying database schema.</p><p>- A network or object database is used instead of a relational one.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Currently available object-oriented databases</head><p>There are currently many so-called "object databases": OmniBase, DB4Objects, ZODB, GOODS, Elephant and GemStone/S. In fact, many of them are network databases rather than object databases. The two kinds are very similar: both can store an arbitrary object structure. The difference is that an object database also stores code (methods) together with regular data. An object database can execute the code stored in it by itself; no client environment is needed.</p><p>V. Snášel, K. Richta, J. Pokorný (Eds.): Dateso 2006, pp. 32-39, ISBN 80-248-1025-5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">OmniBase</head><p>OmniBase <ref type="bibr" target="#b6">[7]</ref> is an embedded network database written in Smalltalk. It is available for many Smalltalk dialects: Dolphin Smalltalk, Squeak, VisualWorks, Smalltalk/X and VAST. OmniBase supports multi-version concurrency control, object clustering, online backups and thread-safe operations.</p><p>Garbage collection is supported, but it cannot be performed on a live database.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">DB4Objects</head><p>DB4Objects <ref type="bibr" target="#b7">[8]</ref> is more or less similar to OmniBase, with two differences:</p><p>1. DB4Objects targets the Java and .NET (C#) platforms.</p><p>2. DB4Objects can operate as an embedded database or run as a regular database server that communicates with clients over the network.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">GemStone/S</head><p>GemStone/S <ref type="bibr" target="#b8">[9]</ref> is a full-featured object database based on a Smalltalk dialect called Smalltalk DB. A GemStone application consists of three parts: a client (usually VisualWorks Smalltalk), a Gem (the part of GemStone responsible for evaluation, transaction processing and so on) and a Stone (the part responsible for managing low-level storage). Each part can run on a different node in a network. A special Gem called GcGem is responsible for garbage collection, which is performed during normal processing of client requests.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">The CellStore project</head><p>The basic motivation for the CellStore project was to develop an experimental database that can serve as a basis for experimenting with various database algorithms such as locking and caching strategies, transaction policies, different data type models, etc. The project is divided into three relatively independent parts:</p><p>- CellStore low-level storage, which provides basic storage management,</p><p>- CellStore/OODB, an experimental object database,</p><p>- CellStore/XML, an experimental native XML database.</p><p>The design and implementation of CellStore focus on simple OO design and modularity rather than on performance. Since CellStore is an experimental system, saving several bytes of memory or several processor instructions does not matter.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">The low-level storage model</head><p>The low-level storage model gave CellStore its name. It is a combination of the storage models of Lisp, Smalltalk and the Oracle RDBMS. Basically, the storage is divided into two main spaces.</p><p>- Cell space, which contains only the structural information about stored data. Structure is kept in fixed-size cells. Each cell has several fields, each of which can contain a pointer to another cell in the cell space or a pointer to a record in the data space. Cells describe only the relationships between data elements (objects) stored in a CellStore database.</p><p>- Data space, which contains the actual data, i.e. byte arrays. The data space is organised into blocks, and each block may contain several records. Each record in the data space is identified by a unique data pointer. The internal organisation of the data space is similar to data blocks in Oracle or in any other relational database.</p><p>This approach has several advantages:</p><p>1. the use of fixed-length cells simplifies cell allocation and automatic storage reclamation,</p><p>2. it is possible to store many different object models.</p><p>The second advantage is the more important one. It allows different kinds of data (class-based objects, prototype-based objects, XML data and even relational data) to be stored together in the same database. Thus CellStore can act as a pure object database or as an XML database. Note that data are stored in their native form; the mapping of data to cell and data space is more or less straightforward. This is why we call CellStore a universal database.</p><p>Mapping objects into cell and data space. This section describes the mapping of objects. Consider a class-based object model and eight-field cells.</p><p>Each object occupies at least one cell, called the head cell, and zero or more cells called tail cells.</p><p>The head cell contains a cell header, which holds the cell type (for example, non-indexable class-based instance) and other information such as the number of cells occupied by the instance, GC support information, a tail-cell flag and more.</p><p>The second field of the head cell contains a pointer to the ACL set, i.e. a pointer to another object (in fact, to that object's head cell) that holds all the information needed for access control to this object (because we are designing a multi-user database).</p><p>The third field of the head cell contains a pointer to the object's class, which is itself an object represented by a head cell.</p><p>The other fields contain pointers to ordinary instance variables. If the instance variables do not all fit into a single cell, the last field contains a pointer to the next (possibly tail) cell. Another possibility is to use something like the indirect pointers used in inode-based file systems. Data of indexed classes (arrays, byte arrays) are stored in the data space. An example of an object structure and its mapping to cell and data space is shown in Figure 1 and Figure 2. Note that integers are stored as immediate values <ref type="bibr" target="#b0">[1]</ref>.</p><p>Mapping XML data into cell and data space. Another example of data that can be stored in CellStore is XML data. Although XML data can be stored in CellStore as normal objects (DOM nodes) as shown above, we use a more efficient, XML-specific mapping.</p><p>There are 9 types of cells:</p><p>- character data cell</p><p>- attribute cell</p><p>- element cell</p><p>- document cell</p><p>- document type cell</p><p>- processing instruction cell</p><p>- comment cell</p><p>- XML resource cell</p><p>- collection cell</p><p>The last two cell types represent XML:DB objects as described in <ref type="bibr" target="#b1">[2]</ref>. Each cell has a pointer to its parent cell, first child cell and sibling cell. The meaning of the last four fields depends on the type of the cell (see Table <ref type="table" target="#tab_0">1</ref>).</p><p>The children of any cell are linked through the sibling pointer, and the parent holds a pointer to the first child.</p></div>
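<div xmlns="http://www.tei-c.org/ns/1.0"><p>The mapping of class-based objects into eight-field cells described in Section 3.1 can be illustrated with a minimal Python sketch. It is not CellStore code: the names Header, CellSpace, store_object and load_ivars are hypothetical, cell pointers are modelled as plain list indices, and the layout of a tail cell (header in the first field, values from the second field on) is an assumption, since the text only specifies the head cell. The data space and immediate values are not modelled.</p>
```python
from dataclasses import dataclass

CELL_FIELDS = 8          # the text's example uses eight-field cells
LAST = CELL_FIELDS - 1   # index of the last field in a cell


@dataclass
class Header:
    """Hypothetical cell header; the field names are illustrative."""
    cell_type: str
    cell_count: int = 1   # number of cells occupied by the object
    is_tail: bool = False


class CellSpace:
    """Toy in-memory cell space: a list of fixed-size cells."""

    def __init__(self):
        self.cells = []

    def allocate(self, header):
        self.cells.append([header] + [None] * LAST)
        return len(self.cells) - 1  # a cell pointer is just a list index here

    def store_object(self, acl_ptr, class_ptr, ivars):
        """Store one object and return the pointer to its head cell."""
        head = self.allocate(Header("instance"))
        cell = self.cells[head]
        cell[1] = acl_ptr    # second field: pointer to the ACL set
        cell[2] = class_ptr  # third field: pointer to the object's class
        slot, remaining = 3, list(ivars)
        while remaining:
            if slot == LAST and len(remaining) != 1:
                # More than one value left: the last field links to a tail cell.
                tail = self.allocate(Header("instance", is_tail=True))
                cell[LAST] = tail
                cell, slot = self.cells[tail], 1  # assumed tail-cell layout
                self.cells[head][0].cell_count += 1
                continue
            cell[slot] = remaining.pop(0)
            slot += 1
        return head

    def load_ivars(self, head, n_ivars):
        """Read back n_ivars instance variables, following tail-cell links."""
        cell = self.cells[head]
        cells_left = cell[0].cell_count
        slot, values = 3, []
        while len(values) != n_ivars:
            if slot == LAST and cells_left != 1:
                cell, slot = self.cells[cell[LAST]], 1
                cells_left -= 1
                continue
            values.append(cell[slot])
            slot += 1
        return values


space = CellSpace()
acl = space.allocate(Header("acl"))
cls = space.allocate(Header("class"))
small = space.store_object(acl, cls, [1, 2, 3, 4, 5])           # fits in the head cell
large = space.store_object(acl, cls, [10, 11, 12, 13, 14, 15])  # needs one tail cell
```
<p>In this sketch the reader decides whether the last field is a value or a link from the cell count in the header, which is one possible use of the "number of cells occupied" information mentioned above. The number of instance variables is passed in explicitly; in a real system it would presumably come from the object's class.</p></div>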
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">The CellStore's virtual machine</head><p>A classic virtual machine consists of an object memory and an interpreter. The object memory is responsible for managing objects in memory and for efficient storage reclamation, and the interpreter defines all the execution semantics. We think that it is possible to implement a virtual machine on top of CellStore storage, so one can think of CellStore as one large multi-user virtual machine with a persistent, transaction-capable object memory.</p><p>The idea is to move as much functionality as possible into user-level code running on CellStore's virtual machine. This includes indexing algorithms, the garbage collector, the jitter, etc. CellStore itself should provide only basic object memory management and a common object model. This allows the user (programmer) to experiment with different algorithms, jitters and garbage collectors and, since the interpreter itself will be implemented on top of CellStore, with different programming languages and code semantics.</p><p>The CellStore's virtual machine should provide only the following:</p><p>- object memory management supporting only the common object model described in Section 3.1,</p><p>- a dumb, built-in interpreter capable of interpreting a simple, limited language (bytecode), which we call the bootstrap interpreter,</p><p>- the ability to trap unknown language constructs (bytecodes) and let a user-level interpreter evaluate them,</p><p>- basic support for installing native (jitted) code into the VM's native code cache.</p><p>The speed of any interpreter does not matter as long as the interpreter is able to interpret a jitter (implemented in any language). The jitter can translate itself into native code to make itself fast and then translate the rest.</p></div>
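<div xmlns="http://www.tei-c.org/ns/1.0"><p>The division of labour between a bootstrap interpreter and user-level interpreters can be sketched as follows. This is only an illustration under stated assumptions: the two built-in opcodes, the class BootstrapInterpreter, the method install_handler and the handler user_level_mul are all hypothetical and are not taken from CellStore. The sketch shows a tiny stack machine that evaluates the bytecodes it knows and traps any other bytecode out to a handler registered at user level.</p>
```python
class BootstrapInterpreter:
    """Toy stack machine; opcodes and method names are illustrative assumptions."""

    def __init__(self):
        self.stack = []
        self.trap_handlers = {}  # opcode -> user-level handler

    def install_handler(self, opcode, handler):
        """Register a user-level handler for a bytecode the VM does not know."""
        self.trap_handlers[opcode] = handler

    def run(self, program):
        for opcode, operand in program:
            if opcode == "push":          # built-in bytecode
                self.stack.append(operand)
            elif opcode == "add":         # built-in bytecode
                right = self.stack.pop()
                left = self.stack.pop()
                self.stack.append(left + right)
            elif opcode in self.trap_handlers:
                # Unknown bytecode: trap out to the user-level interpreter.
                self.trap_handlers[opcode](self, operand)
            else:
                raise ValueError("unknown bytecode: " + opcode)
        return self.stack[-1] if self.stack else None


def user_level_mul(vm, operand):
    """User-level semantics for a 'mul' bytecode the bootstrap VM lacks."""
    right = vm.stack.pop()
    left = vm.stack.pop()
    vm.stack.append(left * right)


vm = BootstrapInterpreter()
vm.install_handler("mul", user_level_mul)
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
result = vm.run(program)  # (2 + 3) * 4
```
<p>Here the handler is an ordinary Python function. In the design described above, the user-level interpreter would itself be code stored in the database and executed by the bootstrap interpreter, and a jitter could later replace it with native code.</p></div>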
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Architecture of CellStore database</head><p>The high-level architecture of CellStore is shown in Figure <ref type="figure" target="#fig_3">3</ref>. From the VM's point of view, the OODB transaction manager plays the role of the object memory, so it should provide an interface similar to Smalltalk-80's object memory <ref type="bibr" target="#b0">[1]</ref>. In addition, it must provide an interface for transaction management (start, commit, abort) and an interface for the garbage collector.</p></div>
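<div xmlns="http://www.tei-c.org/ns/1.0"><p>A transaction manager that looks like an object memory to the VM could be sketched as below. This is a hypothetical, single-user illustration: the class ToyTransactionManager is not part of CellStore, the methods fetch_pointer and store_pointer are only loosely modelled on the fetchPointer:ofObject: and storePointer:ofObject:withValue: messages of the Smalltalk-80 object memory, and the write-set strategy is an assumption chosen for brevity. Locking, concurrency and the garbage collector interface are left out.</p>
```python
class ToyTransactionManager:
    """Object-memory-like interface with start/commit/abort (illustrative only)."""

    def __init__(self):
        self.committed = {}   # (object pointer, field index) -> value
        self.pending = None   # write set of the active transaction, if any

    def start(self):
        if self.pending is not None:
            raise RuntimeError("transaction already active")
        self.pending = {}

    def fetch_pointer(self, field_index, object_pointer):
        """Read a field; uncommitted writes of the active transaction win."""
        key = (object_pointer, field_index)
        if self.pending is not None and key in self.pending:
            return self.pending[key]
        return self.committed.get(key)

    def store_pointer(self, field_index, object_pointer, value):
        """Buffer a write in the active transaction's write set."""
        if self.pending is None:
            raise RuntimeError("no active transaction")
        self.pending[(object_pointer, field_index)] = value

    def commit(self):
        if self.pending is None:
            raise RuntimeError("no active transaction")
        self.committed.update(self.pending)  # make buffered writes visible
        self.pending = None

    def abort(self):
        self.pending = None  # drop buffered writes


tm = ToyTransactionManager()
tm.start()
tm.store_pointer(0, 1, "a")
tm.commit()

tm.start()
tm.store_pointer(0, 1, "b")
seen_inside = tm.fetch_pointer(0, 1)  # the transaction sees its own write
tm.abort()
seen_after = tm.fetch_pointer(0, 1)   # the aborted write is gone
```
<p>Because the interpreter only talks to fetch_pointer and store_pointer, a different transaction policy could be swapped in behind the same interface, which is the kind of experiment the architecture is meant to support.</p></div>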
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Status of the CellStore project</head><p>The CellStore project is developed at the Department of Computer Science, FEE CTU Prague, by Michal Valenta, Jan Vrany, Pavel Strnad, Karel Prihoda and Jan Zak.</p><p>The whole project is developed in Smalltalk/X, a free Smalltalk implementation. Smalltalk/X was chosen because of its pure object orientation, source code availability, outstanding development tools and extreme agility. To achieve practical performance, the system can be translated to C <ref type="bibr" target="#b3">[4]</ref>.</p><p>At present, only the lowest-level storage manager is implemented. It can manage cell and data spaces. First experiments show that the storage is able to store the whole INEX database <ref type="bibr" target="#b9">[10]</ref> (about 500 MB of XML documents) using the mapping described in Section 3.1 without significant performance loss; that is, the reconstruction time of a single, randomly chosen document was almost independent of the database size.</p><p>First versions of the cache and XML transaction managers are implemented and tested, but they are not yet integrated with the rest of the system.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion and future work</head><p>This paper presented the vision of a pure object database built on top of the CellStore storage model. In the CellStore virtual machine, as many components as possible are lifted up to "user space", which makes experiments with different languages, semantics, jitters, garbage collectors and other algorithms and techniques very easy.</p><p>To make such a system work, several things have to be developed:</p><p>- an OODB transaction manager and its interface to the bootstrap interpreter,</p><p>- a tiny bootstrap interpreter,</p><p>- an experimental, naive, one-to-one, non-optimising jitter,</p><p>- an interpreter for another language.</p><p>Once the things mentioned above are implemented and tested, we will have a working database system that can be used as a test bed for many different algorithms. Such a system will make the development of new approaches and algorithms very easy.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Example of object structure</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Example mapping of objects into the cell and data space</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. High level architecture of CellStore</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Meanings of fields in XML cells</figDesc><table><row><cell>Field</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Smalltalk-80: The Language and its Implementation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Goldberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Robson</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1983">1983</date>
			<publisher>Addison-Wesley</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<ptr target="http://xmldb-org.sourceforge.net/xapi/xapi-draft.html" />
		<title level="m">XML:DB Working Draft</title>
				<imprint/>
	</monogr>
	<note>XML:DB initiative</note>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<ptr target="http://wiki.cs.uiuc.edu/CampSmalltalk/VM+Issues" />
		<title level="m">Camp Smalltalk</title>
				<imprint/>
	</monogr>
	<note>VM Issues</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Back to the Future</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ingalls</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kaehler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Maloney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wallace</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kay</surname></persName>
		</author>
		<ptr target="http://users.ipa.net/~dwighth/squeak/oopsla_squeak.html" />
	</analytic>
	<monogr>
		<title level="m">The Story of Squeak, A Practical Smalltalk Written in Itself</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<ptr target="http://glorp.org/" />
		<title level="m">GLORP: Generic Lightweight Object-Relational Persistence</title>
				<imprint/>
	</monogr>
	<note>Camp Smalltalk</note>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Relational Persistence for Java and</title>
		<ptr target="http://www.hibernate.org/" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m">OmniBase</title>
		<ptr target="http://www.gorisek.com" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m">db4objects</title>
		<ptr target="http://db4objects.org" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m">GemStone/S</title>
		<ptr target="http://www.gemstone.com" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<ptr target="http://inex.is.informatik.uni-duisburg.de/2006/" />
		<title level="m">INEX: Initiative for the Evaluation of XML Retrieval</title>
				<imprint/>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
