<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">The Pharmacology Workspace: A Platform for Drug Discovery</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alasdair</forename><forename type="middle">J G</forename><surname>Gray</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Manchester</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sune</forename><surname>Askjaer</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Christian</forename><surname>Brenninkmeijer</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Manchester</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kees</forename><surname>Burger</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Netherlands Bioinformatics Center</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Christine</forename><surname>Chichester</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Netherlands Bioinformatics Center</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">James</forename><surname>Eales</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Manchester</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Chris</forename><forename type="middle">T</forename><surname>Evelo</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">Maastricht University</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Carole</forename><surname>Goble</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Manchester</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Paul</forename><surname>Groth</surname></persName>
							<affiliation key="aff3">
								<orgName type="institution">VU University Amsterdam</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Lee</forename><surname>Harland</surname></persName>
							<affiliation key="aff4">
								<orgName type="department">Connected Discovery</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Antonis</forename><surname>Loizou</surname></persName>
							<affiliation key="aff3">
								<orgName type="institution">VU University Amsterdam</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Steve</forename><surname>Pettifer</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Manchester</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rishi</forename><surname>Ramgolam</surname></persName>
							<affiliation key="aff5">
								<orgName type="department">Academic Concept Knowledge Limited</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mark</forename><surname>Thompson</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Netherlands Bioinformatics Center</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andra</forename><surname>Waagmeester</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">Maastricht University</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Antony</forename><forename type="middle">J</forename><surname>Williams</surname></persName>
						</author>
						<author>
							<persName><forename type="first">H</forename><surname>Lundbeck</surname></persName>
						</author>
						<title level="a" type="main">The Pharmacology Workspace: A Platform for Drug Discovery</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">A20012228135FC2D9C0B35CF70C9A4FA</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T23:51+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We present the Open PHACTS linked data platform, which is being developed to address a set of example drug discovery research questions and which supports several drug discovery applications. The platform retrieves data from many complementary, but overlapping, data sources to present an integrated view of the data. The platform exploits two entity resolution services, one for transforming text and one for transforming chemical structures to a concept. The single concept URI provided by the resolution service is then expanded to a set of equivalent URIs used by the data sources.</p><p>Availability. An alpha version is currently available to the Open PHACTS consortium. A first public release of the platform will be made in late 2012, see http://www.openphacts.org/.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>EXTENDED ABSTRACT</head><p>The investigation and development of new drugs requires that scientists involved in the process deal with multiple information sources. These range from online databases of proteins (e.g. UniProt and Enzyme) and chemicals (e.g. ChEMBL, ChemSpider, and DrugBank), to models of biological pathways (e.g. Reactome, WikiPathways, and KEGG) and scientific literature. These information sources are often held in different formats and sourced from a wide variety of organizations. Together they cover a wide area of the scientific space of interest, but they overlap in the data they provide and also record different (or even inconsistent) representations of the same data.</p><p>A significant challenge to scientists is the labour-intensive integration of datasets. The entities of interest must be identified and mapped to each other to allow complementary information from many data sources to be collated in a single record. For example, ChemSpider contains data about chemical compounds and where they can be sourced, while ChEMBL complements this with data about the bioactivity of drug-like molecules and DrugBank provides information on the clinical use of drugs which contain the molecules. These data sources can be linked based on the chemical structure of the compounds. However, differences in scientific or technical approaches to molecular structure representation mean that different data sources will not always be in agreement, often varying in the charged state of the compound, e.g. "Simvastatin" on ChemSpider<ref type="foot">1</ref> and DrugBank<ref type="foot">2</ref>. Thus, for successful data integration one must devise strategies that address inconsistencies within the existing data.</p><p>The linked data platform being developed in the Open PHACTS project<ref type="foot" target="#foot_0">3</ref> aims to overcome these data integration challenges. 
There are two key entry points into the system, both of which perform resolution from user input to an identifier for a data concept.</p><p>The first is through keyword search, as shown in Figure <ref type="figure" target="#fig_1">1</ref>. In the pharmacology domain, this is more than just text matching, as keywords can often match multiple, often very distinct, concepts. For example, when typing "menthol", does the user mean the chemical menthol or the menthol receptor protein? The user interface supports this disambiguation by providing different entry points, e.g. compound by name or target by name (shown in Figure <ref type="figure" target="#fig_1">1</ref>). The Identifier Resolution Service (IRS) translates user-entered entity names (in free text form), together with the context information, into known entities within the system (i.e. entities that have a defined URI). The IRS uses several dictionaries, including a custom dictionary of chemical names and synonyms from ChemSpider, as well as MeSH, GO, and SwissProt. The IRS provides data for the auto-complete text box, including the preferred name for the entity and a link to its definition. This supports the user in disambiguating the entity that they mean. The identified entity URI can then be used to retrieve further information from the linked data platform.</p><p>The second entry point is through chemical structure search, which uses a tool for drawing chemical structures that are then converted to a standardised chemical structure representation. This representation is then processed by the ChemSpider structure search service to return a ChemSpider URI for the chemical entity drawn. The service can also be used for substructure and similarity searches.</p><p>The linked data platform leverages the comprehensive work already performed by the community in creating RDF-based datasets that are relevant for the Open PHACTS project. 
The current platform uses the ChEMBL and ChEBI datasets provided by the Chem2Bio2RDF project <ref type="bibr" target="#b0">(Chen et al., 2010)</ref>, the conversion of DrugBank provided by the LODD project <ref type="bibr" target="#b2">(Samwald et al., 2011)</ref>, and the conversion of the Enzyme database sourced from UniProt <ref type="bibr" target="#b1">(Jain et al., 2009)</ref>. A significant challenge is ensuring that the RDF versions of the datasets are kept up to date with the originals from which they are derived. For example, the Chem2Bio2RDF version of ChEMBL is version 8 whereas the original dataset is now at version 13.</p><p>The data sources are integrated using parameterized SPARQL queries that are called through an API exposed by the linked data platform. The API call generates a query containing the URI returned by the IRS. The query is then expanded at execution time using an identity mapping service that equates the data entity URIs from the various data sources. To provide adequate interaction speeds, we have cached the datasets in the linked data platform.</p><p>The result of a compound lookup with the search term "Aspirin" is shown in Figure <ref type="figure" target="#fig_2">2</ref>. Information about the chemical structure is sourced from ChemSpider, details of its bioactivity are obtained from ChEMBL, and information about the drugs in which the compound is active is obtained from DrugBank. Currently, the provenance of the data points is not shown in the user interface, although this is planned for the public release.</p><p>The linked data platform is being developed to answer a set of pharmacology research questions that require data to be integrated from a variety of data sources <ref type="bibr" target="#b3">(Williams et al., 2012)</ref>. 
The platform hides the complexities of interacting with the linked data and concepts by exposing an API that provides the core functionality to support a wide variety of drug discovery applications being developed within the Open PHACTS project, although only one has been shown in this demonstration paper.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>1</head><label></label><figDesc>http://www.chemspider.com/Chemical-Structure.49179.html accessed May 2012. 2 http://www.drugbank.ca/drugs/DB00641 accessed May 2012.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. Screenshot showing a search with the identifier resolution service for the term "menthol".</figDesc><graphic coords="2,148.14,94.48,301.28,92.73" type="bitmap" /></figure>
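<div xmlns="http://www.tei-c.org/ns/1.0"><p>The keyword entry point shown in Fig. 1 can be illustrated with a minimal sketch. It assumes a small in-memory dictionary in place of the dictionaries used by the Identifier Resolution Service; the Entity class, the resolve function, and the example.org URIs are hypothetical and are not part of the Open PHACTS platform.</p>
<eg xml:space="preserve"><![CDATA[
```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    uri: str             # hypothetical concept URI
    preferred_name: str  # name offered in the auto-complete box
    kind: str            # context of the entry point: "compound" or "target"


# Toy dictionary; entries and URIs are illustrative placeholders only.
DICTIONARY = [
    Entity("http://example.org/concept/menthol", "menthol", "compound"),
    Entity("http://example.org/concept/menthol-receptor", "menthol receptor", "target"),
    Entity("http://example.org/concept/methanol", "methanol", "compound"),
]


def resolve(text: str, context: str) -> List[Entity]:
    """Return entities of the requested kind whose name starts with the typed text."""
    prefix = text.strip().lower()
    return [
        entity
        for entity in DICTIONARY
        if entity.kind == context and entity.preferred_name.lower().startswith(prefix)
    ]


# The same typed prefix, resolved through two different entry points.
compound_hits = resolve("menth", "compound")
target_hits = resolve("menth", "target")
```
]]></eg>
<p>Under these assumptions, the same typed prefix resolves to different concepts depending on the entry point chosen, which is the disambiguation role that the context information plays.</p></div>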
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Screenshot showing the integrated information returned for Aspirin.</figDesc><graphic coords="2,148.14,223.81,301.27,225.04" type="bitmap" /></figure>
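<div xmlns="http://www.tei-c.org/ns/1.0"><p>The integrated view in Fig. 2 relies on expanding a single concept URI into the equivalent URIs used by each data source. The sketch below illustrates one possible way to do this, assuming a hypothetical in-memory identity map and a SPARQL VALUES clause as the expansion mechanism; the template, the function names, and the example.org URIs are illustrative assumptions rather than the queries used by the platform.</p>
<eg xml:space="preserve"><![CDATA[
```python
from string import Template

# Hypothetical identity map: one concept URI and its equivalents per data source.
IDENTITY_MAP = {
    "http://example.org/concept/aspirin": [
        "http://example.org/chemspider/aspirin",
        "http://example.org/chembl/aspirin",
        "http://example.org/drugbank/aspirin",
    ],
}

# Parameterized query: $uris is filled in when the query is built.
QUERY_TEMPLATE = Template(
    "SELECT ?source ?property ?value WHERE {\n"
    "  VALUES ?source { $uris }\n"
    "  ?source ?property ?value .\n"
    "}"
)


def expand(concept_uri):
    """Return the concept URI followed by its known equivalents, without duplicates."""
    uris = [concept_uri] + IDENTITY_MAP.get(concept_uri, [])
    return list(dict.fromkeys(uris))


def build_query(concept_uri):
    """Fill the template with every URI considered equivalent to the concept."""
    values = " ".join("<" + uri + ">" for uri in expand(concept_uri))
    return QUERY_TEMPLATE.substitute(uris=values)


query = build_query("http://example.org/concept/aspirin")
print(query)
```
]]></eg>
<p>Keeping the mapping outside the query template means that, in this sketch, a new data source only requires a new entry in the identity map and leaves the query text unchanged.</p></div>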
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0">http://www.openphacts.org/ accessed May 2012.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ACKNOWLEDGEMENTS</head><p>The research leading to these results has received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement number 115191, resources of which are composed of financial contribution from the European Union's Seventh Framework Programme (FP7/2007-2013) and EFPIA companies' in kind contribution.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data</title>
		<author>
			<persName><forename type="first">B</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Jiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Wild</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BMC Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">255</biblScope>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Infrastructure for the life sciences: design and implementation of the UniProt website</title>
		<author>
			<persName><forename type="first">E</forename><surname>Jain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bairoch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Duvaud</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Phan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Redaschi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Suzek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Martin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Mcgarvey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Gasteiger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BMC Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">136</biblScope>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Linked open drug data for pharmaceutical research and development</title>
		<author>
			<persName><forename type="first">M</forename><surname>Samwald</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jentzsch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Bouton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Kallesoe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Willighagen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hajagos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Marshall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Prud'hommeaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Hassanzadeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Pichler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Stephens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Cheminformatics</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">19</biblScope>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Open PHACTS: Semantic interoperability for drug discovery</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">J</forename><surname>Williams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Harland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Groth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Pettifer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Chichester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">L</forename><surname>Willighagen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">T</forename><surname>Evelo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Blomberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ecker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Goble</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mons</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Drug Discovery Today</title>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
	<note>To appear</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
