<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Building Interaction Profiles for Better Search Tools in DLs</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Maram</forename><surname>Barifah</surname></persName>
							<email>maram.barifah@usi.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Informatics</orgName>
								<orgName type="institution">Università della Svizzera italiana (USI)</orgName>
								<address>
									<settlement>Lugano</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Building Interaction Profiles for Better Search Tools in DLs</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">5E399D2718D69006733340B6AC34C51D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T16:42+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This research starts by considering users of a digital library (DL) and aims to use data extracted from log files, including search strategies and queries, to build effective user-interaction profiles, and to use these profiles to guide designers and system developers in the production of more usable, useful and effective search tools. It is important to stress that the proposed interaction profiles are built by extracting a number of features from the log files. Thus, they contain information about real searching experiences, including usage patterns, user familiarity with the system, and time intervals.</p><p>This study is conducted in collaboration with the RERO Doc digital library <ref type="foot" target="#foot_0">1</ref>. RERO Doc is the network of the libraries of Western Switzerland. Users from different parts of the world can search across different domains: Nursing, Economics, Computer Science and others. RERO Doc provides various document types such as books, articles, theses, and periodicals. The research questions are:</p><p>(1) What are the most suitable techniques to produce rich and realistic groups of data extracted from log files in order to build interaction profiles? (2) What are the main features that characterise interaction profiles? (3) What is the minimum amount of data needed to produce robust groups for building interaction profiles?</p><p>Data preparation and processing: Interface analysis: the interface is inspected in order to understand the different search options. Preprocessing phase: we follow the framework of <ref type="bibr" target="#b1">[2]</ref>, as follows. Data loading: the dataset consists of 59 million records (20 GB) collected over a six-month period. Data cleaning: user identifiers are hidden, and erroneous and corrupted records are eliminated. Data parsing: consists of session recognition and the removal of non-human sessions, e.g. Googlebot and SemanticScholarBot. The session "is a common unit of interaction that is used in search log analysis" <ref type="bibr" target="#b2">[3]</ref>. 
Session recognition depends on the user interactions and on the features of the interface. This phase is crucial for identifying distinct classes of searching patterns <ref type="bibr" target="#b1">[2,</ref> <ref type="bibr" target="#b2">3]</ref>. We identify a session by the combination of user IP, time stamp, and user agent extracted from the log files. Non-human requests are also removed at this stage, which reduces the data to 9 GB. Data coding: in this phase, the URL requests are analysed and divided into meaningful parts, including user IP, time stamp, request, referrer, user agent, and session IP. Researchers follow different strategies to analyse the URLs recorded in the log files. For example, <ref type="bibr" target="#b0">[1]</ref> </p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>build a hierarchical taxonomy of the website, and <ref type="bibr" target="#b1">[2]</ref> define a coding scheme for all types of interactions based on analysing and understanding the structure of the website. In this research we consider both strategies. Feature engineering: based on the interface and log file analysis, the meaningful features are identified. Mining user behaviour: the remaining sessions are further analysed and grouped based on the available variables in the records. We started with an analysis of the first million records, which comprise 125000 sessions; 72095 of these sessions contain only one record. The aim of this phase is to identify different usage patterns among user interactions and group them accordingly. Two main grouping techniques were used: topic modelling and K-Means. For a six-topic model, the coherence score is 0.35. For K-Means, the estimated number of clusters is 6, with a silhouette coefficient of 0.95. So far, six different usage patterns have been identified and interpreted qualitatively:</p><p>(1) Single sessions or known-item sessions, where searchers visit RERO to download documents without any further interaction. (2) Complicated sessions, where the usage pattern is characterised by heavy interaction, including submitting queries, browsing, and using different functions. (3) Light navigators, who navigate the library without using different functions of the interface. (4) Advanced navigators, whose navigation is characterised by the use of different functions and many iterations. (5) Light browsers, whose searching is simple and short, without the use of different functions. (6) Advanced browsers, whose interactions are long and include advanced search functions. In conclusion, log file analysis is an unobtrusive method to detect usage patterns of a digital library. 
The aim of this research in progress is to build interaction profiles in the digital library context in order to gain more insight into users' searching experiences. We plan more experiments to test and compare the effectiveness of the techniques used to group data for preparing interaction profiles. Then, we will involve experts to assess the quality of these interaction profiles and how effective they are in assisting designers and system developers in the production of more usable, useful and effective tools to support searchers.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="1,0.00,191.12,595.00,459.77" type="bitmap" /></figure>
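<div xmlns="http://www.tei-c.org/ns/1.0"><p>The session-recognition step described in the data parsing phase (sessions identified by the combination of user IP, time stamp, and user agent, with non-human requests removed) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the 30-minute inactivity threshold, the record layout, the function names, and the sample records are all assumptions introduced here, since the text does not specify them.</p>
<p>
```python
from datetime import datetime, timedelta

# Assumption: a 30-minute inactivity gap closes a session.
# The text does not state which threshold, if any, was used.
SESSION_GAP = timedelta(minutes=30)

# Crawler markers named in the text; a real filter would need a longer list.
BOT_MARKERS = ("googlebot", "semanticscholarbot")


def is_bot(user_agent):
    """Return True when the user agent contains a known crawler marker."""
    agent = user_agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


def sessionize(records):
    """Group (ip, timestamp, user_agent, request) tuples into sessions.

    Requests sharing the same (ip, user_agent) pair stay in one session
    until the gap between consecutive requests exceeds SESSION_GAP.
    """
    sessions = []
    open_session = {}  # (ip, user_agent) -> session list currently being extended
    for ip, ts, agent, request in sorted(records, key=lambda r: r[1]):
        if is_bot(agent):
            continue  # drop non-human requests
        key = (ip, agent)
        current = open_session.get(key)
        if current is None or ts - current[-1][0] > SESSION_GAP:
            current = []
            open_session[key] = current
            sessions.append(current)
        current.append((ts, request))
    return sessions


# Hypothetical log records, invented purely for illustration.
records = [
    ("10.0.0.1", datetime(2018, 1, 1, 9, 0), "Mozilla/5.0", "/search?q=nursing"),
    ("10.0.0.1", datetime(2018, 1, 1, 9, 5), "Mozilla/5.0", "/record/123"),
    ("10.0.0.1", datetime(2018, 1, 1, 11, 0), "Mozilla/5.0", "/record/456"),
    ("10.0.0.2", datetime(2018, 1, 1, 9, 1), "Googlebot/2.1", "/record/789"),
]
sessions = sessionize(records)
print(len(sessions), [len(s) for s in sessions])
```
</p>
<p>In this toy input the crawler request is discarded, and the remaining requests from the same IP and user agent are split wherever the assumed inactivity gap is exceeded. Sessions containing a single record, such as the last one here, correspond to the one-record sessions reported above.</p></div>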
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://doc.rero.ch</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2018" xml:id="foot_1">DESIRES, August 2018, Bertinoro, Italy. © 2018 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Stochastic modeling of usage patterns in a web-based information system</title>
		<author>
			<persName><forename type="first">Hui-Min</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><forename type="middle">D</forename><surname>Cooper</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the Association for Information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">53</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page" from="536" to="548" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Towards an integrated clickstream data analysis framework for understanding web users&apos; information behavior</title>
		<author>
			<persName><forename type="first">Yu</forename><surname>Chi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tingting</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daqing</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rui</forename><surname>Meng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">iConference 2017 Proceedings</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Categorising search sessions: some insights from human judgments</title>
		<author>
			<persName><forename type="first">Tony</forename><surname>Russell-Rose</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Paul</forename><surname>Clough</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elaine</forename><forename type="middle">G</forename><surname>Toms</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th Information Interaction in Context Symposium</title>
				<meeting>the 5th Information Interaction in Context Symposium</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="251" to="254" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
