<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Mining Knowledge TV: A Proposal for Data Integration in the Knowledge TV Environment</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">José</forename><forename type="middle">Carlos</forename><surname>Almeida Patrício Junior</surname></persName>
							<email>jcapjunior@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Universidade Federal da Paraíba João Pessoa -PB</orgName>
								<address>
									<country key="BR">Brasil</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Natasha</forename><surname>Correia Queiroz</surname></persName>
							<email>natasha@di.ufpb.br</email>
							<affiliation key="aff1">
								<orgName type="institution">Universidade Federal da Paraíba João Pessoa -PB</orgName>
								<address>
									<country key="BR">Brasil</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Mining Knowledge TV: A Proposal for Data Integration in the Knowledge TV Environment</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">4D9A95136F8DD5E21148724568453C73</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T01:23+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>H.3.4 [Systems and Software]: Information networks</term>
					<term>Algorithms</term>
					<term>Design</term>
					<term>Languages</term>
					<term>Standardization</term>
					<term>Data Mining</term>
					<term>Digital TV</term>
					<term>Digital TV personalisation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents Mining Knowledge TV, a module for data mining that is part of the Knowledge TV (KTV) Project. KTV proposes the specification of a semantic layer that is embedded in a Digital TV (DTV) environment, improving the way that content is accessed by other applications.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>Interactive Digital TV <ref type="bibr" target="#b1">[1,</ref><ref type="bibr" target="#b2">2,</ref><ref type="bibr" target="#b8">8]</ref> is a new stage of TV technology that intends to support the convergence of digital technologies through a systematic change from analog to digital equipment and infrastructure. This change affects the whole production chain, especially the consumption of final content.</p><p>In this scenario, this paper presents the specification of Mining Knowledge TV (MKTV), which focuses on the integration of data mining <ref type="bibr" target="#b3">[3]</ref> technology with semantic aspects, most of them derived from AI Knowledge Representation and Semantic Web <ref type="bibr" target="#b4">[4,</ref><ref type="bibr">5,</ref><ref type="bibr" target="#b6">6]</ref> research. MKTV is being developed in the context of the Brazilian Digital TV System (SBTVD) and is part of the project goal of giving interactive DTV a semantic layer.</p><p>Among other aspects, it aims to provide a rich knowledge base of data descriptions, resources, services, applications and the relations among such elements.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">MINING KNOWLEDGE TV - MKTV</head><p>The main aim of MKTV is the implementation of a knowledge discovery in databases (KDD) environment that focuses on data mining and semantic information on the Knowledge TV platform <ref type="bibr" target="#b7">[7]</ref>. This solution will provide previously unknown information to DTV applications that use the SBTVD Ginga middleware <ref type="bibr" target="#b8">[8]</ref>, so that they can address issues such as information overload, personalization and targeted merchandising.</p><p>The mining process will be carried out on data from many sources, mainly the Service Information (SI) metadata table, which uses the MPEG-2 standard in the Ginga DTV environment. This standard is used to represent information about TV programs, services and multimedia interaction. Examples of such information are channels, program schedules and program classifications. User behaviour is also an important information source because it indicates, for example, the channels usually watched, together with the start time and total watching period. The useful content obtained by means of data mining will be semantically enriched through the use of ontologies and then provided as a service to developers of NCL or Java applications. This is possible because the Ginga architecture supports the development of applications in both languages. More information about the Ginga architecture can be found in <ref type="bibr" target="#b8">[8]</ref>.</p><p>The data mining process acts on all these sources and generates new information that is semantically enriched by means of a domain ontology. This semantic process enables a better analysis and makes the meaning of the knowledge discovered through data mining more explicit. This semantic information is provided as a service and creates opportunities for NCL or Java developers to implement more powerful and sophisticated applications.</p></div>
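<div xmlns="http://www.tei-c.org/ns/1.0"><p>The user behaviour source described above (channels watched, start time and total watching period) can be illustrated with a minimal sketch. It is only an illustration under stated assumptions and not part of the MKTV specification: the record layout, the field names (user_id, channel, start, duration_s), the helper functions and the sample values are all hypothetical.</p><p>
```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ViewingEvent:
    """One hypothetical viewing record collected from a set-top box."""
    user_id: str
    channel: str
    start: datetime
    duration_s: int  # total watching period, in seconds


def watch_time_per_channel(events):
    """Sum the watching period for every (user, channel) pair."""
    totals = defaultdict(int)
    for event in events:
        totals[(event.user_id, event.channel)] += event.duration_s
    return dict(totals)


def usual_channels(events, user_id, top_n=2):
    """Return a user's channels ordered by total watching time, longest first."""
    totals = watch_time_per_channel(events)
    own = [(channel, secs) for (uid, channel), secs in totals.items() if uid == user_id]
    # Sort by descending watch time; ties are broken alphabetically.
    own.sort(key=lambda pair: (-pair[1], pair[0]))
    return [channel for channel, _ in own[:top_n]]


# Hypothetical sample data, used only to exercise the functions above.
SAMPLE_EVENTS = [
    ViewingEvent("u1", "news", datetime(2010, 5, 1, 20, 0), 1800),
    ViewingEvent("u1", "sports", datetime(2010, 5, 1, 21, 0), 3600),
    ViewingEvent("u1", "news", datetime(2010, 5, 2, 20, 0), 2400),
    ViewingEvent("u2", "movies", datetime(2010, 5, 1, 22, 0), 5400),
]
```
</p><p>Calling usual_channels(SAMPLE_EVENTS, "u1") on this sample ranks the user's channels by accumulated watching time, which is the kind of aggregate a mining step could take as input.</p></div>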
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">ARCHITECTURE DESCRIPTION</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Investigation of solution for data mining</head><p>Brazilian DTV is a new environment of technological convergence that is extremely susceptible to change. It is not yet completely standardized and is constantly being updated. These characteristics impose restrictions that we must consider during architectural modelling. The main restrictions are:</p><p>The small processing capacity of the set-top box (STB);</p><p>Reduced and unstable space for persistence of information;</p><p>Mechanism for exclusion of applications when changing channels, i.e., a change of channel deletes all application information related to that channel.</p><p>All these limitations in the architecture of the STB led us to adopt a hybrid approach that is detached from the middleware. The KTV components (and consequently the MKTV components) with the highest consumption of resources, such as processing power and memory, will exploit the Ginga middleware return channel <ref type="bibr" target="#b8">[8]</ref>. The return channel is the implementation of the HTTP protocol in the DTV environment. Thus, some components will run on the Web and communicate with DTV components via the Internet.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Architecture</head><p>Mining Knowledge TV (MKTV) is the component of the Knowledge TV architecture that accounts for the discovery and treatment of useful knowledge from DTV data, user behaviour and other sources such as the Web. These data are initially stored in a local relational database, after which the extraction, transformation and load (ETL) process is gradually applied. After the ETL process, the data move on to the next module, the Data Warehouse (DW) <ref type="bibr" target="#b3">[3]</ref>, a technique that is commonly used in conjunction with data mining <ref type="bibr" target="#b3">[3]</ref>. The DW will be organized in departmental Data Marts, in accordance with the domains and tasks to be mined (e.g. personalization, marketing, business), concentrating historical and integrated data.</p><p>Once the historical data are organised in the DW, the Data Mining module applies data mining algorithms, searching for and discovering useful patterns and previously unknown information in the existing DW. The knowledge extracted through MKTV will be encapsulated in semantic files with more expressive power (OWL files). Ontology specifications in OWL will be the standard for communication between the modules of the KTV. Figure <ref type="figure" target="#fig_0">1</ref> illustrates the KTV conceptual architecture and the MKTV module.</p><p>One application scenario is the problem of recommendation and personalization of content. To deal with such problems, specific modules, specified in the conceptual architecture, will be instantiated and executed. First, the system stores the data that come from the STB in a database. Then, the information related to the programs watched by the user is extracted to the Personalization Data Mart in the DW. After this process, clustering algorithms are used to find groups with similar preferences. The discovered knowledge will feed and enrich the ontology specified in the semantic modelling layer, and the discovered pattern will be returned to the user in the form of a recommendation, for example the next available programs similar to the ones the user usually watches. Depending on the data mining goal, other tasks and algorithms can be applied to discover the desired knowledge.</p></div>
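<div xmlns="http://www.tei-c.org/ns/1.0"><p>The personalization scenario above, in which users with similar preferences are grouped, can be sketched as follows. This is a minimal illustration rather than the MKTV implementation: k-means is chosen here only as one example of a clustering algorithm (the text does not fix a specific one), and the genre names, user identifiers, watched seconds and function names are hypothetical.</p><p>
```python
import math


def genre_profile(seconds_by_genre, genres):
    """Turn watched seconds per genre into a vector of shares that sums to 1."""
    total = sum(seconds_by_genre.get(g, 0) for g in genres)
    if total == 0:
        return [0.0 for _ in genres]
    return [seconds_by_genre.get(g, 0) / total for g in genres]


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical viewing history: watched seconds per genre for each user.
GENRES = ["news", "sports", "movies"]
HISTORY = {
    "u1": {"news": 4200, "sports": 600},
    "u2": {"movies": 5400, "news": 300},
    "u3": {"news": 3000, "sports": 900},
    "u4": {"movies": 4000},
}
USERS = sorted(HISTORY)
PROFILES = [genre_profile(HISTORY[u], GENRES) for u in USERS]
LABELS = dict(zip(USERS, kmeans(PROFILES, k=2)))
```
</p><p>With this sample, the two news-oriented users fall into one group and the two movie-oriented users into the other. A recommendation step could then propose, to each user, programs watched by other members of the same group.</p></div>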
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">CONCLUSIONS AND DIRECTIONS</head><p>This paper describes our initial work on Mining Knowledge TV (MKTV), which is part of the KTV project. The major aim of this project is to provide semantic knowledge to be used by other DTV applications. MKTV is currently under development; so far, we have carried out a survey of the state of the art in data mining for DTV. In addition, we have identified the main data mining methods and algorithms that are currently used in DTV, together with a list of tools that are compatible with this new computational and interactive platform.</p><p>The novelty of this proposal is supported by the small number of DTV works that focus on the joint use of knowledge representation and data mining techniques to generate a better quality set of data.</p><p>The next MKTV activities are to simulate DTV data traffic and to integrate content from the data mining process and the semantic modelling sub-layer. As future work, during its validation stage, MKTV will collaborate with the JCollab Project [16], whose aim is to develop a platform to create journalistic content via a social network. Another potential line of future work is investigating the integration of the MKTV solution with other Digital TV systems.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1.</head><label>1</label><figDesc>Figure 1. KTV Architecture and MKTV module</figDesc><graphic coords="2,54.00,429.52,227.50,169.35" type="bitmap" /></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Interactive Digital Television: Technologies and Applications</title>
		<author>
			<persName><forename type="first">G</forename><surname>Lekakos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Chorianopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Doukidis</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
			<pubPlace>USA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">G</forename><surname>Lemos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fernandes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Elias</surname></persName>
		</author>
		<title level="m">Introdução à Televisão Digital Interativa: Arquitetura, Protocolos, Padrões e Práticas</title>
				<meeting><address><addrLine>Salvador, Bahia, Brazil</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
	<note>JAI Jornada de Atualização em informática</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Data Mining Concepts and Techniques</title>
		<author>
			<persName><forename type="first">J</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kamber</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
			<publisher>Elsevier</publisher>
			<pubPlace>UK</pubPlace>
		</imprint>
	</monogr>
	<note>2nd edition</note>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">NoTube -Making TV a Medium for Personalized Interaction</title>
		<author>
			<persName><forename type="first">L</forename><surname>Aroyo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Conconi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dietze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kaptein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Nixon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Nufer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Palmisano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Vignaroli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yankova</surname></persName>
		</author>
		<meeting>EuroITV<address><addrLine>Leuven, Belgium</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Semantic TV resources brokering towards future television</title>
		<author>
			<persName><forename type="first">H</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dietze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Benn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">1st NoTube workshop on Future Television</title>
				<meeting><address><addrLine>EuroITV</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<ptr target="http://www.w3.org/2001/sw/" />
		<title level="m">W3C Semantic Web Activity</title>
				<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>World Wide Web Consortium</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Aspectos Semânticos e Convergência Digital (Web e TV Digital)</title>
		<author>
			<persName><forename type="first">N</forename><surname>Lino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Araújo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lemos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Siebra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 2a. Conferência Web W3C Brasil (W3C Web.br 2010)</title>
				<meeting>2a. Conferência Web W3C Brasil (W3C Web.br 2010)<address><addrLine>Belo Horizonte, Brazil</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Ginga-J: The Procedural Middleware for the Brazilian Digital TV System</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">L D</forename><surname>Souza Filho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">E C</forename><surname>Leite</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">E C F</forename><surname>Batista</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the Brazilian Computer Society</title>
		<idno type="ISSN">0104-6500</idno>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="47" to="56" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">JCollab: Uma Ferramenta para Produção e Distribuição de Telejornais no Contexto da Web 2.0</title>
		<author>
			<persName><forename type="first">J</forename><surname>Mangueira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Oliveira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Alves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Medeiros</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Lemos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">XXXVI Conferência Latino-Americana de Informática - CLEI</title>
				<meeting><address><addrLine>Asunción, Paraguay</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
