<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main"></title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<affiliation key="aff0">
								<orgName type="department">MELINDA TÓTH and TAMÁS KOZSIK</orgName>
								<orgName type="laboratory" key="lab1">VIKTÓRIA FÖRDŐS</orgName>
								<orgName type="laboratory" key="lab2">ELTE-Soft Nonprofit Ltd</orgName>
								<orgName type="institution">Eötvös Loránd University</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">Ericsson-ELTE-Soft-ELTE Software Technology Lab</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="institution">Pázmány Péter sétány</orgName>
								<address>
									<addrLine>C</addrLine>
									<postCode>1117</postCode>
									<settlement>Budapest</settlement>
									<country key="HU">Hungary</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff3">
								<orgName type="institution">Pázmány Péter sétány</orgName>
								<address>
									<postCode>1117</postCode>
									<settlement>Budapest</settlement>
									<country key="HU">Hungary</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff4">
								<orgName type="institution">Pázmány Péter sétány</orgName>
								<address>
									<addrLine>C</addrLine>
									<postCode>1117</postCode>
									<settlement>Budapest</settlement>
									<country key="HU">Hungary</country>
								</address>
							</affiliation>
						</author>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">FD3D2A9389025ED9B314F7D313A992EE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T07:48+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>D.2.3 [Software Engineering] Coding Tools and Techniques</term>
					<term>D.2.7 [Software Engineering] Distribution, Maintenance, and Enhancement</term>
					<term>F.2.2 [Analysis of algorithms and problem complexity] Nonnumerical Algorithms and Problems</term>
					<term>Languages, Design</term>
					<term>Grouping, Filtering, Clone detection, Erlang, Suffix tree, Static program analysis</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Code clones are unwanted phenomena in legacy code that make software maintenance and development hard. Detecting these clones manually is almost impossible, so several code analyser tools have been developed to identify them. Most of these detectors apply a general token- or syntax-based solution and do not use domain-specific knowledge about the language or the software. The result of such detectors therefore contains irrelevant clones as well. In this paper we show an algorithm that refines the result of existing clone detectors with user-defined, domain-specific predicates, preserving only useful groups of clones and removing clones that are insignificant from the point of view defined by the user.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>Code clones, the result of the "copy&amp;paste" programming technique, have a negative impact on software quality and on the efficiency of the software maintenance process. Although copying may be the fastest way of creating a new feature, after a while it becomes really hard to detect and maintain the multiple instances of the same code snippets.</p><p>Based on static source code analysis, clone detectors try to identify code clones automatically. Several clone detectors exist <ref type="bibr" target="#b12">[Roy et al. 2009</ref>], applying different techniques to select the clones. These techniques include string-, token-, syntax- and also semantics-based approaches.</p><p>In the context of the Erlang programming language <ref type="bibr" target="#b0">[Armstrong 2007</ref>], there are three clone detectors <ref type="bibr" target="#b10">[Li and Thompson 2009;</ref><ref type="bibr" target="#b6">Fördős and Tóth 2014b;</ref><ref type="bibr" target="#b4">2013]</ref> implementing different techniques to select duplicated code. Although the clones identified by these techniques can be considered duplicates, some of them are irrelevant from certain points of view. The filtering system of Clone IdentifiErl allows users to tailor the result in different ways using domain-specific knowledge about the language.</p><p>This filtering technique can be easily applied to duplicate code detectors that yield clone pairs <ref type="bibr" target="#b6">[Fördős and Tóth 2014b;</ref><ref type="bibr" target="#b4">2013]</ref>: it simply leaves out the pairs that do not fulfil the requirements. When a clone detector groups the identified clones <ref type="bibr" target="#b1">[Baker 1996;</ref><ref type="bibr" target="#b9">Koschke 2012;</ref><ref type="bibr" target="#b5">Fördős and Tóth 2014a]</ref>, the result is more comprehensible, but filtering becomes less straightforward. 
Filtering out some part of a group of clones results in smaller groups of clones. Sometimes smaller means that we have fewer group members, in other cases that we have smaller clones, or both.</p><p>In this paper we show a general, language-independent algorithm to refine the result of existing clone detectors that produce groups of clones. We apply domain-specific predicates to the clones to select useful groups of clones from different points of view. For example, the clone elimination process can be simplified by removing clone instances that are difficult to eliminate. The filtering system can also be used to exclude from the result those clones that are duplicates of given exceptions. For instance, generated source code may need to be removed from the results. We also emphasize that maintenance time can be decreased by focusing on the clones that cause the hardest problems, so the developer can work on the most useful maintenance tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">RELATED WORK</head><p>Clone detection is a wide field of research. Here we focus on filtering and grouping techniques.</p><p>Several detectors for duplicates exist, but only a few of them concentrate on functional languages, such as <ref type="bibr" target="#b3">[Brown and Thompson 2010]</ref> developed for Haskell, and Wrangler <ref type="bibr" target="#b10">[Li and Thompson 2009]</ref> for Erlang. We have proposed Clone IdentifiErl <ref type="bibr" target="#b4">[Fördős and Tóth 2013]</ref> for Erlang, which is an AST/metric based approach. We have also published a purely metric driven algorithm <ref type="bibr" target="#b6">[Fördős and Tóth 2014b</ref>] that characterises the Erlang language by using software metrics to identify clones in Erlang programs. In Clone IdentifiErl, a new standalone extensible filtering system has been introduced to filter out irrelevant clones, whilst in <ref type="bibr" target="#b6">[Fördős and Tóth 2014b]</ref> we have given a filtering system that is capable of removing both irrelevant and false positive clones. The papers <ref type="bibr" target="#b6">[Fördős and Tóth 2014b;</ref><ref type="bibr" target="#b4">2013]</ref> argued for the necessity of this step, and presented a domain-specific implementation for Erlang.</p><p>Earlier, <ref type="bibr" target="#b8">Juergens and Göde [2010]</ref> have proposed an iterative, configurable clone detector (ConQAT) containing a filtering system. ConQAT can remove repetitive generated code fragments and overlapping clones by iteratively reconfiguring and rerunning its initial clone detector.</p><p>Different clone detection techniques have been used by known detectors. Some of them, e.g. the suffix-tree algorithm <ref type="bibr" target="#b1">[Baker 1996</ref>], form groups from the resulting clones. On the other hand, there are algorithms that produce clone pairs. 
In this latter case, it is also possible to group the results, but this has an additional computational overhead. For example, paper <ref type="bibr" target="#b5">[Fördős and Tóth 2014a]</ref> shows a general and broadly usable method to group the result of clone detection algorithms.</p><p>Although significant research has been carried out in this area, grouping and filtering in one step is a novel technique. The method, presented in this paper, does not conflict with the already existing techniques and tools: it is an additional tool to refine existing results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">IMPROVING CLONE DETECTION</head><p>There are many algorithms for detecting code duplicates. They differ in accuracy as well as in execution time. Our goal is to improve accuracy without compromising efficiency. In this paper, we propose a standalone and language-independent approach that can be used after any duplicated code detector to tailor its results. We also present how it can be configured to facilitate accurate clone detection in Erlang programs.</p><p>Clone detectors produce either clone pairs or clone groups. Our algorithm can improve the result regardless of its format, because pairs can be considered groups of two elements. Therefore, our algorithm is defined to work with initial clone groups. For each initial clone group, it checks whether all elements of the group are "relevant" enough. The goodness of an element is judged by easily replaceable filters, thus the behaviour of the algorithm can be customised to fit various purposes. Our algorithm decomposes a group into sub-groups on which all the filters hold, and removes irrelevant elements from the result.</p><p>The algorithm provided here assumes certain properties of the initial groups. Namely, each clone in a group must have the same number of building blocks. In our Erlang implementation, the building block is a top-level expression, where a top-level expression refers to the main expression or to a sequence of main expressions making up a function clause. There are many clone detectors <ref type="bibr" target="#b12">[Roy et al. 2009</ref>] that produce initial clone groups for which the required property holds. Note that a well-known detection technique, the suffix tree based algorithm <ref type="bibr" target="#b1">[Baker 1996;</ref><ref type="bibr" target="#b9">Koschke 2012]</ref>, provides initial clone groups with the required property. 
A clone detector employing a suffix tree <ref type="bibr" target="#b7">[Gusfield 1997</ref>] has been implemented in Wrangler <ref type="bibr" target="#b10">[Li and Thompson 2009]</ref> (and also in RefactorErl [RefactorErl project 2014]), and we will use this kind of initial group for the practical evaluation of our approach in this paper. Before we detail our algorithm, we briefly review the general clone detection technique that uses a suffix tree.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Suffix tree</head><p>Usually, clone detectors that use a suffix tree are token-based algorithms. Tokens of the analyzed program are mapped to words of a formal language based on the token kind. For instance, identifiers and literals can be mapped to 'a', the begin keyword is mapped to 'b', the case keyword is mapped to 'c' and so on. The suffix tree is constructed from the transformed tokens. The groups of initial clones are gathered as subtrees of the entire suffix tree. This step requires O(n log n) steps in the worst case, where n denotes the number of tokens.</p><p>The main advantages of this technique are the low computational cost and the compact result, because it demands no further grouping. Several duplicated code detectors <ref type="bibr" target="#b1">[Baker 1996;</ref><ref type="bibr" target="#b9">Koschke 2012]</ref> use this algorithm as their initial clone detector, because suffix tree based clone detection is a general technique that can be easily applied to detect clones in any programming language.</p><p>However, its general applicability is also the source of its weak points. It is a token-based detector with no built-in knowledge of programming languages, thus several clones forming no valid syntactical unit may appear in its result. These clones need to be cut further to meet the syntactical rules of the programming language in which the clones were implemented. Consequently, it produces several useless clones that consist of a few tokens and have nearly no syntactic characteristics. Last but not least, finding gapped clones is impossible, because the suffix tree algorithm cannot deal with these clones.</p></div>
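<div xmlns="http://www.tei-c.org/ns/1.0"><p>The token abstraction described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the symbol table KEYWORD_SYMBOLS, the (kind, text) token format and both function names are hypothetical, and the repeated-substring search is a naive dictionary-based stand-in, not a suffix tree.</p>
<p><code lang="python"><![CDATA[
```python
# Hypothetical symbol table; the concrete alphabet is an assumption.
KEYWORD_SYMBOLS = {"begin": "b", "case": "c", "end": "e", "of": "o"}


def abstract_tokens(tokens):
    """Map (kind, text) token pairs to a word over a small alphabet."""
    symbols = []
    for kind, text in tokens:
        if kind in ("identifier", "literal"):
            symbols.append("a")  # identifiers and literals collapse to 'a'
        elif kind == "keyword":
            symbols.append(KEYWORD_SYMBOLS[text])
        else:
            symbols.append(text)  # punctuation is kept unchanged
    return "".join(symbols)


def repeated_substrings(word, length):
    """Naive stand-in for the suffix tree: start offsets of every
    substring of the given length that occurs more than once."""
    offsets = {}
    for start in range(len(word) - length + 1):
        offsets.setdefault(word[start:start + length], []).append(start)
    return {sub: pos for sub, pos in offsets.items() if len(pos) > 1}


# Two 'case' expressions that differ only in identifiers and literals.
first = [("keyword", "case"), ("identifier", "X"), ("keyword", "of"),
         ("literal", "1"), ("keyword", "end")]
second = [("keyword", "case"), ("identifier", "Y"), ("keyword", "of"),
          ("literal", "2"), ("keyword", "end")]
word = abstract_tokens(first) + abstract_tokens(second)
clone_candidates = repeated_substrings(word, 5)
```
]]></code></p>
<p>Both expressions are mapped to the same word, so the search reports them as one candidate group. A real detector would replace the fixed-length search with a suffix tree in order to find repeats of every length at once.</p></div>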
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Filtering</head><p>For each initial clone group, our algorithm checks whether its elements satisfy the required properties described by the user-defined filters. Filters are applied to the building blocks of the clone instances that belong to the same group. Filters can be chosen arbitrarily to fit any purpose.</p><p>Our algorithm allows developers to concentrate only on important clones by removing irrelevant clones from the result, as Clone IdentifiErl does. Our algorithm can also help if the result needs to be cleaned by excluding elements that are required to be duplicated. For instance, consider constraints originating from business logic, or the cases when a function acts as a bridge that connects two applications. Expressions referring to these functions are obviously clones, but they are necessarily present. As another example, consider that the examined source code contains source code generated by a parser generator (e.g. yecc), which can be excluded from the clone groups by using our algorithm. To the best of our knowledge, no Erlang-specific approach can handle such exceptions.</p><p>In this paper, we show how our algorithm can ease the elimination process by using Erlang-specific filters. Although a suffix tree based detector is employed to produce the initial clones, our implementation already ensures that the initial clones of a group are real syntactic clones that differ only in the identifiers and operators used. Thus, to ease the elimination process, we only want to check that the following two predicates hold for a group.</p><p>- The elements of the group refer to nearly the same set of functions.</p><p>The first property excludes clones whose elements call mostly different functions, because the functionalities implemented in these clones are likely to be independent. Thus, the elimination of these clones is not a high-priority task. 
For instance, consider Figure <ref type="figure" target="#fig_1">1</ref>. The latter property is highly Erlang specific. Note that only these filters need to be replaced when tailoring the algorithm to report clones written in another programming language.</p></div>
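<div xmlns="http://www.tei-c.org/ns/1.0"><p>A filter of the first kind could be sketched in Python as follows. This is a minimal sketch and not the filter of our implementation: measuring "nearly the same set of functions" with the Jaccard index, the threshold value 0.8 and the function name are all assumptions made for illustration.</p>
<p><code lang="python"><![CDATA[
```python
def refer_to_similar_functions(calls_a, calls_b, threshold=0.8):
    """Return True if two expressions call nearly the same functions.

    Similarity is the Jaccard index of the two sets of called
    functions; the measure and the threshold are assumptions.
    """
    a, b = set(calls_a), set(calls_b)
    if not a and not b:
        return True  # neither expression calls anything
    return len(a & b) / len(a | b) >= threshold


# Hypothetical sets of functions referred to by three expressions.
left = {"lists:map/2", "io:format/2"}
right = {"lists:map/2", "io:format/2"}
other = {"ets:insert/2", "gen_server:call/2"}
```
]]></code></p>
<p>Such a predicate is reflexive and symmetric by construction, but not necessarily transitive: two expressions may each be close to a third one without being close to each other.</p></div>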
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Grouping &amp; filtering in one step</head><p>In this section we briefly explain how to process an initial clone group by filtering and regrouping clones. The algorithm decomposes a group containing m clone instances, each n units long, into sub-groups based on the filtering results.</p><p>The original group can be best understood as an expression matrix of size n × m: a column in this matrix is a clone instance, i.e. a sequence of n (top-level) expressions appearing in the program that was categorized by the initial grouping as a clone of the other columns in the same matrix. A row in the matrix contains m occurrences of a "similar" expression. For the sake of efficiency in the initial clone detection phase, this similarity may be too permissive. We can design more specific "filters" to express domain-specific knowledge about "relevant" clones, by considering two expressions similar in a more restrictive manner. For instance, we can introduce a filter which considers two expressions different if they refer to different record types, even if they were declared similar by the initial clone detection.</p><p>A sub-group is formed as an intersection of some selected rows and columns of the original expression matrix. Columns of a sub-group are clones that are relevant from the point of view of the applied filtering mechanism. Our algorithm tries to find maximal sized sub-groups, with elements consisting of as many units as possible.</p><p>For the sake of the example, assume that we characterise expressions based on the record types they refer to. Let us denote an expression with 'a' if it refers to a record type 'a'. In Figure <ref type="figure">2</ref>, a characterisation of an initial clone group can be seen. The group contains five (i.e. m = 5) three-unit long clone instances (i.e. n = 3). 
The elements of the matrix represent the records referred to by the expressions of the clones; the third element of the first row is an expression referring to record 'a' (only). Note that only two of the initial clones in the group are considered relevant by this filtering: column 1 and column 3. Furthermore, our algorithm will identify a shorter clone consisting of two top-level expressions c;e in columns 2 and 4 as well. We want to maximise both the length (number of rows) and the size (number of columns) of every constructed sub-group. These properties are orthogonal to each other, therefore maximising one impedes maximising the other.</p><p>We propose an iterative algorithm. We start by identifying sub-groups having one-unit long clone instances. For our example, such sub-groups are shown in Figure <ref type="figure">3</ref>: in each row, the elements that belong to the same sub-group are painted using the same shade of grey. Obviously, we create maximal sized sub-groups. In the third row, for instance, we could identify two sub-groups (one for d and another one for e), both containing two columns. In the first row, however, we have a sub-group with three columns.</p><p>Next, we try to improve the other dimension, i.e. to lengthen the elements of sub-groups. This goal is achieved in two steps. First, we try to join the previously determined sub-groups; here care must be taken not to lose any existing maximal sized sub-groups. We can join two sub-groups if there are no rows between them (clones are continuous blocks of top-level expressions), and they share at least two columns (i.e. at least two clones contain the matching expressions). Furthermore, if there is a clone instance in any of the to-be-joined sub-groups that is not included in the newly created sub-group, then the original sub-group containing this clone instance must be preserved (otherwise it can be thrown away). 
We will come back to this covering problem soon.</p><p>Joining is illustrated in Figure <ref type="figure" target="#fig_2">4</ref>. Note that matrix elements may belong to multiple sub-groups, as happened with the bs in the second row of the expression matrix. We could join those bs with the as of the first row, as well as with the ds of the third one.</p><p>In the second step of sub-group lengthening, we iteratively glue overlapping sub-groups together. Consider the left half of Figure <ref type="figure">5</ref>: two groups (a;b in the first two rows and b;d in the last two rows) are glued together based on the overlap in the second row (dashed rectangles). Naturally, it is required that glued sub-groups have at least two common columns: without that they would not be clones. Here the sub-groups share the first and the third columns.</p><p>Again, if a clone instance that belongs to any of the input sub-groups is not present in the glued sub-group, then its containing sub-group must be preserved. (We will refer to this phenomenon as the new sub-group not covering the old one.) This can be observed in the right half of Figure <ref type="figure">5</ref>. After constructing the new sub-group a;b;d in columns 1 and 3, we can drop sub-group b;d, but we must preserve the sub-group a;b in the first two rows, since it contains a third column, which is not included in the new sub-group, and hence the new sub-group does not cover the original a;b sub-group.</p><p>The gluing step is repeated until there are no sub-groups that can be glued together. Then the algorithm terminates, and outputs the determined sub-groups as a refined grouping of clones.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">FORMAL DESCRIPTION</head><p>Now we define our filtering&amp;grouping algorithm more precisely. The algorithm operates on a single clone group, represented as an expression matrix of size n × m. Each column represents a sequence of top-level expressions in a function clause (a clone instance), and a row corresponds to similar expres- </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Joining clone instances</head><p>The second step of the algorithm takes the clone subgroups containing one-unit long clones, and tries to join them.</p><formula xml:id="formula_0">S_1 \rightarrow S_2</formula><p>Joining can be defined in two steps. First, we introduce $S'_1$ as follows.</p><formula xml:id="formula_1">S'_1 = \{\, ((\ell_1, u_2),\ c_1 \cap c_2) \mid ((\ell_1, u_1), c_1) \in S_1,\ ((u_1 + 1, u_2), c_2) \in S_1,\ |c_1 \cap c_2| &gt; 1 \,\}</formula><p>Let the binary relation covers over selections be defined as a partial order in the following way:</p><formula xml:id="formula_2">s_1 \text{ covers } s_2 \iff s_2.c \subseteq s_1.c \,\land\, s_1.r.\ell \le s_2.r.\ell \,\land\, s_2.r.u \le s_1.r.u</formula><p>Finally, we can provide $S_2$ by combining $S_1$ and $S'_1$ and eliminating selections that are already covered by other, larger selections.</p><p>$S_2 = S'_1 \cup \bigl( S_1 \setminus \{\, s \mid \exists s' \in S'_1 : s' \text{ covers } s \,\} \bigr)$</p></div>
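<div xmlns="http://www.tei-c.org/ns/1.0"><p>The joining step can be sketched in Python as follows. This is a sketch under assumptions, not the evaluated implementation: a selection is encoded as a pair ((lower, upper), columns) with 1-based row and column indices and a frozenset of columns, and the input set S1 is a hypothetical example loosely modelled on the running example of Section 3.3.</p>
<p><code lang="python"><![CDATA[
```python
def covers(s1, s2):
    """s1 covers s2: s1 spans at least the rows and columns of s2."""
    (l1, u1), c1 = s1
    (l2, u2), c2 = s2
    return c2 <= c1 and l1 <= l2 and u2 <= u1


def join(S1):
    """Join selections lying in adjacent rows that share two or more
    columns, then keep the old selections no joined one covers."""
    joined = set()
    for (l1, u1), c1 in S1:
        for (l2, u2), c2 in S1:
            common = c1 & c2
            if l2 == u1 + 1 and len(common) > 1:
                joined.add(((l1, u2), common))
    kept = {s for s in S1 if not any(covers(t, s) for t in joined)}
    return joined | kept


# Hypothetical one-unit selections for a 3 x 5 expression matrix.
S1 = {
    ((1, 1), frozenset({1, 3, 5})),
    ((2, 2), frozenset({1, 3, 5})),
    ((2, 2), frozenset({2, 4})),
    ((3, 3), frozenset({1, 3})),
    ((3, 3), frozenset({2, 4})),
}
S2 = join(S1)
```
]]></code></p>
<p>In this example every one-unit selection is covered by a two-unit one, so S2 contains only the three joined selections.</p></div>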
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Glueing clone instances</head><p>The third step, which must be repeated until a fixed point is reached (which will happen after no more than n − 2 iterations), is also described in two steps.</p><formula xml:id="formula_3">S'_i = \{\, s \mid s_1, s_2 \in S_i,\ s_1.r \text{ overlaps with } s_2.r,\ s.r.\ell = \min(s_1.r.\ell, s_2.r.\ell),\ s.r.u = \max(s_1.r.u, s_2.r.u),\ s.c = s_1.c \cap s_2.c,\ |s.c| &gt; 1 \,\}</formula><p>where two blocks of rows overlap in the following sense:</p><p>$(\ell_1, u_1)$ overlaps with $(\ell_2, u_2)$ if and only if $(\ell_1 \le \ell_2 \le u_1) \lor (\ell_2 \le \ell_1 \le u_2)$.</p><p>Now we can define $S_{i+1}$ by removing all the selections from $S'_i$ that are covered by other, larger selections.</p><p>$S_{i+1} = S'_i \setminus \{\, s \mid \exists s' \in S'_i : s' \text{ covers } s \,\}$</p><p>When the iteration of this third step reaches a fixed point, the last set of selections, $S_t$, can be used to determine the set of subgroups returned by our algorithm. For each selection $((\ell, u), c) \in S_t$, we yield a subgroup of size $(u - \ell + 1) \times |c|$, containing the intersection of the selected rows and columns of the initial expression matrix.</p></div>
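<div xmlns="http://www.tei-c.org/ns/1.0"><p>The glueing iteration can be sketched in Python as follows. Again this is only a sketch under assumptions: selections are encoded as ((lower, upper), columns) pairs with a frozenset of columns, "covered by other, larger selections" is read as being covered by a different selection, and the input S2 is a hypothetical example loosely modelled on Section 3.3.</p>
<p><code lang="python"><![CDATA[
```python
def covers(s1, s2):
    """s1 covers s2: s1 spans at least the rows and columns of s2."""
    (l1, u1), c1 = s1
    (l2, u2), c2 = s2
    return c2 <= c1 and l1 <= l2 and u2 <= u1


def overlaps(r1, r2):
    """Two blocks of rows share at least one row."""
    (l1, u1), (l2, u2) = r1, r2
    return l1 <= l2 <= u1 or l2 <= l1 <= u2


def glue_once(S):
    """One iteration: glue overlapping selections sharing two or more
    columns, then drop every selection covered by a different one."""
    glued = set()
    for r1, c1 in S:
        for r2, c2 in S:
            common = c1 & c2
            if overlaps(r1, r2) and len(common) > 1:
                rows = (min(r1[0], r2[0]), max(r1[1], r2[1]))
                glued.add((rows, common))
    return {s for s in glued
            if not any(t != s and covers(t, s) for t in glued)}


def glue(S):
    """Repeat the glueing step until a fixed point is reached."""
    while True:
        following = glue_once(S)
        if following == S:
            return S
        S = following


def subgroup_size(selection):
    """Size (rows, columns) of the subgroup picked by a selection."""
    (lower, upper), columns = selection
    return (upper - lower + 1, len(columns))


# Hypothetical result of the joining step for a 3 x 5 matrix.
S2 = {
    ((1, 2), frozenset({1, 3, 5})),
    ((2, 3), frozenset({1, 3})),
    ((2, 3), frozenset({2, 4})),
}
result = glue(S2)
```
]]></code></p>
<p>For this input the selection of rows 2 to 3 in columns 1 and 3 is replaced by the glued selection of rows 1 to 3, while the selection of rows 1 to 2 is preserved because it contains a column that the glued selection lacks.</p></div>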
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">CONCLUSIONS</head><p>In this paper, we proposed a broadly usable filtering algorithm that quickly removes those clones from the results that are insignificant from the point of view defined by the user. The proposed algorithm is language independent, thus the results of many duplicated code detectors can efficiently be improved. By removing irrelevant clones, the maintenance costs can be decreased, because the programmers only need to deal with important issues. In this paper, we defined rules that are specialised for easing the clone elimination process in Erlang programs. We discussed the underlying ideas, and we also gave a formal description of our algorithm.</p><p>We note that we successfully evaluated the realisation<note place="foot" n="1">The authors would like to thank Bence Szabó for the implementation.</note> of the algorithm and assessed the results. All of our goals were reached; clones that are hard to eliminate are not present in the results. The filtering</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1 .</head><label>1</label><figDesc>Fig. 1. A clone whose instances refer to different functions</figDesc><graphic coords="4,152.88,195.10,84.23,72.70" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Joining sub-groups Fig. 5. Finding sub-groups to be glued</figDesc></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>3:22 • V. Fördős, M. Tóth and T. Kozsik</head><p>phase requires only a negligible extra computational cost. Moreover, it removes clones that are insignificant from the point of view defined by the user. Thus, the algorithm quickly cleans the result and helps programmers focus only on important cases.</p><p>Future work will consist of evaluating the proposed algorithm by using initial clones reported by different clone detectors and studying and comparing the results of these test runs. Differences in the results will indicate areas for future study.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>V. F örd ős, M. T óth and T. Kozsik sions with respect to an initial clone detection algorithm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>$G \in E^{n \times m}$</head><p>For filtering out irrelevant clones, we will use a reflexive and symmetric (but not necessarily transitive) binary relation over expressions.</p><p>Our algorithm takes a clone group $G$ and a filtering relation $f$, and produces a set of subgroups, which refines the grouping $G$.</p><p>Each subgroup $G_i$ represents a clone group made up of $m_i$ clone instances, each instance having length $n_i$, where</p><p>A subgroup selects a block of rows and some of the columns of the original expression matrix. The initial clones represented by the selected columns in the subgroup contain an expression sequence (the selected rows) which is accepted as a "relevant" clone.</p><p>We can represent a subgroup with a "selection" $s$, relative to an initial group $G$. The block of rows selected by $s$ is denoted by $s.r$, where $s.r.\ell$ is the lower, and $s.r.u$ is the upper bound of the selection. The set of columns selected by $s$ is denoted by $s.c$. (To improve the efficiency of the algorithm, $s.c$ can be represented as an ordered list of numbers.)</p><p>The algorithm is defined as two steps (Sections 4.1 and 4.2, respectively) followed by an iteration of a third step (Section 4.3). No more than n − 2 iterations of the third step are needed; the algorithm can terminate earlier if a fixed point is reached.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">One unit long clone instances</head><p>The first step of the algorithm produces S 1 , a set of selections of the initial group.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>$G, f \rightarrow S_1$</head><p>Each clone instance of each selection in $S_1$ has length 1 (these clone instances are formed from only one expression). This is the only step of the algorithm where filtering takes place and predicate $f$ is used. All pairs formed from the elements of each selection in $S_1$ satisfy predicate $f$. In the subsequent steps we shall maximize both the length of the reported clone instances and the size of the subgroups.</p><p>where graph(f, G, i) is the graph of $f$ with respect to the expressions of the $i$-th row in $G$, with vertices {1, . . . , m} and edges</p><p>and V ∈ MaxProperCliques(g) means that V is a clique (the vertices of a complete subgraph) of g which contains at least 2 vertices, and V is not included in a larger clique (i.e. V is inclusion-maximal <ref type="bibr" target="#b2">[Bomze et al. 1999]</ref>).</p></div>			</div>
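			<div xmlns="http://www.tei-c.org/ns/1.0"><p>The first step (Section 4.1) can be sketched in Python as follows. The sketch rests on several assumptions: the expression matrix is a list of rows, a selection is a ((lower, upper), columns) pair with 1-based indices, maximal cliques are enumerated with the basic Bron-Kerbosch procedure (the text does not prescribe a particular clique algorithm), and both the matrix G and the equality filter are hypothetical examples loosely modelled on Section 3.3.</p>
<p><code lang="python"><![CDATA[
```python
def maximal_cliques(vertices, adjacent):
    """Enumerate inclusion-maximal cliques (basic Bron-Kerbosch)."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            nbrs = {w for w in vertices if w != v and adjacent(v, w)}
            expand(r | {v}, p & nbrs, x & nbrs)
            p = p - {v}
            x = x | {v}

    expand(set(), set(vertices), set())
    return found


def one_unit_selections(G, f):
    """Build the one-unit selections: for every row, each maximal
    clique of the filter graph that has at least two columns."""
    columns = range(1, len(G[0]) + 1)
    selections = set()
    for i, row in enumerate(G, start=1):
        def adjacent(a, b, row=row):
            return f(row[a - 1], row[b - 1])
        for clique in maximal_cliques(columns, adjacent):
            if len(clique) >= 2:
                selections.add(((i, i), clique))
    return selections


# Hypothetical 3 x 5 matrix; letters stand for referred record types.
G = [
    ["a", "x", "a", "y", "a"],
    ["b", "c", "b", "c", "b"],
    ["d", "e", "d", "e", "f"],
]
S1 = one_unit_selections(G, lambda e1, e2: e1 == e2)
```
]]></code></p>
<p>With an equality filter the cliques are simply the classes of equal labels. A filter that is not transitive may yield overlapping cliques within one row, which is why maximal cliques are enumerated instead of equivalence classes.</p></div>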
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Programming Erlang: Software for a Concurrent World</title>
		<author>
			<persName><forename type="first">Joe</forename><surname>Armstrong</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
	<note>Pragmatic Bookshelf</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Parameterized Pattern Matching: Algorithms and Applications</title>
		<author>
			<persName><forename type="first">Brenda</forename><forename type="middle">S</forename><surname>Baker</surname></persName>
		</author>
		<ptr target="http://www.sciencedirect.com/science/article/pii/S0022000096900033" />
	</analytic>
	<monogr>
		<title level="j">J. Comput. System Sci</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="28" to="42" />
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The maximum clique problem</title>
		<author>
			<persName><forename type="first">Immanuel</forename><forename type="middle">M</forename><surname>Bomze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>Budinich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Panos</forename><forename type="middle">M</forename><surname>Pardalos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marcello</forename><surname>Pelillo</surname></persName>
		</author>
		<ptr target="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10" />
	</analytic>
	<monogr>
		<title level="m">Handbook of Combinatorial Optimization</title>
				<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="1" to="74" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Clone Detection and Elimination for Haskell</title>
		<author>
			<persName><forename type="first">Christopher</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Simon</forename><surname>Thompson</surname></persName>
		</author>
		<ptr target="http://www.cs.kent.ac.uk/pubs/2010/2976" />
	</analytic>
	<monogr>
		<title level="m">PEPM&apos;10: Proceedings of the 2010 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation</title>
				<editor>
			<persName><forename type="first">John</forename><surname>Gallagher</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Janis</forename><surname>Voigtlander</surname></persName>
		</editor>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="111" to="120" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Identifying Code Clones with RefactorErl</title>
		<author>
			<persName><forename type="first">Viktória</forename><surname>Fördős</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Melinda</forename><surname>Tóth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th Symposium on Programming Languages and Software Tools</title>
				<meeting>the 13th Symposium on Programming Languages and Software Tools<address><addrLine>Szeged, Hungary</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="31" to="45" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Comprehensible presentation of clone detection results</title>
		<author>
			<persName><forename type="first">Viktória</forename><surname>Fördős</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Melinda</forename><surname>Tóth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th Symposium on Computer Languages, Implementations and Tools</title>
				<meeting>the 4th Symposium on Computer Languages, Implementations and Tools</meeting>
		<imprint>
			<date type="published" when="2014">2014a</date>
		</imprint>
	</monogr>
	<note>accepted</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Utilising the software metrics of RefactorErl to identify code clones in Erlang</title>
		<author>
			<persName><forename type="first">Viktória</forename><surname>Fördős</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Melinda</forename><surname>Tóth</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of 10th Joint Conference on Mathematics and Computer Science (Informatica)</title>
		<title level="s">LIX. Studia Universitatis Babeș-Bolyai</title>
		<meeting>10th Joint Conference on Mathematics and Computer Science (Informatica)<address><addrLine>Cluj-Napoca, Romania</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">2014b</date>
			<biblScope unit="page" from="103" to="118" />
		</imprint>
	</monogr>
	<note>Special Issue 1</note>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">Dan</forename><surname>Gusfield</surname></persName>
		</author>
		<title level="m">Algorithms on strings, trees, and sequences: computer science and computational biology</title>
		<imprint>
			<publisher>Cambridge University Press</publisher>
			<pubPlace>New York, NY, USA</pubPlace>
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Achieving Accurate Clone Detection Results</title>
		<author>
			<persName><forename type="first">Elmar</forename><surname>Juergens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nils</forename><surname>Göde</surname></persName>
		</author>
		<idno type="DOI">10.1145/1808901.1808902</idno>
		<ptr target="http://doi.acm.org/10.1145/1808901.1808902" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th International Workshop on Software Clones (IWSC &apos;10)</title>
				<meeting>the 4th International Workshop on Software Clones (IWSC &apos;10)<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Large-Scale Inter-System Clone Detection Using Suffix Trees</title>
		<author>
			<persName><forename type="first">R</forename><surname>Koschke</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Software Maintenance and Reengineering (CSMR), 2012 16th European Conference</title>
				<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="309" to="318" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Clone detection and removal for Erlang/OTP within a refactoring environment</title>
		<author>
			<persName><forename type="first">Huiqing</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Simon</forename><surname>Thompson</surname></persName>
		</author>
		<idno type="DOI">10.1145/1480945.1480971</idno>
		<ptr target="http://doi.acm.org/10.1145/1480945.1480971" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation (PEPM &apos;09)</title>
				<meeting>the 2009 ACM SIGPLAN workshop on Partial evaluation and program manipulation (PEPM &apos;09)<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="169" to="178" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Suffix tree based duplicate code analysis</title>
		<ptr target="http://pnyf.inf.elte.hu/trac/refactorerl/wiki/SuffixTreeBasedDuplicateCodeAnalysis" />
		<imprint>
			<date type="published" when="2014-07">July 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Comparison and Evaluation of Code Clone Detection Techniques and Tools: A Qualitative Approach</title>
		<author>
			<persName><forename type="first">Chanchal</forename><forename type="middle">K</forename><surname>Roy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><forename type="middle">R</forename><surname>Cordy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rainer</forename><surname>Koschke</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.scico.2009.02.007</idno>
		<ptr target="http://dx.doi.org/10.1016/j.scico.2009.02.007" />
	</analytic>
	<monogr>
		<title level="j">Sci. Comput. Program</title>
		<imprint>
			<biblScope unit="volume">74</biblScope>
			<biblScope unit="issue">7</biblScope>
			<biblScope unit="page" from="470" to="495" />
			<date type="published" when="2009-05">May 2009</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
