<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CSS Corpus for Reproducible Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nico de Groot</string-name>
          <email>nico@nicasso.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vadim Zaytsev</string-name>
          <email>vadim@grammarware.net</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Raincode Labs</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universiteit van Amsterdam</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Reproducibility of research heavily depends on the availability of the datasets from the experiments: in the context of metaprogramming, this is the corpus of the code that was used to run the analyses and transformations. In the case of CSS, the problem is even more acute since the web is a constantly changing environment where the same address can refer to a frequently changing artefact. In this report, we explain how we created a corpus of CSS files as a part of our project of building a framework for analysing style sheets. We also include two case studies of explanatory nature showing how style sheets from various websites go about coding conventions and about code duplication. We believe this work will be useful for other CSS researchers to compare techniques they develop, on a uniform yet realistic dataset.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>CSS, or Cascading Style Sheets, is the de facto standard for specifying the appearance of web pages. It is
a language supported to some extent by all existing internet browsers and standardised by the World Wide
Web Consortium, the leading authority in web technologies and standards [BÇHL11, ÇEG+11]. Even in the
presence of other better, modern, efficient, well-designed alternatives, it remains the only industrially viable
option for deployment of styles, leaving languages like SASS [CWE06] or LESS [SSP+09] to be used strictly on
the developers’ side, if at all.</p>
      <p>A typical style sheet in CSS could look like this:
}</p>
      <p>This sheet contains one rule with two selectors and three declarations. Each selector specifies one particular
element to be matched with this style. In our simple example these are element selectors; other kinds of selectors
include class selectors and ID selectors, as well as more complex pseudo-selectors for specifying the first child
or the first line of the matched area. Each declaration assigns a value to a property, with the type of a value
being determined by the particular property: a padding’s value is expected to be a length with a unit, but a
font-family property expects a comma-separated list of names of individual fonts and font families.</p>
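      <p>The anatomy described above (one rule, two selectors, three declarations) can be illustrated with a minimal Python sketch. The style sheet in <monospace>SHEET</monospace> is a hypothetical stand-in, not the listing from the original report, and <monospace>parse_rule</monospace> is a name introduced here for illustration only; it handles a single flat rule and is in no way a full CSS parser.</p>
      <preformat>
```python
# Hypothetical style sheet used for illustration only; it is NOT the
# listing from the original report.
SHEET = """
h1, h2 {
    padding: 0.5em;
    font-family: Helvetica, Arial, sans-serif;
    color: navy
}
"""


def parse_rule(text):
    """Split one flat CSS rule into its selectors and declarations.

    Assumption: a single rule without comments, at-rules, or strings
    that contain braces, commas, or semicolons.
    """
    prelude, _, rest = text.partition("{")
    body = rest.rpartition("}")[0]
    # Selectors are separated by commas in the rule prelude.
    selectors = [s.strip() for s in prelude.split(",") if s.strip()]
    # Declarations are "property: value" pairs separated by semicolons.
    declarations = {}
    for chunk in body.split(";"):
        if ":" in chunk:
            prop, _, value = chunk.partition(":")
            declarations[prop.strip()] = value.strip()
    return selectors, declarations


selectors, declarations = parse_rule(SHEET)
print(len(selectors), "selectors,", len(declarations), "declarations")
```
      </preformat>
      <p>In this sketch the padding value is a length with a unit, while the font-family value remains a comma-separated list, mirroring the distinction made in the text.</p>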
      <p>CSS is an important element of the web development landscape, yet it is largely underrepresented in academic
research. In a recent study we managed to cite all peer-reviewed papers ever published with CSS or Cascading
Style Sheets in the title [GZ16a]: 4 with general discussions, 2 case studies, 3 on pinpointing language
shortcomings and improving on them, 5 on preprocessors, 2 on classification of syntactic errors, 7 on refactoring, 7
on analysis, 5 on security issues, 6 on IDE support. With the rising interest in spreadsheets and their
growing acceptance among researchers, CSS is probably the most scarcely investigated industrially
successful mainstream software language.</p>
      <p>Copyright © by the paper’s authors. Copying permitted for private and academic purposes.</p>
      <p>Proceedings of the Seminar Series on Advanced Techniques and Tools for Software Evolution SATToSE2016 (sattose.org), Bergen,
Norway, 11-13 July 2016, published at http://ceur-ws.org</p>
      <p>As part of a bigger project of building a framework to analyse CSS specifications [dG16b, dG16a], we have
faced the challenge of empirical validation. We wanted to rely on a comprehensive corpus of CSS files, with
reasonably high feature coverage numbers and potential for regression testing integration. This report collects
many issues related to that particular part of our work, and exposes preliminary results. The work on the actual
analysis tool and infrastructure is still ongoing and can be observed from the GitHub accounts of the authors.</p>
      <p>Replications of website analysis papers are usually next to impossible, since most modern active vendors
change their applications continuously and deploy new versions up to 50 times a day [Sch14]. This means
that providing an extensive list of the websites used in the experiment is not sustainable, since the actual CSS files
behind those names would have changed hundreds or thousands of times between the original experiment and its
replication. One of the approaches is to provide a timed snapshot of a collection of web applications, and this is
what this report focuses on.</p>
      <p>As related work we can point the readers to Qualitas Corpus [TAD+10], a large collection of compilable
software projects in Java; Atlantic Zoo [CTB+03], a versatile gathering of metamodels obtained by many means
from mining papers to converting ontologies; or Grammar Zoo [Zay15], a grammarware-centred repository of
artefacts of various nature containing knowledge about language structure. On an even more closely related
note, both scale-wise and topic-wise, let us point to a recent paper by Mazinanian et al. [MTM14] on discovering
refactoring opportunities in CSS. The dataset for the project was made publicly available [Maz14], which was
greatly appreciated by replicators [PVZ16b]. However, as the effort by Punt et al. [PVZ16a] showed, the dataset
had some issues related to crawling time glitches, crawling location specificity, file access misconfiguration,
unavailability of cookies, and files being renamed. In our work we tried to combine the best we could learn from
all these projects and approaches, which led to automated crawling of popular and acknowledged sources, with
subsequent manual and tool-supported filtering, curation and standardisation. The result has over 0.5 MLOC of
correct pretty-printed CSS, and can be used to test parsers and try out and compare techniques developed by
CSS researchers.</p>
      <p>The remainder of the report is organised as follows: section 2 explains the inclusion criteria, exposes details
on how the corpus was composed and shows simple and advanced metrics calculated on it; section 3 sketches
a case study of using the corpus with our work-in-progress framework to detect coding conventions; section 4
shows another case study about clone detection; section 5 draws preliminary conclusions and contemplates future
endeavours.</p>
      <p>The Corpus: Contents, Selection, Metrics</p>
      <p>The 50 style sheets for the corpus were picked from a selection of the most popular websites from the Alexa
top 500 most popular sites on the web. Duplicate websites within the list, such as http://google.es and
http://google.fr, have been ignored, since those would just result in the same style sheets multiple times.
Furthermore, http://t.co has been ignored from the list since it is the link/redirect service of Twitter, and not
a real website. Finally, http://blogspot.com is ignored as well since it refers to http://blogger.com, which is
already in the top 50. The final list covered the following websites:
360.cn, aliexpress.com, amazon.com, apple.com, baidu.com, bing.com, blogger.com, chinadaily.com, diply.com,
ebay.com, facebook.com, fc2.com, gmw.cn, google.com, hao123.com, imdb.com, imgur.com, instagram.com, kat.cr,
linkedin.com, live.com, mail.ru, microsoft.com, msn.com, naver.com, netflix.com, office.com, ok.ru,
onclickads.net, paypal.com, pinterest.com, pornhub.com, qq.com, reddit.com, sina.com.cn, sohu.com,
stackoverflow.com, taobao.com, tmall.com, tumblr.com, twitter.com, vk.com, weibo.com, whatsapp.com,
wikipedia.org, wordpress.com, xvideos.com, yahoo.com, yandex.ru, youtube.com.</p>
      <p>The actual style sheets from these websites have been downloaded using the CSS Stats tool [MJO14], which
automatically extracts external style sheets as well as embedded CSS from web pages. The CSS provided by CSS</p>
      <p>Figure 2: (a) Amount of important statements vs the average selector specificity; (c) Percentage of declarations with !important modifiers; (d) Percentage of cloned lines per type.</p>
      <p>upwards trend.</p>
      <p>Having this upward trend in the specificity of selectors in the source file order does not impact the effect of
the style sheet, since the actual order is only considered for conflicting selectors with equal specificity, in which
case the source order resolves the conflict [ÇEG+11]. However, placing selectors in the style sheet in an upward
trend, based on their specificity and source order, does make it easier to reason about the CSS. For example,
if selectors with high specificity are placed at the beginning of the style sheet, and you later on have to change
the presentation of those elements, you have either to overrule the specificity of the earlier defined selectors, or
ensure they all have the same specificity.</p>
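      <p>To make the interplay of specificity and source order concrete, the following Python sketch approximates the specificity of a selector as a triple (IDs, classes, elements) and picks the winner among conflicting selectors. It is a simplified illustration under stated assumptions, not the computation used in our framework: the function names are introduced here for illustration, and <monospace>:not()</monospace>, escaped characters and legacy single-colon pseudo-elements are not treated correctly.</p>
      <preformat>
```python
import re


def specificity(selector):
    """Approximate specificity of one selector as (ids, classes, elements).

    Simplified sketch: :not() and escaped characters are not handled, and
    legacy single-colon pseudo-elements (e.g. :first-line) are counted as
    pseudo-classes.
    """
    rest = selector
    # Attribute selectors first, so that '#' or '.' inside them is not miscounted.
    attributes = re.findall(r"\[[^\]]*\]", rest)
    rest = re.sub(r"\[[^\]]*\]", " ", rest)
    pseudo_elements = re.findall(r"::[\w-]+", rest)
    rest = re.sub(r"::[\w-]+", " ", rest)
    ids = re.findall(r"#[\w-]+", rest)
    rest = re.sub(r"#[\w-]+", " ", rest)
    classes = re.findall(r"\.[\w-]+", rest)
    rest = re.sub(r"\.[\w-]+", " ", rest)
    pseudo_classes = re.findall(r":[\w-]+(?:\([^)]*\))?", rest)
    rest = re.sub(r":[\w-]+(?:\([^)]*\))?", " ", rest)
    # Whatever identifiers remain are element (type) selectors.
    elements = re.findall(r"[A-Za-z][\w-]*", rest)
    return (
        len(ids),
        len(classes) + len(attributes) + len(pseudo_classes),
        len(elements) + len(pseudo_elements),
    )


def winning_index(selectors):
    """Index of the winning selector among conflicting, matching rules.

    Higher specificity wins; on equal specificity the later rule in
    source order wins.
    """
    return max(range(len(selectors)), key=lambda i: (specificity(selectors[i]), i))


print(specificity("#nav li.active a:hover"))
print(winning_index([".a", "#x", ".b"]))
```
      </preformat>
      <p>Comparing the triples lexicographically reflects why a style sheet that starts with ID-heavy selectors forces later rules to be at least as specific in order to take effect.</p>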
      <p>An example of a specificity graph is shown in Figure 3, which is created using the style sheet of Whatsapp.com.
Like the specificity graphs of most of the other websites from the sample set, the graph does not display a slowly
increasing line. This can have multiple reasons. For example, the CSS of every website in the sample set
consists of multiple style sheets which are all combined in one graph, because during the
downloading of the style sheets using the CSS Stats tool, all style sheets have been merged into a single style
sheet. Furthermore, CSS preprocessors such as SASS or LESS could have been used to generate the CSS, placing
rules in non-optimal positions. Our last hypothesis is that developers are not always completely familiar with
the cascading characteristics of CSS.</p>
      <p>What is interesting when looking at the specificity graph of Whatsapp.com is that the style sheet immediately
starts with very specific selectors. About the first 100 selectors seem to consist of mostly ID selectors and just
some class and element selectors. This could affect the maintenance aspect of the style sheet, as possibly more
specific selectors have to be used later on in the style sheet to overrule the already very specific base styles. This
could also explain the high usage of the !important modifier for Whatsapp.com, as 9.09% of all declarations
have applied it. The !important modifier would offer the developers at Whatsapp.com a quick and easy solution for
solving cascading problems, even though it is considered a code smell [Zak11, GZ16a, Gha14].</p>
      <p>The high number of !important modifiers in Whatsapp.com, together with its high average specificity, may give the
impression that the two are positively correlated. This would be interesting because a higher
average specificity value would badly affect the maintainability of the style sheet, creating complex cascading-related
problems. !important modifiers would be a tempting solution for developers to use when the average
specificity is high, as those are a quick and dirty way to solve these kinds of problems. However, this hypothesis
was refuted after a little probe, the results of which are shown in Figure 2a. An explanation for this outcome could
be that websites which mostly use higher specificity selectors will simply keep creating selectors with even higher
specificity values, increasing the average specificity. As long as no low specificity selectors are used, no major
cascading issues are likely to occur, so the temptation for developers to use the !important
modifier does not increase.</p>
      <p>One of the analyses that can be implemented within our framework is checking whether developers have
applied coding conventions correctly. Checking whether a semicolon is present after each declaration, whether short
hexadecimal values are used, or whether a vendor-prefixed property is followed by a standard property [GZ16a], is all
possible. Since there are also other tools that check coding conventions for CSS, we will compare our
implementations for some coding conventions to theirs, and analyse how much more efficient our model is in conducting
such analysis. Finally, a selection of the following ten coding conventions will be validated on the sample set,
providing additional insights in the quality of the CSS [GZ16b]:</p>
      <sec id="sec-1-1">
        <title>Use short hexadecimal values (Performance)</title>
        <p>Use the shorthand margin and padding property (Performance)</p>
      </sec>
      <sec id="sec-1-2">
        <title>Disallow empty rules (Possible error)</title>
      </sec>
      <sec id="sec-1-3">
        <title>Do not use id selectors (Maintainability)</title>
        <p>Require standard property with vendor prefix (Compatibility)</p>
      </sec>
      <sec id="sec-1-4">
        <title>When possible, use em instead of px (Accessibility)</title>
      </sec>
      <sec id="sec-1-5">
        <title>Disallow duplicate properties (Possible error)</title>
      </sec>
      <sec id="sec-1-6">
        <title>Avoid using !important (Maintainability)</title>
        <p>Avoid qualifying ID and class names with type selectors (Performance)</p>
        <p>The conventions were taken from open-source communities, companies and CSS professionals. They regard
possible errors, compatibility, accessibility, maintainability, and performance [GZ16b]. Coding conventions
related to lexical details, such as required locations of spaces, are not taken into account since the CSS Stats
tool [MJO14] used to download the style sheets, as mentioned above, has pretty-printed them all uniformly.</p>
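        <p>Two of the conventions listed above can be approximated with a few lines of Python. This is a rough regular-expression sketch under stated assumptions (flat rule sets, no comments, no nested at-rules); it is not how the conventions are checked in our framework, and the function names and the sample style sheet are hypothetical.</p>
        <preformat>
```python
import re

# Hypothetical sample, not taken from the corpus.
SAMPLE = (
    "a { color: #ffffff; margin: 0 !important } "
    "#menu li { background: #12ab34; border-color: #AABBCC }"
)


def shortenable_hex_colours(css):
    """Six-digit hex colours that could be written with three digits."""
    found = []
    for match in re.finditer(r"#([0-9a-fA-F]{6})\b", css):
        digits = match.group(1).lower()
        # #aabbcc can be shortened to #abc only if each pair repeats a digit.
        if digits[0] == digits[1] and digits[2] == digits[3] and digits[4] == digits[5]:
            found.append("#" + digits)
    return found


def declarations(css):
    """All declarations of flat rule sets, as stripped strings."""
    result = []
    for body in re.findall(r"\{([^{}]*)\}", css):
        result.extend(d.strip() for d in body.split(";") if ":" in d)
    return result


def important_percentage(css):
    """Percentage of declarations that carry the !important modifier."""
    decls = declarations(css)
    if not decls:
        return 0.0
    flagged = sum(
        1 for d in decls if re.search(r"!\s*important\s*$", d, re.IGNORECASE)
    )
    return 100.0 * flagged / len(decls)


print(shortenable_hex_colours(SAMPLE))
print(important_percentage(SAMPLE))
```
        </preformat>
        <p>Because the corpus is pretty-printed uniformly, such lexical shortcuts are less fragile than they would be on arbitrary CSS, but a parser-based check remains the more reliable option.</p>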
        <p>Figure 4 shows the percentage of violations per coding convention for the complete sample set. The most
violated coding convention is the disallowing of the ID selector. ID selectors are disallowed since those should
be unique, pointing to only a single element. By using ID selectors, developers limit themselves to styling only
a single element, losing the benefit CSS provides regarding the reuse of styles. However, as can be seen in the
graph, not all style sheets adhere to this coding convention. Of all 50 websites, 15 have a minimum of
10% ID selectors, even ranging up to 36.05% (bing.com). Furthermore, we have found that the !important
modifier is used on average 16 times every 1000 declarations. Some websites even have more than 5% of all their
declarations use the !important modifier, with Whatsapp.com being on top with 9.09%. Such a high usage
of !important modifiers demonstrates bad use/understanding of the cascading characteristic of CSS. Figure 2c
shows more information on the usage of the !important modifier.</p>
        <p>Relating the number of violations per category of coding conventions to the number of lines of code in the
style sheet presented some insights into the occurrences of violations. Both the maintainability and compatibility
related coding smells showed a strong positive correlation with the number of lines of code in a style sheet,
with their correlations being 0.9938 and 0.9960 respectively. The possible error category also had a positive
correlation, being 0.8319. For the performance category there was no significant correlation, as its value was
0.0532. These values are based on only 2–3 coding conventions per category, therefore not being a complete
representation of each category, as only a small section of the available coding conventions per category has
been used. However, they do indicate that there is a need for better CSS standards to prevent, better CSS
analysis tools to detect, and better CSS refactoring tools to fix, the decline of quality in style sheets.</p>
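        <p>For readers who want to recompute such values on their own measurements, a correlation between violations and lines of code can be obtained as sketched below. The text does not state which coefficient was used; this sketch assumes the Pearson coefficient, and the numbers in the example are made up for illustration and are not corpus data.</p>
        <preformat>
```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n in (0, 1):
        raise ValueError("need two samples of equal length with 2+ values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / sqrt(var_x * var_y)


# Made-up illustration: lines of code and violation counts per style sheet.
lines_of_code = [1000, 2000, 3000, 4000]
violations = [20, 40, 60, 80]
print(pearson(lines_of_code, violations))
```
        </preformat>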
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Case Study: Detecting Code Clones</title>
      <p>Clone analysis, detection, management and tool evaluation have been very active topics in software engineering
research at least since 1994, with numbers of papers dedicated to them climbing each year [RZK14]. Clones are
usually considered harmful, since they bloat the codebase and hamper proper maintenance: each bug fixed
in cloned code needs its fixes propagated to all code incarnations, including those that significantly evolved since
the cloning time. Without joining the ongoing discussion on the usefulness of clones for solving some tasks (in
particular in product line implementation), we can point out that in general having or lacking duplicates is a
remarkable property of the source code, and quite characteristic of the programming style, partly dictated by
the chosen software language. Hence, we are interested in investigating clones in CSS. The expectation is to have
results similar to other small DSLs [TC11].</p>
      <p>Results of running the clone detector on the sample set can be seen in Figure 5. These are the results using
the current configuration as shown in the list below. Running the clone detector with different configurations
would produce different results; however, for the current sample set these specific configurations result in accurate
clones.</p>
      <sec id="sec-2-1">
        <title>Clones should have a minimum mass of 6.</title>
      </sec>
      <sec id="sec-2-2">
        <title>Clones should occupy a minimum of 3 LOC.</title>
        <p>The following is normalised for type 2 and type 3 clones:</p>
      </sec>
      <sec id="sec-2-3">
        <title>File names of style sheets; selectors of rule sets; media queries in @media rules</title>
      </sec>
      <sec id="sec-2-4">
        <title>Type 3 clones should be at least 80% equal.</title>
        <p>In Figure 2d, three box plots are shown, one for each clone type. It shows that type 1 clones cover on average
4.76% of the lines, with a median of 0.75%; more interesting, however, are the outliers, which have more than
20%, and for 2 websites even around 40%, cloned lines. These two are chinadaily.com and ok.ru. A reason for
chinadaily.com to have such a high percentage of type 1 clones is that the style sheet contains a @media rule
which aims at screens with a max-width of 1154 pixels. However, instead of only adding rule sets to the @media
rule that override the previously defined properties for specific resolutions (e.g., for smartphones), it also contains
direct duplicates of rule sets from outside the @media rule, which add no benefit whatsoever. Refactoring the
style sheet to remove the specific @media rule, and pretty-printing the CSS to run it through the clone detector
again, would be a simple and easy way to verify this; however, the @media rule is never closed with a right brace,
making this impossible, as we can only guess where the @media rule should have been closed.</p>
        <p>For ok.ru, the high amount of type 1 clones does not seem to be related to any @media rule, as was the case for
chinadaily.com. For some reason a lot of rule sets that are defined relatively early in the style sheet (first 10,000
lines) are defined again later on in the style sheet (from about 25,000 lines). It seems that the ok.ru website
loads 3 style sheets, with 2 of them being fairly equal, containing a lot of the same rule sets. It seems like one of
the style sheets is simply duplicated and then partially modified, while not keeping in mind that most rule sets
are already defined elsewhere.</p>
        <p>Type 2 clones are found the most often, having an average of 17.30% and a median of 16.76%, with one
outlier of 38.96%, which is microsoft.com. When looking at the style sheet of microsoft.com, it seems like they
have used a tool to generate the CSS, maybe SASS or LESS, because the first 2000 lines mostly contain rule
sets as shown in Listing 1: rule sets with one or multiple selectors and only a single width declaration with a
percentage value. Most of these width declarations with equal values occurred multiple times in different rule
sets, which could explain the high amount of type 2 clones.</p>
        <p>Listing 1: Part of the microsoft.com style sheet</p>
        <preformat>.CSPvNext .margin-row-fluid&gt;.bp2-col-10-3 {
  width: 27.4%
}
.CSPvNext .margin-row-fluid&gt;.bp2-col-10-4 {
  width: 37.2%
}
.CSPvNext .margin-row-fluid&gt;.bp2-col-10-6 {
  width: 56.8%
}</preformat>
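        <p>The effect of normalising selectors for type 2 clones can be sketched as follows: once selectors are ignored, rule sets with identical declarations fall into the same group. This is a simplified Python illustration, not our clone detector; it ignores the minimum mass and minimum LOC thresholds, assumes flat rule sets, and uses hypothetical selectors rather than the ones from microsoft.com.</p>
        <preformat>
```python
import re
from collections import defaultdict

# Hypothetical rule sets in the spirit of Listing 1 (not corpus data).
SAMPLE = """
.grid .col-a { width: 27.4% }
.grid .col-b { width: 37.2% }
.page .col-c { width: 27.4% }
"""


def rule_sets(css):
    """Yield (selector, declarations) pairs for flat rule sets.

    Declarations are whitespace-normalised and returned as a tuple so
    that they can serve as a dictionary key.
    """
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        decls = tuple(
            re.sub(r"\s+", " ", d.strip()) for d in body.split(";") if d.strip()
        )
        yield selector.strip(), decls


def type2_clone_groups(css):
    """Group selectors whose rule sets are equal once selectors are ignored."""
    groups = defaultdict(list)
    for selector, decls in rule_sets(css):
        groups[decls].append(selector)
    # A group with a single member is not a clone.
    return {decls: sels for decls, sels in groups.items() if len(sels) != 1}


print(type2_clone_groups(SAMPLE))
```
        </preformat>
        <p>In the sample, the two rule sets with <monospace>width: 27.4%</monospace> end up in one group, which mirrors how repeated single-declaration rule sets inflate the type 2 figures.</p>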
        <p>Then there are the type 3 clones, which with the current configurations result in an average of 4.79% and a
median of 2.59%. There is one outlier that stands out the most, as it has 73.91% of type 3 clones. This is the
qq.com website, and what is surprising about qq.com is that when combining its type 1, type 2, and type 3
clones, it turns out that 100% of the lines are considered cloned lines. This means that every line in the style
sheet is part of one or more clones. After analysing the qq.com style sheet, the high amount of type 3 clones
seems to be a result of copying and pasting. To give an example, in Listing 2 two rule sets are shown taken from the
qq.com style sheet. The only things which set these rule sets apart are their selectors and the font-size and
display declarations in the second rule set, with the remaining 8 declarations being identical. The two rule sets
were not even 100 lines apart from each other in the style sheet, giving the impression that the developers
do not fully understand the inheritance and cascading characteristics of CSS, or that maybe they did, but just
wanted to develop the style sheet in a short amount of time while not being bothered by CSS’ inheritance and
cascading characteristics. Nevertheless, it shows that the 5,449 lines of code that the qq.com style sheet now
uses to style its website can be reduced significantly.</p>
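        <p>The 80% threshold for type 3 clones can be illustrated with a small Python sketch. How our clone detector measures equality is not spelled out here, so the sketch assumes a sequence similarity ratio over declaration lists, computed with <monospace>difflib.SequenceMatcher</monospace> from the Python standard library. The declarations below are hypothetical and are not the qq.com rule sets; they only reproduce the shape of the example (8 shared declarations, 2 extra ones in the second rule set).</p>
        <preformat>
```python
from difflib import SequenceMatcher


def declaration_similarity(first, second):
    """Similarity in [0, 1] of two declaration lists, selectors dropped.

    Assumption: similarity is 2 * matches / total length, as computed
    by SequenceMatcher.ratio(); the actual detector may differ.
    """
    return SequenceMatcher(None, first, second, autojunk=False).ratio()


def is_type3_candidate(first, second, threshold=0.8):
    """True if the two rule-set bodies are at least `threshold` similar."""
    return declaration_similarity(first, second) >= threshold


# Hypothetical rule-set bodies: eight shared declarations ...
BASE = [
    "margin: 0", "padding: 0", "float: left", "width: 100px",
    "height: 20px", "color: #333", "line-height: 20px", "overflow: hidden",
]
# ... plus two extra declarations in the second rule set.
VARIANT = BASE[:3] + ["font-size: 12px"] + BASE[3:] + ["display: block"]

print(declaration_similarity(BASE, VARIANT))
print(is_type3_candidate(BASE, VARIANT))
```
        </preformat>
        <p>Under this assumed measure, two rule sets that share 8 declarations and differ in 2 lie above the 80% threshold, which is consistent with them being reported as a type 3 clone.</p>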
        <p>Listing 2: Part of the qq.com style sheet</p>
        <p>Preliminary Conclusions and Future Work</p>
        <p>In this report, we have explained how we composed a corpus of realistic CSS code from popular websites, as a
part of the effort to build a framework for CSS analysis. We have briefly gone through two case studies that
showed how the corpus can be (re)used. The project is still a work in progress, but the corpus is ready and is
already serving us well.</p>
        <p>Ultimately we will use this corpus of CSS files to compare our framework with existing alternatives, by
implementing the same algorithms within various frameworks. Smell detection, clone management, metrics calculation
and detecting refactoring opportunities will remain the main themes. Analysing the corpus statistically to see
which language features of CSS are more widely used and therefore more crucial to support is also an interesting
option.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Acknowledgements</title>
      <p>This report is based on the extended abstract of the presentation given at the SATToSE symposium in Bergen,
Norway, on 12 July 2016 [dG16b], as well as on the graduate thesis defended at the University of Amsterdam,
The Netherlands, on 21 July 2016 [dG16a].</p>
      <p>[Gha14] Golnaz Gharachorlu. Code Smells in Cascading Style Sheets: An Empirical Study and a Predictive
Model. Master’s thesis, University of British Columbia, Canada, 2014. URL: http://hdl.handle.net/2429/51364.</p>
      <p>[GZ16a] Boryana Goncharenko and Vadim Zaytsev. Language Design and Implementation for the Domain of
Coding Conventions. In Tijs van der Storm, Emilie Balland, and Dániel Varró, editors, Proceedings
of the Ninth International Conference on Software Language Engineering (SLE), pages 90–104, 2016.
doi:10.1145/2997364.2997386.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[AMIO12] Adewole Adewumi, Sanjay Misra, and Nicholas Ikhu-Omoregbe. <article-title>Complexity Metrics for Cascading Style Sheets</article-title>. In Beniamino Murgante, Osvaldo Gervasi, Sanjay Misra, Nadia Nedjah, Ana Maria A. C. Rocha, David Taniar, and Bernady O. Apduhan, editors, <source>Proceedings of the 12th International Conference on Computational Science and Its Applications (ICCSA)</source>, pages <fpage>248</fpage>–<lpage>257</lpage>. Springer, <year>2012</year>. doi:10.1007/978-3-642-31128-4_18.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[BÇHL11] Bert Bos, Tantek Çelik, Ian Hickson, and Håkon Wium Lie. <article-title>Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification</article-title>. <source>W3C Recommendation</source>, June <year>2011</year>. http://www.w3.org/TR/2011/REC-CSS2-20110607.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[BD16] M. Serdar Biçer and Banu Diri. <article-title>Defect Prediction for Cascading Style Sheets</article-title>. <source>Applied Soft Computing</source>, <year>2016</year>. doi:http://dx.doi.org/10.1016/j.asoc.2016.05.038.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[CTB+03] Jordi Cabot, Massimo Tisi, Hugo Brunelière, et al. <article-title>AtlantEcore Metamodel Zoo</article-title>. http://www.emn.fr/z-info/atlanmod/index.php/Ecore, <year>2003</year>.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[CWE06] Hampton Catlin, Natalie Weizenbaum, and Chris Eppstein. <source>SASS: Syntactically Awesome Style Sheets</source>, <year>2006</year>. http://sass-lang.com.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[dG16a] Nico de Groot. <article-title>Analysing and Manipulating CSS using the M3 Model</article-title>. <source>Master's thesis</source>, Universiteit van Amsterdam, The Netherlands, July <year>2016</year>. URL: http://www.scriptiesonline.uba.uva.nl/en/scriptie/613750.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[dG16b] Nico de Groot. <article-title>Analysing CSS using the M3 Model</article-title>. In <source>Pre-proceedings of the Ninth Seminar on Advanced Techniques and Tools for Software Evolution (SATToSE)</source>, <year>2016</year>. URL: http://sattose.wdfiles.com/local--files/2016:alltalks/SATTOSE2016_paper_10.pdf.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[ÇEG+11] Tantek Çelik, Elika J. Etemad, Daniel Glazman, Ian Hickson, Peter Linss, and John Williams. <article-title>Cascading Style Sheets (CSS) Selectors Level 3</article-title>. <source>W3C Recommendation</source>, September <year>2011</year>. http://www.w3.org/TR/2011/REC-css3-selectors-20110929/.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[GZ16b] Boryana Goncharenko and Vadim Zaytsev. <article-title>Reverse Engineering a CSS Coding Conventions Catalogue</article-title>. Draft, https://github.com/boryanagoncharenko/CssCoco/blob/master/analysis.md, <year>2016</year>.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[Maz14] Davood Mazinanian. <source>Dataset for FSE'14 submission</source>, <year>2014</year>. URL: http://users.encs.concordia.ca/~d_mazina/papers/FSE'14/.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[MJO14] Adam Morse, Brent Jackson, and John Otander. <source>CSS Stats</source>, <year>2014</year>. http://cssstats.com.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[MTM14] Davood Mazinanian, Nikolaos Tsantalis, and Ali Mesbah. <article-title>Discovering Refactoring Opportunities in Cascading Style Sheets</article-title>. In <source>Proceedings of the 22nd Symposium on the Foundations of Software Engineering (FSE)</source>, pages <fpage>496</fpage>–<lpage>506</lpage>. ACM, <year>2014</year>. doi:10.1145/2635868.2635879.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [PVZ16a]
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Punt</surname>
          </string-name>
          , Sjoerd Visscher, and
          <string-name>
            <given-names>Vadim</given-names>
            <surname>Zaytsev</surname>
          </string-name>
          .
          <article-title>Experimental Data for the A?B*A Pattern in CSS: Inputs and Outputs</article-title>
          .
          In
          <source>Proceedings of the 32nd International Conference on Software Maintenance and Evolution (ICSME)</source>
          , page
          <fpage>616</fpage>
          ,
          <year>2016</year>
          . Best Artefact Award. doi:10.1109/ICSME.2016.91 .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [PVZ16b]
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Punt</surname>
          </string-name>
          , Sjoerd Visscher, and
          <string-name>
            <given-names>Vadim</given-names>
            <surname>Zaytsev</surname>
          </string-name>
          .
          <article-title>The A?B*A Pattern: Undoing Style in CSS and Refactoring Opportunities it Presents</article-title>
          .
          In
          <source>Proceedings of the 32nd International Conference on Software Maintenance and Evolution (ICSME)</source>
          , pages
          <fpage>67</fpage>
          -
          <lpage>77</lpage>
          ,
          <year>2016</year>
          . doi:10.1109/ICSME.2016.73 .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [RZK14]
          <string-name>
            <given-names>Chanchal K.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Minhaz F.</given-names>
            <surname>Zibran</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Rainer</given-names>
            <surname>Koschke</surname>
          </string-name>
          .
          <article-title>The vision of software clone management: Past, present, and future (Keynote paper)</article-title>
          .
          In Serge Demeyer, David Binkley, and Filippo Ricca, editors,
          <source>Proceedings of the Software Evolution Week: Conference on Software Maintenance, Reengineering, and Reverse Engineering</source>
          , pages
          <fpage>18</fpage>
          -
          <lpage>33</lpage>
          . IEEE Computer Society,
          <year>2014</year>
          . doi:10.1109/CSMR-WCRE.2014.6747168 .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Sch14]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Schauenberg</surname>
          </string-name>
          .
          <article-title>Development, Deployment &amp; Collaboration at Etsy</article-title>
          . In
          <source>QCon London</source>
          ,
          <year>2014</year>
          . https://qconlondon.com/london-2014/london-2014/presentation/Development,%20Deployment%20&amp;%20Collaboration%20at%20Etsy.html .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [SSP+09]
          <string-name>
            <given-names>Alexis</given-names>
            <surname>Sellier</surname>
          </string-name>
          , Jon Schlinkert, Luke Page, Marcus Bointon, Mária Jurčovičová,
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Dean</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Max</given-names>
            <surname>Mikhailov</surname>
          </string-name>
          .
          <source>Less</source>
          ,
          <year>2009</year>
          . http://lesscss.org .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [TAD+10]
          <string-name>
            <given-names>Ewan</given-names>
            <surname>Tempero</surname>
          </string-name>
          , Craig Anslow, Jens Dietrich, Ted Han,
          <string-name>
            <given-names>Jing</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Markus</given-names>
            <surname>Lumpe</surname>
          </string-name>
          , Hayden Melton, and
          <string-name>
            <given-names>James</given-names>
            <surname>Noble</surname>
          </string-name>
          .
          <article-title>Qualitas Corpus: A Curated Collection of Java Code for Empirical Studies</article-title>
          .
          In
          <source>Asia Pacific Software Engineering Conference (APSEC 2010)</source>
          , pages
          <fpage>336</fpage>
          -
          <lpage>345</lpage>
          ,
          <year>December 2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [TC11]
          Robert Tairas and Jordi Cabot.
          <article-title>Cloning in DSLs: Experiments with OCL</article-title>
          . In Anthony M. Sloane and Uwe Aßmann, editors,
          <source>Revised Selected Papers of the Fourth International Conference on Software Language Engineering</source>
          , volume
          <volume>6940</volume>
          of LNCS, pages
          <fpage>60</fpage>
          -
          <lpage>76</lpage>
          . Springer,
          <year>2011</year>
          . doi:10.1007/978-3-642-28830-2_4 .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [Zak11]
          <string-name>
            <given-names>Nicholas C.</given-names>
            <surname>Zakas</surname>
          </string-name>
          . Disallow !important,
          <year>2011</year>
          . https://github.com/CSSLint/csslint/wiki/Disallow-!important .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [Zay15]
          <string-name>
            <given-names>Vadim</given-names>
            <surname>Zaytsev</surname>
          </string-name>
          .
          <article-title>Grammar Zoo: A Corpus of Experimental Grammarware</article-title>
          .
          <source>Fifth Special issue on Experimental Software and Toolkits of Science of Computer Programming (SCP EST5)</source>
          ,
          <volume>98</volume>
          :
          <fpage>28</fpage>
          -
          <lpage>51</lpage>
          ,
          February
          <year>2015</year>
          . doi:10.1016/j.scico.2014.07.010 .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>