<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Adjective-Noun Placement Variation in French Twitter Corpora</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Joan Palmiter Bajorek</string-name>
          <email>jpbajorek@ucdavis.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of California</institution>
          ,
          <addr-line>Davis One Shields Ave, Davis, CA 95616</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <fpage>7</fpage>
      <lpage>16</lpage>
      <abstract>
        <p>French written and spoken corpora reveal that variation of adjective-noun placement occurs more often within written corpora than in spoken ones (Lightfoot, 1979; Prévost, 2009; Thuilier, 2013). I examine adjective lemma variation of placement, adjective categories, frequency, and syllable length as a means of assessing current Twitter trends of adjective placement. In this preliminary study, a cross-section of 30 adjective lemmas within a corpus of approximately 6,000 French tweets demonstrates that parallels exist between the usage of adjectives on Twitter and norms of spoken French.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        This article examines French adjective-noun
placement in the determiner phrase within
written, spoken, and Twitter corpora.
Placement of French adjectives in the
determiner phrase has long been controversial
and misconstrued within the literature
        <xref ref-type="bibr" rid="ref11 ref2">(Benzitoun, 2014; Grevisse &amp; Goosse, 2011)</xref>
        .
It is generally accepted that “[i]n Romance
languages such as French and Spanish,
postnominal adjectives are the rule rather than
the exception”
        <xref ref-type="bibr" rid="ref1">(Alexiadou, Haegeman, &amp;
Stavrou, 2007)</xref>
        .
      </p>
      <p>
        However, contemporary written and spoken
corpora indicate that adjective placement has
greater flexibility and ambiguity than has
previously been understood. Statistical analysis
of corpora indicates that adjective placement is
more restrictive in spoken usage than written
usage by a large margin
        <xref ref-type="bibr" rid="ref30">(Thuilier, 2013)</xref>
        . This
research investigates adjective placement in
French tweets through the examination of the
Twitter timelines of two French users, Le Monde
and Cyprien. It is hypothesized that French
tweets will display higher levels of adjective
placement restriction than Thuilier’s written
and spoken corpora findings. This may be
attributed to Twitter’s character limit and to the
curt, speech-like treatment of language on the
social network and in online usage of
language more generally.
</p>
      <sec id="sec-1-1">
        <title>2 French Adjectives</title>
        <p>
          In French, attributive adjectives (“adjectifs
épithètes”) can be placed before or after the noun, as
seen in (1)
          <xref ref-type="bibr" rid="ref11 ref14">(Grevisse &amp; Goosse, 2011; Laenzlinger,
2005)</xref>
          , an example modified from
          <xref ref-type="bibr" rid="ref30">(Thuilier, 2013)</xref>
          .
          <preformat>(1) a. un beau requin
       a beautiful shark
    b. un requin sympa
       a shark nice</preformat>
In example (1a), the adjective “beau” is found in the
prenominal position and modifies the noun
“requin.” Conversely, in (1b), the adjective
“sympa” is postnominal
          <xref ref-type="bibr" rid="ref1 ref10 ref30">(Alexiadou et al., 2007;
Delbecque, 1990; Thuilier, 2013)</xref>
          . French
adjectives are widely believed to exist in one of
three categories: fixed prenominal, fixed
postnominal, or accepted alternator, with
recognized semantic shift attributed to either
position.
        </p>
        <p>
          Although these categories are frequently used,
“[i]t has proven notoriously difficult to define the
functions of the two positions”
          <xref ref-type="bibr" rid="ref23">(Sleeman, Van de
Velde, &amp; Perridon, 2014)</xref>
          and variation is often
attributed to semantic shifts
          <xref ref-type="bibr" rid="ref1 ref14">(Alexiadou et al.,
2007; Laenzlinger, 2005)</xref>
          . For example, the
adjective “brave” means good or decent in
prenominal position, but courageous or brave in
postnominal position. In this way, positions are correlated
with meaning. Yet Thuilier (2013), a native speaker of
French, provides several examples of exceptions among
words that are considered regular alternators
with specified meanings correlated to positions.
Furthermore, many adjectives retain the
same meaning regardless of placement, such as the
adjective “charmant,” meaning charming
          <xref ref-type="bibr" rid="ref30">(Thuilier,
2013)</xref>
          . In many ways, this variation in placement
is thus a choice of the user and depends heavily on
the usage context. It should be noted that the
primary concern of this research is variation in
placement and patterns of adjective use in corpora rather
than semantic interpretations, which are outside the
scope of this study.
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2.1 Syntactic Theory</title>
      <p>
        In considering the syntactic basis of adjective
placement, modern frameworks explain the
variation through Branching Direction Theory
        <xref ref-type="bibr" rid="ref26">(Song, 2012)</xref>
        . It is posited that nouns move
leftward and up the syntactic tree with
“pied-piping” and “snowballing” movement and interact
with linear and mirror image matching among
languages
        <xref ref-type="bibr" rid="ref14 ref29 ref7 ref8">(Cinque, 1994, 2010; Laenzlinger, 2005;
Stavrou, 2012)</xref>
        . However, much is left to be
desired in these theoretical understandings of
syntactic underpinnings. If adjectives alternate
in position and undergo semantic shifts tied to
those positions, why would the noun moving up and
down the tree account for the adjective’s semantic
change
        <xref ref-type="bibr" rid="ref1">(Alexiadou et al., 2007)</xref>
        ? How does this
theory account for languages that display varied
adjective-noun ordering systems? In the words of
Alexiadou et al. (2007), “[a] question that remains open is
why various languages have to resort to different
means (order, excessive articles, morphology) in
order to encode different interpretations of
adjective-noun combinations” (parentheses in the
original). Due to the recent growth in
Optimality Theory
        <xref ref-type="bibr" rid="ref26">(Song, 2012)</xref>
        , there is hope that
elegant, more fully explanatory theories can be
developed.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2.2 Historical Background</title>
      <p>
        French evolved from Latin, a language with no
fixed adjective placement
        <xref ref-type="bibr" rid="ref23">(Sleeman et al.,
2014)</xref>
        . Between the 13th and 19th centuries,
adjective-noun placement shifted from
prenominal preference in Old French to
postnominal dominance in Modern French
        <xref ref-type="bibr" rid="ref23 ref3">(Boucher, 2006; Sleeman et al., 2014)</xref>
        .
      </p>
      <p>
        Several theories postulate reasons for this
phenomenon. According to Glatigny,
“preposed adjectives belong in very great
majority to the ancient foundations of the
language” (translated by the author
        <xref ref-type="bibr" rid="ref30">(Thuilier,
2013)</xref>
        ). The shift may have resulted from the
influence and exposure of French and
Romance languages to Germanic languages
        <xref ref-type="bibr" rid="ref11">(Grevisse &amp; Goosse, 2011)</xref>
        . Yet some reject
these “influenced by another language”
interpretations due to the lack of documented
evidence
        <xref ref-type="bibr" rid="ref16">(Lightfoot, 1979)</xref>
        . Another theory
indicates that prenominal adjectives have
marked positions that license specialized
readings and greater frequency
        <xref ref-type="bibr" rid="ref15">(Ledgeway,
2012)</xref>
        . Evidence for this theory comes from the
stronger syntactic bond of prenominal
adjectives to nouns as compared to the
relationship of their postnominal counterparts
and nouns. The phonological connections
between prenominal adjectives and nouns are
demonstrated through the use of liaisons and
irregular adjective agreements
        <xref ref-type="bibr" rid="ref15">(Ledgeway,
2012)</xref>
        .
      </p>
    </sec>
    <sec id="sec-4">
      <title>3 Usage Norms in Corpora</title>
      <p>For decades, newspapers, literary texts, university
essays, and spoken corpora have been analyzed for
their adjective placement norms. While patterns
arise, gaps in information and a lack of uniformity in
analysis persist in the research. Most studies cite
raw counts of adjectives found to be prenominal or
postnominal. They do not take into account
adjective lemma repetition, aspects of frequency,
or the possibility of alternation of position.
Recognizing the importance of the source of
corpora, Delbecque (1990) writes that prenominal and
postnominal proportions of placement are “mainly
to be attributed to the text genre.” Echoing
this view, Thuilier (2013) writes that intense
differences exist according to the type of production.
</p>
    </sec>
    <sec id="sec-5">
      <title>3.1 Literary Texts</title>
      <p>
        In Wydler’s study of the famous text “La Chanson
de Roland,” 70% of overall adjectives were
prenominal
        <xref ref-type="bibr" rid="ref16 ref4">(Gerard J. Brault, 1978; Lightfoot,
1979)</xref>
        . “The Song of Roland” is the written version
of the epic tale that was once performed orally,
though its precise origins are clouded in
“considerable speculation”
        <xref ref-type="bibr" rid="ref5 ref8">(G.J. Brault, 2010)</xref>
        .
Delbecque notes that modern French novels
demonstrate the same proportional split in
placement, but provides no data to support the
claim (1990).
      </p>
      <p>
        In another analysis of French adjectives, Wagner
separated adjectives into three basic categories.
“Cardinals” are basic, essential, and high
frequency adjectives throughout the French
language. “Populaires” are mainstream,
colloquial, and widely spoken adjectives. Finally,
“savants” are “learned” adjectives of scholarly
registers (translations by the author
        <xref ref-type="bibr" rid="ref16">(Lightfoot,
1979)</xref>
        ). For clarity in the following section, these
three categories will be referred to as Frequent,
Colloquial, and Scholarly. In an examination of
13th and 14th century prose, Wagner found that
Frequent adjectives strongly favored prenominal
positions: 2,393 prenominal and 11 postnominal
occurrences, a 99.5% to 0.5% split
        <xref ref-type="bibr" rid="ref16">(Lightfoot,
1979)</xref>
        . Of Colloquial adjectives, 75% were
prenominal, whereas roughly 70% of Scholarly
adjectives were postnominal.
      </p>
      <p>
        When the Frequent and Colloquial categories
were combined, the average prenominal to
postnominal division was 95% to 5%. This
combined category was of extremely high
frequency, almost 16 times as common as the
Scholarly category
        <xref ref-type="bibr" rid="ref16">(Lightfoot, 1979)</xref>
        . It is
important to note that all of the literature cited
in this section explores the raw counts of the data
and categorization of lemmas, but does not specify
the frequency or other attributes of the adjectives.
Wagner’s study underlines the importance of
adjective categorization when considering
adjective placement. In addition, Lightfoot (1979) notes
the difference between source adjectives and
those that are “fairly recent” borrowings,
reinforcing Glatigny’s theories of ancient French
versus modern linguistic acquisitions.
      </p>
    </sec>
    <sec id="sec-6">
      <title>3.2 University Essays</title>
      <p>
        In 1980, Wilmet investigated 90 philology essays
of university students dating from 1977 to 1978
        <xref ref-type="bibr" rid="ref10">(Delbecque, 1990)</xref>
        . Of the 3,835 adjectives found,
Wilmet reported that 2.3% were exclusively
prenominal, 16.9% alternated in position, and
80.8% were exclusively postnominal (1980).
Wilmet did not account for categories of
adjectives, but what can be interpreted from this
study is the propensity for the work of university
students, who focus their studies on literary
criticism, history, and linguistics, to use literary,
worked, polished language that displays
postnominal preferences, an idea supported by
Thuilier’s work (2013).
      </p>
    </sec>
    <sec id="sec-7">
      <title>3.3 Newspapers</title>
      <p>
        Postnominal preference is also found in most of the
studies in newspaper corpora. Five studies of
French adjective placement in newspapers
published between 1911 and 1978 by Damourette
and Pichon, Hug, and Forsgren returned
homogeneous results of 65% postnominal
placement. With an average of 3,000 adjectives per
study, all of this research compared raw counts of
adjective-noun placement
        <xref ref-type="bibr" rid="ref10">(Delbecque, 1990)</xref>
        .
      </p>
    </sec>
    <sec id="sec-8">
      <title>3.4 Spoken and Modern Corpora</title>
      <p>
        Contemporary analyses provide more thorough and
nuanced approaches to corpora studies as seen
through the work of
        <xref ref-type="bibr" rid="ref30">Thuilier (2013)</xref>
        . To avoid
redundant citations, all references to Thuilier in this
section are to the 2013 work. Thuilier compiled a corpus of over
1.3 million French words from written and spoken
French. The written corpus was composed of Le
Monde newspaper articles dating from 1989 to 1993
sourced from the French Treebank (FTB). For the
spoken corpus, Thuilier used the 2005 edition of
the CORAL Romance corpus of transcribed
speech. Thuilier’s data reveal differing placement norms
between French written and spoken usage.
      </p>
      <p>Spoken data displayed placement that was more
rigid and more likely to follow prescriptivist norms
than written language. In explanation, Thuilier
hypothesizes that written language is “travaillée,”
worked and polished, and therefore may have
greater opportunity for flexibility, nuance, and
variations of style (translation by the author).
Conversely, it was hypothesized that spoken
language is spontaneous and instinctively tends to
follow mainstream norms.</p>
      <p>Thuilier’s corpus comprised 1,750 adjective
lemmas of which 170 were found to alternate in
position. Oral corpora lemmas demonstrated a
prenominal preference of 74% while their written
corpora counterparts demonstrated a prenominal
preference of 67%.</p>
      <p>Additionally, Thuilier coined a term about the
syllable length patterning demonstrated by the
data, “court avant long,” short before long
(translated by the author). Prenominal adjectives
were frequently monosyllabic and “adjectives of
one syllable are more than 80% in anteposition”
(translated by the author).</p>
      <p>From these studies, and especially
Thuilier’s work, further experiments can be
designed to investigate several factors in
adjective-noun placement in French corpora. The percentages
quoted in the sections above are summarized in Table 1.
The following factors have been previously
demonstrated to be significant in classifying
French adjective placement: lemmas as a better
gauge of corpora rather than raw counts of
adjectives, variation of placement, categorization
of adjectives, frequency, origin of corpora, and
syllable length.
</p>
    </sec>
    <sec id="sec-9">
      <title>3.5 Twitter</title>
      <p>
        Twitter is a real-time social networking and
microblogging service
        <xref ref-type="bibr" rid="ref17">(Lomicka &amp; Lord, 2012)</xref>
        .
Users can read, post, and repost messages called
“tweets.” Tweets are limited to 140 characters and
are retained on a user’s profile page, called a
“timeline”
        <xref ref-type="bibr" rid="ref17">(Lomicka &amp; Lord, 2012)</xref>
        .
      </p>
      <p>
        A global phenomenon, Twitter has 288 million
users as of 2015
        <xref ref-type="bibr" rid="ref19 ref27 ref28">(Popper, 2014; Statistica, 2015)</xref>
        .
While the company is American and based in San
Francisco, 77% of Twitter accounts are outside of
the United States and over 35 Twitter company
offices are outside of the United States, including
one in Paris
        <xref ref-type="bibr" rid="ref19 ref25 ref27 ref31 ref32 ref33 ref34 ref35">(Twitter, 2015a)</xref>
        . Despite its American
origins, which might imply high volumes of
English language tweets, only 34% of tweets are in
English
        <xref ref-type="bibr" rid="ref24">(Smith, 2015)</xref>
        . The number of Twitter users in France
tripled from 2009 to 2013
        <xref ref-type="bibr" rid="ref27">(Statistica, 2013)</xref>
        .
      </p>
    </sec>
    <sec id="sec-10">
      <title>4 Defining the Corpus</title>
      <p>
        Approximately 6,000 tweets were analyzed from
the Twitter feeds of the French newspaper Le
Monde (@lemondefr) and the French comedian
Cyprien (@MonsieurDream). These users were
chosen because of 1) their comparable four million
Twitter followers
        <xref ref-type="bibr" rid="ref19 ref25 ref27 ref31 ref32 ref33 ref34 ref35">("Twitter statistics for France,"
2015)</xref>
        , 2) Le Monde’s connection to Thuilier’s
(2013) and
        <xref ref-type="bibr" rid="ref36">Wilmet’s (1980</xref>
        ) corpora studies, and 3)
their having the 2nd and 3rd largest audiences on
Twitter in France. The top 24 French Twitter
accounts are attributed to men or groups
        <xref ref-type="bibr" rid="ref25">(Socialbakers, 2015)</xref>
        ; thus, the Twitter timelines
analyzed in this study are a sample of the most
popular, mainstream content on French Twitter.
      </p>
    </sec>
    <sec id="sec-11">
      <title>4.1 Le Monde</title>
      <p>
        Launched in March 2009, Le Monde’s Twitter
timeline reports events related to
France, the world, and politics. Written by
journalists Pauline Croquet and Clément Martel,
the timeline has over 132,000 tweets and over
6,000 photos and videos
        <xref ref-type="bibr" rid="ref12 ref13 ref19 ref24 ref25 ref27 ref31 ref32 ref33 ref34 ref35">(Twitter, 2015c)</xref>
        . As the
online presence of one of the largest French
newspapers, the tweets from this source are
objective, political, newsworthy, and diplomatic in
nature.
      </p>
    </sec>
    <sec id="sec-12">
      <title>4.2 Cyprien</title>
      <p>
        While Cyprien has significantly fewer tweets than
the Le Monde timeline, Cyprien’s popularity was
determined to be a more important factor than
number of tweets. Launched in June 2007,
MonsieurDream has over 10,000 tweets and is
written by Cyprien Iov who goes by Cyprien
        <xref ref-type="bibr" rid="ref19 ref25 ref27 ref31 ref32 ref33 ref34 ref35">(Twitter, 2015d)</xref>
        . Cyprien is a prolific 24-year-old
French blogger, illustrator, comedian, and poster of
videos on YouTube.com
        <xref ref-type="bibr" rid="ref12 ref13 ref19">(Iov, 2015a, 2015b)</xref>
        . His
work ranges from the banal to the political,
blogging about the experience of moving
apartments, gay marriage, and general
self-promotion. Cyprien has more Twitter followers
than the Twitter accounts of movie celebrities, for
example Gad Elmaleh with 300,000 fewer
followers
        <xref ref-type="bibr" rid="ref25">(Socialbakers, 2015)</xref>
        . As a young
comedian, Cyprien’s humor is colloquial,
sensationalized, dramatic, and irreverent in nature.
His most popular YouTube video is a rap song
in which Cyprien responds to a man who criticized his
jokes
        <xref ref-type="bibr" rid="ref9">(Cyprien, 2011)</xref>
        . As of May 2016, the video
has almost 37 million views and includes jokes
about sex, online reputation, clothing style, and
poor education, concluding with a joke about
killing handicapped people
        <xref ref-type="bibr" rid="ref9">(Cyprien, 2011)</xref>
        .
      </p>
      <p>By juxtaposing the linguistic trends from these
two users, adjective placement norms on Twitter
can be examined. It was hypothesized that French
tweets would display higher levels of adjective
placement restriction than Thuilier’s corpora
findings (2013). It was expected that Cyprien’s
usage of language would demonstrate even greater
restriction in adjective placement than Le Monde’s
tweets due to the higher register and scholastic
nature of the latter’s content.
</p>
    </sec>
    <sec id="sec-13">
      <title>5 Methodology</title>
      <p>The research examined the placement norms of
Le Monde and Cyprien on Twitter to determine
whether the Twitter corpora more closely resembled written
or spoken corpora norms. In addition, questions of
distribution, alternation, syllable length and
frequency were considered in the data collection
and analysis.</p>
      <p>
        Tweets were downloaded from Twitter using
Python scripts and the Python module “Tweepy”
        <xref ref-type="bibr" rid="ref21">(Roesslein, 2009)</xref>
        . To access the tweets, a Twitter
account was created with a corresponding Twitter
website giving the user access to personalized
Application Program Interface (API) credentials.
Modified from scripts posted on GitHub
        <xref ref-type="bibr" rid="ref37">(Yanofsky, 2013)</xref>
        , roughly 6,000 tweets were
downloaded from Twitter users Le Monde and
MonsieurDream. Due to restrictions on tweet
access to the general public, roughly 3,200 tweets
were downloadable per Twitter timeline. Each user
has a homepage, a unique page of their tweets and
retweets called a “timeline,” synonymous with the
term “Twitter feed”
        <xref ref-type="bibr" rid="ref19 ref25 ref27 ref31 ref32 ref33 ref34 ref35">(Twitter, 2015b)</xref>
        . The scripts
created CSV files containing each tweet’s publication date and
text content. Due to the prolific tweeting of Le
Monde, all tweets analyzed were published
between April 1st, 2015 and May 4th, 2015. For the
Cyprien corpus, the tweets from his timeline were
published between December 9th, 2009 and May
2nd, 2015. Hereafter for reasons of clarity, the
corpora will be termed “Le Monde” and “Cyprien”
and will no longer be italicized.
      </p>
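      <p>The download step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scripts used in the study: the names <monospace>collect_timeline</monospace> and <monospace>rows_to_csv</monospace> are hypothetical, and <monospace>fetch_page</monospace> is a stand-in for whatever call returns one page of a timeline (in practice it might wrap a Tweepy timeline request). The sketch pages backward through a timeline until the cap of roughly 3,200 tweets is reached and keeps only the publication date and text of each tweet.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import csv
import io

TIMELINE_CAP = 3200  # approximate per-timeline limit reported in the text


def collect_timeline(fetch_page, limit=TIMELINE_CAP):
    """Page backward through a timeline until `limit` tweets are collected.

    `fetch_page(max_id)` is a hypothetical callable that returns a list of
    (tweet_id, created_at, text) tuples, newest first, with ids no greater
    than `max_id` (or the newest tweets when `max_id` is None).
    """
    rows = []
    max_id = None
    while limit > len(rows):
        page = fetch_page(max_id)
        if not page:
            break  # no older tweets are available
        rows.extend(page)
        max_id = page[-1][0] - 1  # continue just below the oldest id seen
    return rows[:limit]


def rows_to_csv(rows):
    """Serialize the publication date and text of each tweet as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["created_at", "text"])
    for _tweet_id, created_at, text in rows:
        writer.writerow([created_at, text])
    return buffer.getvalue()
```
]]></preformat>
      <p>Passing the page-fetching function as an argument keeps the paging logic independent of any particular client library, so the same sketch can be exercised with stored data.</p>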
      <p>For the scope of this study, 30 adjective lemmas
were investigated in the Twitter corpus.
These lemmas were chosen based on
their prevalence in the linguistic literature and
were placed into categories: High Frequency,
Colors, Contemporary/Neologisms, Famous
Alternators, and Others (see Appendix A).</p>
      <p>
        Tweets containing the specific adjective lemmas
were appended to separate CSV files to isolate all
instances of the adjective. Due to the added
complexity of multiple adjectives, see mirror
imaging
        <xref ref-type="bibr" rid="ref7 ref8">(Cinque, 1994, 2010)</xref>
        , only tweets with a
single adjective were analyzed. Tallies were
created of how many instances were found for each
adjective lemma within the two Twitter feeds.
      </p>
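      <p>The isolation and tallying of lemma instances can be illustrated with a short sketch. It is offered under assumptions rather than as the procedure actually used: the list of inflected forms for “nouveau” and the function names are hypothetical, and the sketch only finds tweets containing a form of a lemma. It does not exclude tweets with several adjectives, nor does it decide whether an adjective sits in a determiner phrase, both of which are described in the text as separate steps.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

# Hypothetical inflected forms for one lemma; the study does not list the
# forms that were searched, so this set is an assumption for illustration.
NOUVEAU_FORMS = ["nouveau", "nouvel", "nouvelle", "nouveaux", "nouvelles"]


def compile_lemma_pattern(forms):
    """Build a case-insensitive whole-word pattern matching any listed form."""
    alternatives = "|".join(re.escape(form) for form in forms)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def tweets_with_lemma(tweets, forms):
    """Return the tweets that contain at least one form of the lemma."""
    pattern = compile_lemma_pattern(forms)
    return [text for text in tweets if pattern.search(text)]


def tally_lemmas(tweets, lemma_forms):
    """Count, for each lemma, how many tweets contain one of its forms."""
    return {
        lemma: len(tweets_with_lemma(tweets, forms))
        for lemma, forms in lemma_forms.items()
    }
```
]]></preformat>
      <p>Matching on word boundaries keeps derived words such as “nouveautés” from being counted as instances of the adjective.</p>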
      <p>These tallies were then sorted to identify
adjectives found in a determiner phrase. Those
that fit the correct syntactic environment, the
determiner phrase, were annotated as
prenominal or postnominal. Tallies were noted
for all the above values. In Python, a chi-square
test was conducted on the raw counts of the 30
adjectives found in the determiner phrase of either
timeline to determine whether the differences were statistically significant.
</p>
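      <p>A chi-square test of this kind can be sketched in plain Python. The sketch assumes a 2x2 layout (corpus by position), which the text does not specify, and uses placeholder counts that are not the study’s data. It computes the Pearson statistic without a continuity correction, so its output may differ from library routines that apply one by default. The function name <monospace>chi_square_2x2</monospace> is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    `table` is [[a, b], [c, d]]: rows are corpora, columns are positions.
    Returns (statistic, p_value) with one degree of freedom and no
    continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With one degree of freedom the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Placeholder counts for illustration only; they are NOT the study's data.
# Rows: two corpora; columns: prenominal, postnominal.
example_counts = [[90, 10], [60, 40]]
statistic, p_value = chi_square_2x2(example_counts)
```
]]></preformat>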
    </sec>
    <sec id="sec-13-findings">
      <title>6 Findings</title>
      <p>
6,406 tweets were data-mined from the Cyprien
and Le Monde Twitter feeds. After isolating the 30
adjective lemmas, 1,181 adjective occurrences
were analyzed. Roughly 82% of these single
adjectives were found in a determiner phrase or
noun phrase, a total of 949 adjective occurrences.
Some adjective lemmas were not attested within
the corpora. In the analysis of the data, variation in
placement, categorization of adjectives, adjective
frequency, syllable length and origin of the corpora
were analyzed. The following analysis considers
the Le Monde and Cyprien datasets separately as
well as combined, to consider the implications of
Twitter as a usage context.</p>
      <p>It is crucial to note that despite the 6,406 tweets
data-mined, the resulting number of adjectives found in
determiner phrases was lower than initially
anticipated. It is therefore recognized that the findings
from this corpus must be considered
preliminary and that they offer recommendations for further
investigations.
</p>
    </sec>
    <sec id="sec-14">
      <title>6.1 Placement Variation</title>
      <p>Of adjectives found within the corpora, 8% of
adjective lemmas in Cyprien’s tweets and 30% of
adjective lemmas in Le Monde’s tweets varied in
placement (Figure 1).</p>
      <p>
        The lemmas that alternated in this corpus
demonstrated a prenominal preference, with an
average 80%/20% prenominal/postnominal
split. The distribution of this variation ranged from
60% to 99.67% prenominal placement. These
findings are in line with Thuilier’s finding that
alternators are more inclined to be prenominally
placed (2013).
      </p>
      <p>
        Trends of the placement of alternators differed
between the Cyprien and Le Monde corpora. Le
Monde’s adjectives displayed more variation in
placement whereas very few of Cyprien’s
adjectives varied in placement. Supporting the
initial hypothesis, adjective placement is more
rigid in Cyprien’s tweets than in Le Monde’s
tweets. The margin of difference in usage norms is
a significant finding of this study and would be a
strong starting point for future investigations
comparing the Twitter feeds of
news sources with the tweets of
individuals.
      </p>
      <p>
        In consideration of the initial categories of the 30
adjective lemmas, adjectives were analyzed within
their categorizations as a comparison point to
Wagner’s categories of Frequent, Colloquial, and
Scholarly, see Table 2
        <xref ref-type="bibr" rid="ref16">(Lightfoot, 1979)</xref>
        . Of the
Famous Alternators, most were
prenominally placed, though many lemmas in this category
were unattested in the
corpus. Contemporary/Neologisms and Others
displayed the greatest amount of variation in
placement within the corpus. Mirroring Wagner’s
results, High Frequency adjectives were placed
prenominally for the most part, on average 92% of
the adjective lemmas. Colors were exclusively found
in postnominal positions. This complements
Wilmet’s findings, which demonstrated that “blanc,
bleu, noir et rouge” were found postnominally
97.4% of the time
        <xref ref-type="bibr" rid="ref30">(1980; Thuilier, 2013)</xref>
        . Thus,
category of adjective was helpful for grouping
some adjectives, but not for all the categories of
the study.
      </p>
      <p>
        Adjectives used with highest frequency are
correlated with those that are prenominally placed.
Of the top 10 adjective lemmas in the Cyprien
tweets, all were prenominal dominant at an average
of 97.79% overall. Of the top 10 adjective lemmas
in the Le Monde tweets, 70% were prenominal
dominant, see Table 3. The remaining three high-frequency
adjectives in the Le Monde tweets were
exclusively postnominal: “politique,” “blanc,” and
“rouge.” As also seen in the section above,
adjectives describing colors are famously
postnominally placed
        <xref ref-type="bibr" rid="ref11">(Grevisse &amp; Goosse, 2011)</xref>
        .
Of the highest frequency adjectives from both
timelines, 6 out of 10 overlap between the Cyprien
and Le Monde tweets.
      </p>
      <p>
        In the Cyprien corpus, there was a marked
divide between the prenominal dominance of
high-frequency adjectives and the postnominal
dominance of low-frequency ones. The Le Monde corpus is
more nuanced in this separation of variables. In the
Le Monde corpus, 7 of the 10 high frequency
adjectives displayed prenominal dominance. For
the low frequency adjectives, 10 of 13 also
displayed prenominal preference, with only 3 that
were exclusively postnominal in
placement. Overall, the Le Monde corpus has a
larger number of adjectives that alternate in
placement than those that are exclusively
postnominally placed, which leads to lower levels
of stratification in frequency and placement. When
the corpora were combined, the mixed nature of
the Le Monde corpus was demonstrated.
Frequency was strongly correlated with
prenominal adjective placement in the Cyprien
corpus where it was highly stratified, but this trend
was less significant in the Le Monde corpus.
      </p>
    </sec>
    <sec id="sec-15">
      <title>6.4 Syllable Length</title>
      <p>
        The results of this study suggest that syllable count
was not a significant factor in adjective placement.
Previous literature indicated that “short before
long” patterns were observed in larger data sets
        <xref ref-type="bibr" rid="ref30">(Thuilier, 2013)</xref>
        ; however, these trends were not
observed in these data, see Appendix B. The
average prenominal
placement for lemmas with one syllable was 60%,
two syllables 82%, and three syllables 61%.
While two-syllable adjectives display prenominal
preference, this is not a major finding of this study.
      </p>
    </sec>
    <sec id="sec-16">
      <title>6.5 Twitter Corpora Overall</title>
      <p>As seen in the variation of placement, Le Monde’s
adjective lemmas demonstrated greater flexibility
of placement than Cyprien’s, which supports the
hypothesis. Le Monde’s flexibility was more in
line with written corpora norms as seen in Thuilier (2013),
while Cyprien’s more closely follows the spoken norms of
restriction.</p>
      <p>As compared to the corpora in Table 1 (Section 3.4),
these Twitter corpora demonstrated significant
stratification at a 76%/31% prenominal split. From
a raw count of the 30 adjectives found in either
position within the corpora, Cyprien’s adjective
placement was more stratified than Le Monde’s
placement (Table 4). When combining the counts in
both Twitter feeds, there was a strong tendency
toward prenominal placement: 504 prenominal to
100 postnominal, a ratio of roughly 5:1.</p>
      <p>It should be noted that the lemma “nouveau” was
removed from the raw count comparisons due to its
nature as an outlier. While the lemma posed no
problem as an example of categories, high
frequency, and syllable length, its high raw count
skews the data set. Within the Cyprien data set,
“nouveau” was found in 301 instances in the
determiner phrase, compared to all the other
adjectives with values ranging from 0 to 59
occurrences. Due to the limited scope of this article,
“nouveau” was not analyzed independently, a case
for future study.
</p>
    </sec>
    <sec id="sec-17">
      <title>7 Conclusions</title>
      <p>
        Twitter corpora displayed adjective placement
norms that were more closely aligned with spoken
corpora norms than written corpora norms when
compared to the patterns explored in
        <xref ref-type="bibr" rid="ref30">Thuilier
(2013)</xref>
        . As hypothesized, the Cyprien corpus was
more restricted in adjective placement alternation
than the Le Monde corpus. In the Le Monde
tweets, 30% of the adjectives were found in both
prenominal and postnominal positions, much
higher than the 8% alternating in the Cyprien
corpus. This suggests the greater flexibility of
adjective placement in the Le Monde corpus.
Syllable count was not a significant factor in
adjective placement, though there was a tendency
for adjectives of two syllables to be prenominally
placed. Frequency was a significant factor in
adjective placement and was highly correlated with
prenominal placement in the Cyprien corpus.
      </p>
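      <p>As a minimal sketch of how figures of this kind can be derived, the Python snippet below computes the share of adjective lemmas attested in both positions and an overall prenominal-to-postnominal ratio. The per-lemma counts in toy_counts are hypothetical placeholders, not values from the appendices; only the combined totals of 504 prenominal and 100 postnominal tokens come from this study. The function names, and the choice to leave unattested lemmas out of the denominator, are assumptions made for illustration.</p>
      <preformat>
```python
from fractions import Fraction


def alternation_rate(counts):
    """Share of attested lemmas found in BOTH prenominal and postnominal position."""
    # Assumption: lemmas with no tokens at all are left out of the denominator.
    attested = [c for c in counts.values() if c[0] + c[1] > 0]
    if not attested:
        return 0.0
    both = sum(1 for pre, post in attested if pre > 0 and post > 0)
    return both / len(attested)


def placement_ratio(prenominal, postnominal):
    """Prenominal-to-postnominal ratio, rounded to the nearest whole number."""
    return round(Fraction(prenominal, postnominal))


# Hypothetical per-lemma (prenominal, postnominal) counts, for illustration only.
toy_counts = {
    "petit": (12, 0),
    "grand": (9, 1),
    "dernier": (4, 3),
    "rouge": (0, 5),
    "ancien": (0, 0),  # unattested lemma, excluded from the rate
}

rate = alternation_rate(toy_counts)
ratio = placement_ratio(504, 100)  # combined totals reported in this study
print(f"alternating lemmas: {rate:.0%}")
print(f"prenominal to postnominal: about {ratio}:1")
```
</preformat>
      <p>With the toy counts, two of the four attested lemmas alternate, and the combined totals reduce to roughly five prenominal tokens for every postnominal one.</p>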
      <p>Limitations of this study include the small final
numbers of adjectives attested per lemma. Despite
the approximately 6,000-tweet sample size, some
adjectives occurred fewer than five times. In the future,
larger data sets would be preferred. In addition, the
choice of the 30 adjectives may have influenced the data
collected; in the future, different sampling
methods could remedy this issue. Overall, these
findings provide preliminary insight into
French adjective placement norms on Twitter.
</p>
    </sec>
    <sec id="sec-18">
      <title>7.1 Further Research</title>
      <p>Sampling: Using a part-of-speech tagger, a larger
analysis of all the adjectives in the corpora could
be conducted. Tweets could be collected over a
specific time interval rather than from user
timelines. In addition, this study showed the
longitudinal trends of specific users. Further
research could also examine cross-sectional data from
streams of tweets and isolate specific collocations
or adjective usage. Collocations of adjective-noun
pairings and a study of n-grams in the corpus could
illuminate stylistic preferences of Twitter users.
Gender and other demographic data could be
compiled to generate studies that compare different
types of Twitter users.</p>
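      <p>One way to picture the tagger-based sampling described above is the following minimal Python sketch. It assumes the tokens have already been labeled by some part-of-speech tagger using ADJ and NOUN tags; the tagged sentence is written by hand for illustration, and the function name is hypothetical. The sketch scans adjacent token pairs and records whether each adjective precedes or follows its noun.</p>
      <preformat>
```python
from collections import Counter


def adjective_noun_pairs(tagged_tokens):
    """Yield (adjective, noun, position) for adjacent ADJ/NOUN token pairs."""
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        if t1 == "ADJ" and t2 == "NOUN":
            yield (w1.lower(), w2.lower(), "prenominal")
        elif t1 == "NOUN" and t2 == "ADJ":
            yield (w2.lower(), w1.lower(), "postnominal")


# Hand-tagged toy sentence (an assumption, not real tagger output):
# "Le nouveau film français sort demain"
tagged = [
    ("Le", "DET"), ("nouveau", "ADJ"), ("film", "NOUN"),
    ("français", "ADJ"), ("sort", "VERB"), ("demain", "ADV"),
]

pairs = list(adjective_noun_pairs(tagged))
position_counts = Counter(position for _, _, position in pairs)
print(pairs)
print(position_counts)
```
</preformat>
      <p>Adjacency is a simplification: intervening adverbs or coordinated adjectives would require a wider window or a syntactic parser.</p>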
      <p>Nouveau: Further study might investigate specific
adjective lemmas, such as nouveau, through a
microanalysis of the data.</p>
      <sec id="sec-18-1">
        <title>Dialects and Languages</title>
        <p>
          While it is clear that French is spoken around the world, varieties of the
language based on location were not chosen for
this study
          <xref ref-type="bibr" rid="ref30">(Thuilier, 2013)</xref>
          , but could be analyzed
to compare language norms of this corpus.
Semantic shifts in placement were not analyzed
within this study, but could be an aspect of future
work. French is not the only language that displays
adjective-noun prenominal and postnominal
alternation; Spanish does as well
          <xref ref-type="bibr" rid="ref1 ref10">(Alexiadou et al.,
2007; Delbecque, 1990)</xref>
          . An analysis of
adjectives in Spanish tweets, or in tweets in other languages
with comparable alternation trends, could be
compared to this study. This study provides a
springboard for further adjective-noun Twitter
corpora studies.
        </p>
      </sec>
      <sec id="sec-18-2">
        <title>Appendix A</title>
        <p>Full Combined Data Sets for Prenominal and
Postnominal Placement of Adjectives found in
Determiner Phrases</p>
      </sec>
      <sec id="sec-18-3">
        <title>Appendix B</title>
      </sec>
    </sec>
    <sec id="sec-19">
      <title>Acknowledgments</title>
      <p>Thank you to the reviewers for their insightful
comments and to Dr. Raúl Aranovich, Dr. Susan
Palmiter, and Alan Wong for their remarks on earlier
versions of this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Alexiadou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haegeman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Stavrou</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Noun Phrase in the Generative Perspective</article-title>
          .
          <source>Noun Phrase in the Generative Perspective</source>
          ,
          <volume>71</volume>
          ,
          <fpage>1</fpage>
          -
          <lpage>664</lpage>
          . doi:10.1515/9783110207491
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Benzitoun</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>The place of the attributive adjective in French: what we learn from speech corpora</article-title>
          .
          <source>4e Congres Mondial De Linguistique Francaise</source>
          ,
          <volume>8</volume>
          , 16. doi:10.1051/shsconf/20140801066
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Boucher</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Mapping Function to Form: Adjective Positions in French</article-title>
          .
          <source>Lingvisticae Investigationes</source>
          ,
          <volume>29</volume>
          (
          <issue>1</issue>
          ),
          <fpage>43</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Brault</surname>
            ,
            <given-names>G. J.</given-names>
          </string-name>
          (
          <year>1978</year>
          ).
          <source>The Song of Roland</source>
          . University Park: Pennsylvania State University Press.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Brault</surname>
            ,
            <given-names>G. J.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Song of Roland: An Analytical Edition: Introduction and Commentary</article-title>
          . Pennsylvania State University Press.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Chomsky</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>1986</year>
          ).
          <article-title>Knowledge of Language: Its Nature, Origin, and Use</article-title>
          . Praeger.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Cinque</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>1994</year>
          ).
          <article-title>On the evidence for partial N-movement in the Romance DP</article-title>
          .
          <source>Paths towards universal grammar</source>
          ,
          <fpage>85</fpage>
          -
          <lpage>110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Cinque</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>The Syntax of Adjectives: A Comparative Study</article-title>
          . MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Cyprien.</surname>
          </string-name>
          (
          <year>2011</year>
          , May 21,
          <year>2016</year>
          ).
          <article-title>Cyprien répond à Cortex</article-title>
          . Retrieved from https://www.youtube.com/watch?v=dKwzZZKIbUs
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Delbecque</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>1990</year>
          ).
          <article-title>Word Order as a Reflection of Alternate Conceptual Construals in French and Spanish. Similarities and Divergences in Adjective Position</article-title>
          .
          <source>Cognitive Linguistics</source>
          ,
          <volume>1</volume>
          (
          <issue>4</issue>
          ),
          <fpage>349</fpage>
          -
          <lpage>416</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Grevisse</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Goosse</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Le bon usage : grammaire française : 75 ans</article-title>
          . Bruxelles; [Paris]: De Boeck-Duculot.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Iov</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2015a</year>
          ).
          <source>Presse</source>
          . Retrieved 30 May
          <year>2015</year>
          , from http://www.cyprien.fr/index.php/presse/
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Iov</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2015b</year>
          ).
          <source>YouTube Channel: Cyprien.fr</source>
          . Retrieved 30 May
          <year>2015</year>
          , from https://www.youtube.com/user/MonsieurDream
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Laenzlinger</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>French adjective ordering: perspectives on DP-internal movement types</article-title>
          .
          <source>Lingua</source>
          ,
          <volume>115</volume>
          (
          <issue>5</issue>
          ),
          <fpage>645</fpage>
          -
          <lpage>689</lpage>
          . doi:10.1016/j.lingua.2003.11.003
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Ledgeway</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>From Latin to Romance: morphosyntactic typology and change</article-title>
          . Oxford; New York, NY: Oxford University Press.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Lightfoot</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>1979</year>
          ). Principles of Diachronic Syntax: Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Lomicka</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Lord</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>A tale of tweets: Analyzing microblogging among language learners</article-title>
          .
          <source>System</source>
          ,
          <volume>40</volume>
          (
          <issue>1</issue>
          ),
          <fpage>48</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Matthews</surname>
            ,
            <given-names>P. H.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>Syntactic relations: A critical survey</article-title>
          (Vol.
          <volume>114</volume>
          ): Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Popper</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <source>Twitter Now Has 255 Million Users, but Activity Has Dropped Year over Year</source>
          . Retrieved 20 Mar.
          <year>2015</year>
          , from http://www.theverge.com/2014/4/29/5665752/twitter-q1-2014-earnings
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Prévost</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>The Acquisition of French: The Development of Inflectional Morphology and Syntax in L1 Acquisition, Bilingualism, and L2 Acquisition</article-title>
          . John Benjamins Publishing Company.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Roesslein</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2009</year>
          ). Tweepy Documentation. Retrieved from http://docs.tweepy.org/en/v3.2.0/
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Sinclair</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Carter</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>Trust the text: language, corpus and discourse</article-title>
          . London; New York, N.Y.: Routledge.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Sleeman</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van de Velde</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Perridon</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Adjectives in Germanic and Romance</article-title>
          . John Benjamins Publishing Company.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2015</year>
          , 20 May).
          <article-title>By the Numbers: 150 Amazing Twitter Statistics (May 2015)</article-title>
          . Retrieved 30 May
          <year>2015</year>
          , from http://expandedramblings.com/index.php/march2013-by-the-numbers-a-few-amazing-twitterstats/10/
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Socialbakers.</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <source>Most Popular Twitter Accounts in France. Twitter Statistics for France</source>
          . Retrieved 30 May
          <year>2015</year>
          , from http://www.socialbakers.com/statistics/twitter/profiles/france/
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>J. J.</given-names>
          </string-name>
          (
          <year>2012</year>
          ). Word order: Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <surname>Statistica.</surname>
          </string-name>
          (
          <year>2013</year>
          ).
          <source>Facebook, Twitter and Google Penetration 2009-2013 | France</source>
          . Retrieved 30 May
          <year>2015</year>
          , from http://www.statista.com/statistics/417082/socialnetwork-subscribers-among-internet-users-france/
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <surname>Statistica.</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <source>Social Networks: Global Sites Ranked by Users 2015</source>
          .
          <article-title>Leading Social Networks Worldwide as of March 2015, Ranked by Number of Active Users (in Millions)</article-title>
          . Retrieved 30 May
          <year>2015</year>
          , from http://www.statista.com/statistics/272014/globalsocial-networks-ranked-by-number-of-users/
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <surname>Stavrou</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>The syntax of adjectives: A comparative study</article-title>
          .
          <source>Language</source>
          ,
          <volume>88</volume>
          (
          <issue>2</issue>
          ),
          <fpage>419</fpage>
          -
          <lpage>423</lpage>
          . doi:10.1353/lan.2012.0024
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <surname>Thuilier</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Syntaxe du français parlé vs. écrit : le cas de la position de l'adjectif épithète par rapport au nom</article-title>
          .
          <source>TIPA.</source>
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <surname>Twitter.</surname>
          </string-name>
          (
          <year>2015a</year>
          ).
          <source>About Twitter, Inc.</source>
          Retrieved 30 May
          <year>2015</year>
          , from https://about.twitter.com/company
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <surname>Twitter.</surname>
          </string-name>
          (
          <year>2015b</year>
          ).
          <article-title>Help Center: What's a Twitter timeline?</article-title>
          Retrieved 04 Jun
          <year>2015</year>
          , from https://support.twitter.com/articles/164083-what-s-atwitter-timeline
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <string-name>
            <surname>Twitter.</surname>
          </string-name>
          (
          <year>2015c</year>
          ).
          <source>Le Monde.fr Twitter</source>
          . Retrieved 30 May
          <year>2015</year>
          , from https://twitter.com/lemondefr
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <surname>Twitter.</surname>
          </string-name>
          (
          <year>2015d</year>
          ).
          <source>Monsieur Dream Twitter</source>
          . Retrieved 30 May
          <year>2015</year>
          , from https://twitter.com/MonsieurDream
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          Twitter statistics for France. (
          <year>2015</year>
          ). Retrieved 20 March
          <year>2015</year>
          , from http://www.socialbakers.com/statistics/twitter/profiles/france/
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          <string-name>
            <surname>Wilmet</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>1980</year>
          ).
          <article-title>Anteposition and Postposition of Qualificative Epithets in Contemporary French</article-title>
          .
          <source>Travaux de Linguistique</source>
          ,
          <volume>7</volume>
          ,
          <fpage>179</fpage>
          -
          <lpage>201</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          <string-name>
            <surname>Yanofsky.</surname>
          </string-name>
          (
          <year>2013</year>
          , 1 Nov.
          <year>2013</year>
          ).
          <article-title>A Script to Download All of a User's Tweets into a Csv. Yanofsky/tweet_dumper.py</article-title>
          . Retrieved 30 May
          <year>2015</year>
          , from https://gist.github.com/yanofsky/5436496
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>