=Paper= {{Paper |id=Vol-1607/bajorek |storemode=property |title=Adjective-Noun Placement Variation in French Twitter Corpora |pdfUrl=https://ceur-ws.org/Vol-1607/bajorek.pdf |volume=Vol-1607 |authors=Joan Palmiter Bajorek |dblpUrl=https://dblp.org/rec/conf/clif/Bajorek16 }} ==Adjective-Noun Placement Variation in French Twitter Corpora== https://ceur-ws.org/Vol-1607/bajorek.pdf
      Adjective-Noun Placement Variation in French Twitter Corpora

                                         Joan Palmiter Bajorek
                                     University of California, Davis
                                    One Shields Ave, Davis, CA 95616
                                     jpbajorek@ucdavis.edu


                                                       French tweets through the examination of two
                    Abstract                           French Twitter timelines of users Le Monde
                                                       and Cyprien. It is hypothesized that French
    French written and spoken corpora reveal
                                                       tweets will display higher levels of adjective
    that variation of adjective-noun placement
                                                       placement restriction than Thuilier’s written
    occurs more often within written corpora
                                                       and spoken corpora findings. This may be
    than in spoken ones (Lightfoot, 1979;
                                                       attributed to Twitter’s character limit and the
    Prévost, 2009; Thuilier, 2013). I examine
                                                       curt, spoken-like treatment of language on the
    adjective lemma variation of placement,
                                                       social network and overall online usage of
    adjective categories, frequency, and
                                                       language.
    syllable length as a means of assessing
    current Twitter trends of adjective
                                                       2     French Adjectives
    placement. In this preliminary study, a
    cross-section of 30 adjective lemmas
                                                       In French, attributive adjectives “adjectifs
    within a corpus of approximately 6,000
                                                       épithètes” can be placed before or after the noun as
    French tweets demonstrates that parallels
                                                       seen in (1) (Grevisse & Goosse, 2011; Laenzlinger,
    exist between the usage of adjectives on
                                                       2005), an example modified from (Thuilier, 2013).
    Twitter and norms of spoken French.
                                                       (1)     a. un beau requin       b. un requin sympa
1   Introduction                                                 a beautiful shark          a shark nice
This article examines French adjective-noun            In example (1a), the adjective “beau” is found in the
placement in the determiner phrase within              prenominal position and modifies the noun
written, spoken, and Twitter corpora.                  “requin.” Conversely, in (1b), the adjective
Placement of French adjectives in the                  “sympa” is postnominal (Alexiadou et al., 2007;
determiner phrase has long been controversial          Delbecque, 1990; Thuilier, 2013). French
and misconstrued within the literature                 adjectives are widely believed to exist in one of
(Benzitoun, 2014; Grevisse & Goosse, 2011).            three categories: fixed prenominal, fixed
It is generally accepted that “[i]n Romance            postnominal, or accepted alternator, with
languages such as French and Spanish,                  recognized semantic shift attributed to either
postnominal adjectives are the rule rather than        position.
the exception” (Alexiadou, Haegeman, &                   Although these categories are frequently used,
Stavrou, 2007).                                        “[i]t has proven notoriously difficult to define the
  However, contemporary written and spoken             functions of the two positions” (Sleeman, Van de
corpora indicate that adjective placement has          Velde, & Perridon, 2014) and variation is oft
greater flexibility and ambiguity than has             attributed to semantic shifts (Alexiadou et al.,
previously been understood. Statistical analysis       2007; Laenzlinger, 2005). For example, the
of corpora indicates that adjective placement is       adjective “brave” means good or decent in the
more restrictive in spoken usage than written          prenominal position, but courageous or brave in the
usage by a large margin (Thuilier, 2013). This         postnominal position. In this way, positions are correlated
research investigates adjective placement in           with meaning. Yet, Thuilier, a native speaker of



French, provides several examples of exceptions of          postnominal dominance in Modern French
words that are considered as regular alternators            (Boucher, 2006; Sleeman et al., 2014).
with specified meanings correlated to positions               Several theories postulate reasons for this
(2013). Furthermore, many adjectives retain the             phenomenon.       According     to   Glatigny,
same meaning regardless of placement, such as the           “preposed adjectives belong in very great
adjective “charmant,” meaning charming (Thuilier,           majority to the ancient foundations of the
2013). In many ways, this variation in placement            language” (translated by the author (Thuilier,
thus is a choice of the user and varies widely on           2013)). The shift may have resulted from the
the usage context. It should be noted that the              influence and exposure of French and
primary concern of this research is variation in            Romance languages to Germanic languages
placement and patterns of adjective corpora rather          (Grevisse & Goosse, 2011). Yet some reject
than semantic interpretations, which are outside the        these “influenced by another language”
scope of this study.                                        interpretations due to the lack of documented
                                                            evidence (Lightfoot, 1979). Another theory
2.1   Syntactic Theory                                      indicates that prenominal adjectives have
                                                            marked positions that license specialized
In considering the syntactic basis of adjective             reading and greater frequency (Ledgeway,
placement, modern frameworks explain the                    2012). Evidence for this theory comes from the
variation through Branching Direction Theory                stronger syntactic bond of prenominal
(Song, 2012). It is posited that nouns move                 adjectives to nouns as compared to the
leftward and up the syntactic tree with “pied-              relationship of their postnominal counterparts
piping” and “snowballing” movement and interact             and nouns. The phonological connections
with linear and mirror image matching among                 between prenominal adjectives and nouns are
languages (Cinque, 1994, 2010; Laenzlinger, 2005;           demonstrated through the use of liaisons and
Stavrou, 2012). However, much is left to be                 irregular adjective agreements (Ledgeway,
desired in these theoretical understandings of              2012).
syntactic underpinnings. If adjectives alternate
in position and have semantic shifts relegated to           3     Usage Norms in Corpora
those positions, why does the noun moving up and
down the tree account for the adjective’s semantic          For decades, newspapers, literary texts, university
change (Alexiadou et al., 2007)? How does this              essays, and spoken corpora have been analyzed for
theory account for languages that display varied            their adjective placement norms. While patterns
adjective-noun ordering systems? In the words of            arise, gaps in information and uniformity of
Alexiadou et al., “[a] question that remains open is        analysis abound in the research. Most studies cite
why various languages have to resort to different           raw counts of adjectives found to be prenominal or
means (order, excessive articles, morphology) in            postnominal. They do not take into account
order to encode different interpretations of                adjective lemma repetition, aspects of frequency,
adjective-noun combinations” (parentheses                   and the possibility of alternation of position.
original (2007)). Due to the recent growth in               Recognizing the importance of the source of
Optimality Theory (Song, 2012), there is hope that          corpora, Delbecque writes that prenominal and
elegant, more fully explanatory theories can be             postnominal proportions of placement are “mainly
developed.                                                  to be attributed to the text genre” (1990). Echoing
                                                            these sentiments, Thuilier writes that intense
2.2   Historical Background                                 differences exist according to the type of production
                                                            (2013).
French evolved from Latin, a language with no
fixed adjective placement (Sleeman et al.,                  3.1    Literary Texts
2014). Between the 13th and 19th centuries,
adjective-noun placement shifted from                       In Wydler’s study of the famous text “La Chanson
prenominal preference in Old French to                      de Roland,” 70% of overall adjectives were
                                                            prenominal (Gerard J. Brault, 1978; Lightfoot,



1979). “The Song of Roland” is the written version            80.8% were exclusively postnominal (1980).
of the epic tale that was once performed orally,              Wilmet did not account for categories of
though its precise origins are clouded in                     adjectives, but what can be interpreted from this
“considerable speculation” (G.J. Brault, 2010).               study is the propensity for the work of university
Delbecque notes that modern French novels                     students, who focus their studies on literary
demonstrate the same proportional split in                    criticism, history, and linguistics, to use literary,
placement, but provides no data to support the                worked, polished language that displays
claim (1990).                                                 postnominal preferences, an idea supported by
  In another analysis of French adjectives, Wagner            Thuilier’s work (2013).
separated adjectives into three basic categories.
“Cardinals” are basic, essential, and high                    3.3   Newspapers
frequency adjectives throughout the French
language. “Populaires” are mainstream,                        Postnominal preference is also found in most of the
colloquial, and widely spoken adjectives. Finally,            studies in newspaper corpora. Five studies of
“savants” are “learned” adjectives of scholarly               French adjective placement in newspapers
registers (translations by the author (Lightfoot,             published between 1911 and 1978 by Damourette
1979)). For clarity in the following section, these           and Pichon, Hug, and Forsgren returned
three categories will be referred to as Frequent,             homogenous results of 65% postnominal
Colloquial, and Scholarly. In an examination of               placement. With an average of 3,000 adjectives per
13th and 14th century prose, Wagner found that                study, all of this research compared raw counts of
Frequent adjectives strongly favored prenominal               adjective-noun placement (Delbecque, 1990).
positions: 2,393 prenominal and 11 postnominal
occurrences, a 99.5% to 0.5% split (Lightfoot,                3.4   Spoken and Modern Corpora
1979). 75% of Colloquial adjectives were
prenominal whereas roughly 70% of Scholarly                   Contemporary analyses provide more thorough and
adjectives were postnominal.                                  nuanced approaches to corpora studies as seen
   When the Frequent and Colloquial categories                through the work of Thuilier (2013). For reduced
were combined, the average prenominal to                      redundancies, section 2.8, refers to Thuilier’s 2013
postnominal division was 95% to 5%. This                      work. In 2013, Thuilier compiled a corpus of over
combined category was of extremely high                       1.3 million French words from written and spoken
frequency, almost 16 times as common as the                   French. The written corpus was composed of Le
Scholarly category (Lightfoot, 1979). It is                   Monde newspaper articles dating from 1989-1993
important to note that all of the literature cited            sourced from the French Treebank (FTB). For the
in this section explores the raw counts of the data           spoken corpus, Thuilier used the 2005 edition of
and categorization of lemmas, but does not specify            the CORAL Romance corpus of transcribed
the frequency or other attributes of the adjectives.          speech. Thuilier’s data reveals placement norms
Wagner’s study underlines the importance of                   between the French written and spoken usages.
adjective     categorization    when       considering           Spoken data displayed placement that was more
adjective placement. In addition, Lightfoot notes             rigid and more likely to follow prescriptivist norms
that the difference between source adjectives and             than written language. In explanation, Thuilier
those that are “fairly recent” borrowings                     hypothesizes that written language is “travaillée,”
reinforces Glatigny’s theories of ancient French              worked and polished, and therefore may have
versus modern linguistic acquisitions (1979).                 greater opportunity for flexibility, nuance, and
                                                              variations of style (translation by the author).
3.2   University Essays                                       Conversely, it was hypothesized that spoken
                                                              language is spontaneous and its instinctive tendency
In 1980, Wilmet investigated 90 philology essays              follows mainstream norms.
of university students dating from 1977 to 1978                  Thuilier’s corpus comprised 1,750 adjective
(Delbecque, 1990). Of the 3,835 adjectives found,             lemmas of which 170 were found to alternate in
Wilmet reported that 2.3% were exclusively                    position. Oral corpora lemmas demonstrated a
prenominal, 16.9% alternated in position, and                 prenominal preference of 74% while their written



corpora counterparts demonstrated a prenominal               are retained on a user’s profile page, called a
preference of 67%.                                           “timeline” (Lomicka & Lord, 2012).
   Additionally, Thuilier coined a term about the               A global phenomenon, Twitter has 288 million
syllable length patterning demonstrated by the               users as of 2015 (Popper, 2014; Statistica, 2015).
data, “court avant long,” short before long                  While the company is American and based in San
(translated by the author). Prenominal adjectives            Francisco, 77% of Twitter accounts are outside of
were frequently monosyllabic and “adjectives of              the United States and over 35 Twitter company
one syllable are more than 80% in anteposition”              offices are outside of the United States, including
(translated by the author).                                  one in Paris (Twitter, 2015a). Despite its American
                                                             origins, which might imply high volumes of
                                                             English language tweets, only 34% of tweets are in
                                                             English (Smith, 2015). Twitter users from France
                                                             tripled from 2009 to 2013 (Statistica, 2013).

                                                             4     Defining the Corpus

                                                             Approximately 6,000 tweets were analyzed from
                                                             the Twitter feeds of the French newspaper Le
                                                             Monde (@lemondefr) and the French comedian
                                                             Cyprien (@MonsieurDream). These users were
                                                             chosen because of 1) their comparable four million
                                                             Twitter followers ("Twitter statistics for France,"
                                                             2015), 2) Le Monde’s connection to Thuilier’s
                                                             (2013) and Wilmet’s (1980) corpora studies, and 3)
                                                             the users have the 2nd and 3rd largest audiences of
                                                             Twitter in France. The top 24 French Twitter
                                                             accounts are attributed to men or groups
                                                             (Socialbakers, 2015); thus, the Twitter timelines
  Table 1: Summary of Literature Review Percentages          analyzed in this study are a sample of the most
                                                             popular, mainstream content on French Twitter.
   From these studies, and especially from
Thuilier’s work, further experiments can be
                                                             4.1    Le Monde
designed to investigate several factors in adjective-
noun placement in French corpora. To summarize
                                                             Launched in March 2009, Le Monde’s Twitter
the above sections succinctly, the percentages
                                                             timeline feed of tweets reports events related to
quoted in this section are summarized in Table 1.
                                                             France, the world, and politics. Written by
The following factors have been previously
                                                             journalists Pauline Croquet and Clément Martel,
demonstrated to be significant in classifying
                                                             the timeline has over 132,000 tweets and over
French adjective placement: lemmas as a better
                                                             6,000 photos and videos (Twitter, 2015c). As the
gauge of corpora rather than raw counts of
                                                             online presence of one of the largest French
adjectives, variation of placement, categorization
                                                             newspapers, the tweets from this source are
of adjectives, frequency, origin of corpora, and
                                                             objective, political, newsworthy, and diplomatic in
syllable length.
                                                             nature.
3.5   Twitter
                                                             4.2    Cyprien
Twitter is a real-time social networking and
microblogging service (Lomicka & Lord, 2012).                While Cyprien has significantly fewer tweets than
Users can read, post, and repost messages called             the Le Monde timeline, Cyprien’s popularity was
“tweets.” Tweets are limited to 140 characters that          determined to be a more important factor than
                                                             number of tweets. Launched in June 2007,




MonsieurDream has over 10,000 tweets and is                 access to the general public, roughly 3,200 tweets
written by Cyprien Iov who goes by Cyprien                  were downloadable per Twitter timeline. Each user
(Twitter, 2015d). Cyprien is a prolific 24-year-old         has a homepage, a unique page of their tweets and
French blogger, illustrator, comedian, and poster of        retweets called a “timeline,” synonymous with the
videos on YouTube.com (Iov, 2015a, 2015b). His              term “Twitter feed” (Twitter, 2015b). The scripts
work ranges from the banal to the political,                created csv files of the tweet’s date published and
blogging about the experience of moving                     text content. Due to the prolific tweeting of Le
apartments, gay marriage, and general self-                 Monde, all tweets analyzed were published
promotion. Cyprien has more Twitter followers               between April 1st, 2015 and May 4th, 2015. For the
than the Twitter accounts of movie celebrities, for         Cyprien corpus, the tweets from his timeline were
example Gad Elmaleh with 300,000 fewer                      published between December 9th, 2009 and May
followers (Socialbakers, 2015). As a young                  2nd, 2015. Hereafter for reasons of clarity, the
comedian, Cyprien’s humor is colloquial,                    corpora will be termed “Le Monde” and “Cyprien”
sensationalized, dramatic, and irreverent in nature.        and will no longer be italicized.
His most popular YouTube video is a rap song                   For the scope of this study, in the Twitter corpus
where Cyprien responds to a man who criticized his          30 adjective lemmas were investigated, see
jokes (Cyprien, 2011). As of May 2016, the video            Appendix. These lemmas were chosen based on
has almost 37 million views and includes jokes              their prevalence in the linguistic literature and
about sex, online reputation, clothing style, and           were placed into categories: High Frequency,
poor education, concluding with a joke about                Colors,     Contemporary/Neologisms,         Famous
killing handicapped people (Cyprien, 2011).                 Alternators, and Others, see Appendix A.
  By juxtaposing the linguistic trends from these             Tweets containing the specific adjective lemmas
two users, adjective placement norms on Twitter             were appended to separate csv files to isolate all
can be examined. It was hypothesized that French            instances of the adjective. Due to the added
tweets would display higher levels of adjective             complexity of multiple adjectives, see mirror
placement restriction than Thuilier’s corpora               imaging (Cinque, 1994, 2010), only tweets with a
findings (2013). It was expected that Cyprien’s             single adjective were analyzed. Tallies were
usage of language would demonstrate even greater            created of how many instances were found for each
restriction in adjective placement than Le Monde’s          adjective lemma within the two Twitter feeds.
tweets due to the higher register and scholastic              These tallies were then sorted to categorize
nature of the latter’s content.                             adjectives found in a determiner phrase. Those
                                                            that fit the correct syntactic environment, the
5   Methodology                                             determiner phrase, were annotated as
                                                            prenominal or postnominal. Tallies were noted
The research looked into the placement norms of             for all the above values. In Python, a chi-square
Le Monde and Cyprien on Twitter to determine                test was conducted to determine whether the raw
whether the corpora more closely resembled written          counts of the 30 adjectives found in the determiner
or spoken corpora norms. In addition, questions of          phrase of either timeline were statistically significant.
distribution, alternation, syllable length and
frequency were considered in the data collection            6   Findings
and analysis.
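
As a rough illustration of the chi-square test mentioned in the methodology, the following Python sketch compares prenominal and postnominal raw counts across two timelines. The counts and the function name `chi_square_2x2` are hypothetical placeholders, not the study's data. The source does not state which Python library was used, so the statistic is computed by hand here, without a continuity correction.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of raw counts.

    Returns (statistic, p_value) for 1 degree of freedom,
    with no continuity correction applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of timeline and position.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts (NOT from the study):
# rows = two timelines, columns = (prenominal, postnominal).
counts = [[90, 10], [60, 40]]
statistic, p_value = chi_square_2x2(counts)
print(f"chi2 = {statistic:.2f}, p = {p_value:.2g}")
```

A small p-value in such a test would suggest that the prenominal/postnominal split differs between the two timelines; it says nothing about individual lemmas.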
  Tweets were downloaded from Twitter using                 6,406 tweets were data-mined from the Cyprien
Python scripts and the Python module “Tweepy”               and Le Monde Twitter feeds. After isolating the 30
(Roesslein, 2009). To access the tweets, a Twitter          adjective lemmas, 1,181 adjective occurrences
account was created with a corresponding Twitter            were analyzed. Roughly 82% of these single
website giving the user access to personalized              adjectives were found in a determiner phrase or
Application Program Interface (API) credentials.            noun phrase, a total of 949 adjective occurrences.
Modified from scripts posted on GitHub                      Some adjective lemmas were not attested within
(Yanofsky, 2013), roughly 6,000 tweets were                 the corpora. In the analysis of the data, variation in
downloaded from Twitter users Le Monde and                  placement, categorization of adjectives, adjective
MonsieurDream. Due to restrictions on tweet                 frequency, syllable length and origin of the corpora



were analyzed. The following analysis considers
the Le Monde and Cyprien datasets separately as              6.2   Adjective Categories
well as combined, to consider the implications of
Twitter as a usage context.                                  In consideration of the initial categories of the 30
  It is crucial to note that despite the 6,406 tweets        adjective lemmas, adjectives were analyzed within
data-mined, the resulting number of adjectives found in      their categorizations as a comparison point to
determiner phrases was lower than initially                  Wagner’s categories of Frequent, Colloquial, and
anticipated. It is therefore recognized that the findings    Scholarly, see Table 2 (Lightfoot, 1979). Of the
from this corpus must be considered a preliminary            Famous Alternators, most of these were
study that has recommendations for further                   prenominally placed, though many of this category
investigations.                                              were adjective lemmas that were unattested in the
                                                             corpus. Contemporary/Neologisms and Others
6.1   Placement Variation                                    displayed the greatest amount of variation in
                                                             placement within the corpus. Mirroring Wagner’s
Of adjectives found within the corpora, 8% of                results, High Frequency adjectives were placed
adjective lemmas in Cyprien tweets and 30% of                prenominally for the most part, on average 92% of
adjective lemmas in Le Monde’s tweets varied in              the adjective lemmas. Colors were exclusively found
placement (Figure 1).                                        in postnominal positions. This complements
   The lemmas that alternated in this corpus                 Wilmet’s findings, which demonstrated that “blanc,
demonstrated prenominal preference at an                     bleu, noir et rouge” were found postnominally
average of 80%/20% prenominal/postnominal                    97.4% of the time (1980; Thuilier, 2013). Thus,
split. The distribution of this variation ranged from        category of adjective was helpful for grouping
60% to 99.67% prenominal placement. These                    some adjectives, but not for all the categories of
findings are in line with Thuilier’s finding that            the study.
alternators are more inclined to be prenominally
placed (2013).
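
The per-lemma tallies behind figures such as the share of alternating lemmas and the prenominal/postnominal split can be sketched in Python as follows. This is only a minimal illustration: the annotated occurrences, the position labels `"pre"` and `"post"`, and the function names are assumptions made for the example, not the study's data or scripts.

```python
from collections import Counter, defaultdict

# Hypothetical annotated occurrences (NOT from the study):
# each pair is (adjective lemma, position relative to the noun).
occurrences = [
    ("petit", "pre"), ("petit", "pre"), ("petit", "pre"), ("petit", "post"),
    ("rouge", "post"), ("rouge", "post"),
    ("beau", "pre"),
]

def tally_by_lemma(pairs):
    """Count prenominal and postnominal occurrences for each lemma."""
    tallies = defaultdict(Counter)
    for lemma, position in pairs:
        tallies[lemma][position] += 1
    return tallies

def summarize(tallies):
    """Return each lemma's prenominal share and whether it alternates."""
    summary = {}
    for lemma, counts in tallies.items():
        total = counts["pre"] + counts["post"]
        summary[lemma] = {
            "prenominal_share": counts["pre"] / total,
            # A lemma alternates if it is attested in both positions.
            "alternates": counts["pre"] > 0 and counts["post"] > 0,
        }
    return summary

summary = summarize(tally_by_lemma(occurrences))
alternating = [lemma for lemma, s in summary.items() if s["alternates"]]
share_alternating = len(alternating) / len(summary)
print(alternating, f"{share_alternating:.0%} of attested lemmas alternate")
```

Counting by lemma rather than by raw occurrence keeps one very frequent adjective from dominating the reported proportion of alternators.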




   Figure 1: Raw Counts and Percentages of Tweets
                                                              Table 2: Combined Cyprien & Le Monde Data Sets

Trends of the placement of alternators differed
between the Cyprien and Le Monde corpora. Le                 6.3   Frequency
Monde’s adjectives displayed more variation in
placement whereas very few of Cyprien’s                      Adjectives used with highest frequency are
adjectives varied in placement. Supporting the               correlated with those that are prenominally placed.
initial hypothesis, adjective placement is more              Of the top 10 adjective lemmas in the Cyprien
rigid in Cyprien’s tweets than in Le Monde’s                 tweets, all were prenominal dominant at an average
tweets. The margin of difference in usage norms is           of 97.79% overall. Of the top 10 adjective lemmas
a significant finding of this study and would be a           in the Le Monde tweets, 70% were prenominal
strong starting point for future investigations that         dominant, see Table 3. The remaining three high
compare the Twitter feeds of news sources with the           frequency adjectives in the Le Monde tweets were
tweets of individuals.                                       exclusively postnominal: “politique,” “blanc,” and
                                                             “rouge.” As also seen in the above section,



adjectives describing colors are famously                    While two syllable adjectives display prenominal
postnominally placed (Grevisse & Goosse, 2011).              preference, this is not a major finding of this study.
Of the highest frequency adjectives from both
timelines, 6 out of 10 overlap between the Cyprien           6.5    Twitter Corpora Overall
and Le Monde tweets.
                                                             As seen in the variation of placement, Le Monde’s
                                                             adjective lemmas demonstrated greater flexibility
                                                             of placement than Cyprien’s, which supports the
                                                             hypothesis. Le Monde’s flexibility was more in
                                                             line with written corpora norms as seen in Thuilier,
                                                             while Cyprien’s follows more spoken norms of
                                                             restriction (2013).
                                                             As compared to the corpora in Table 1, section 2.8,
                                                             this Twitter corpus demonstrated significant
                                                             stratification at a 76%/31% prenominal split. From
                                                             a raw count of the 30 adjectives found in either
  Table 3: Combined Cyprien & Le Monde Data Sets             position within the corpora, Cyprien’s adjective
                                                             placement was more stratified than Le Monde’s
In the Cyprien corpus, there was a significant               placement, Table 4. When combining the counts in
divide between the high frequency adjectives                 both Twitter feeds, there was a strong tendency
prenominal dominance to low frequency                        toward prenominal placement: 504 prenominal to
postnominal dominance. The Le Monde corpus is                100 postnominal, 5:1 a ratio.
more nuanced in this separation of variables. In the
Le Monde corpus, 7 of the 10 high frequency
adjectives displayed prenominal dominance. For
the low frequency adjectives, 10 of 13 also
displayed prenominal preference, with only 3 that
were         exclusively        postnominal        in
placement.Overall, the Le Monde corpus has a
larger number of adjectives that alternate in                    Table 4: Raw Counts and Percentages of Tweets
placement than those that are exclusively                    It should be noted that the lemma “nouveau” was
postnominally placed, which leads to lower levels            removed from the raw count comparisons due to its
of stratification in frequency and placement. When           nature as an outlier. While the lemma posed no
the corpora were combined, the mixed nature of               problem as an example of categories, high
the Le Monde corpus was demonstrated.                        frequency, and syllable length, its high raw count
Frequency was strongly correlated with                       skews the data set. Within the Cyprien data set,
prenominal adjective placement in the Cyprien                “nouveau” was found 301 instances in the
corpus where it was highly stratified, but this trend        determiner phrase compared to all the other
was less significant in the Le Monde corpus.                 adjectives with values ranging from 0 to 59
                                                             occurrences. Due to the limited scope of this article
6.4   Syllable Length                                        “nouveau” was not analyzed independently, a case
                                                             for future study.
The results of this study suggest that syllable count
was not a significant factor in adjective placement.         7     Conclusions
Previous literature indicated that “short before
long” patterns were observed in larger data sets
                                                             Twitter corpora displayed adjective placement
(Thuilier, 2013), however these trends were not
                                                             norms that were more closely aligned with spoken
observed in these data, see Appendix B. The
                                                             corpora norms than written corpora norms when
average percentage for the average prenominal
                                                             compared to the patterns explored in Thuilier
placement for lemmas with one syllable was 60%,
                                                             (2013). As hypothesized, the Cyprien corpus was
two syllables at 82%, and three syllables at 61%.



                                                        13
more restricted in adjective placement alternation than the Le Monde corpus. In
the Le Monde tweets, 30% of the adjectives were found in both prenominal and
postnominal positions, much higher than the 8% alternating in the Cyprien
corpus. This suggests the greater flexibility of adjective placement in the Le
Monde corpus. Syllable count was not a significant factor in adjective
placement, though there was a tendency for adjectives of two syllables to be
prenominally placed. Frequency was a significant factor in adjective placement
and was highly correlated with prenominal placement in the Cyprien corpus.
  Limitations of this study include small final numbers of adjectives attested
per lemma. Despite the approximately 6,000-tweet sample size, some adjectives
occurred fewer than 5 times. In the future, larger data sets would be
preferred. In addition, the 30 adjectives chosen may have influenced the data
collected; in the future, different sampling methods could remedy this issue.
Overall, these findings provide preliminary insight into the French adjective
placement norms on Twitter.

7.1   Further Research

Sampling: Using a part-of-speech tagger, a larger analysis of all the
adjectives in the corpora could be conducted. Tweets could be mined over a
specific time interval rather than from user timelines. In addition, this study
showed the longitudinal trends of specific users. Further research could also
examine cross-sectional data of streams of tweets and isolate specific
collocations or adjective usage. Collocations of adjective-noun pairings and a
study of n-grams of the corpus could illuminate stylistic preferences of
Twitter users. Gender and other demographic data could be compiled to generate
studies that compare different types of Twitter users.
Nouveau: Further study might investigate specific adjective lemmas, such as
nouveau, for a microanalysis interpretation of the data.
Dialects and Languages: While it is clear that French is spoken around the
world, varieties of the language based on location were not chosen for this
study (Thuilier, 2013), but could be analyzed and compared with the language
norms of this corpus. Semantic shifts in placement were not analyzed within
this study, but could be an aspect of future work. French is not the only
language that displays adjective-noun prenominal and postnominal alternation;
Spanish does as well (Alexiadou et al., 2007; Delbecque, 1990). An analysis of
the adjectives in Spanish tweets, or in tweets in other languages with
comparable alternation trends, could be compared to this study. This study
provides a springboard for further adjective-noun Twitter corpora studies.

Appendix A
Full Combined Data Sets for Prenominal and Postnominal Placement of Adjectives
found in Determiner Phrases

Appendix B

Full Combined Data Sets of Syllable Counts

Acknowledgments
Thank you to the reviewers for their insightful comments and to Dr. Raúl
Aranovich, Dr. Susan Palmiter, and Alan Wong for their remarks on earlier
versions of this work.

References

Alexiadou, A., Haegeman, L., & Stavrou, M. (2007). Noun Phrase in the
  Generative Perspective. Noun Phrase in the Generative Perspective, 71, 1-664.
  doi: 10.1515/9783110207491
Benzitoun, C. (2014). The place of the attributive adjective in French: what we
  learn speech corpora. 4e Congres Mondial De Linguistique Francaise, 8, 16.
  doi: 10.1051/shsconf/20140801066
Boucher, P. (2006). Mapping Function to Form: Adjective Positions in French.
  Lingvisticae Investigationes, 29(1), 43-60.
Brault, G. J. (1978). The Song of Roland. University Park: Pennsylvania State
  University Press.
Brault, G. J. (2010). Song of Roland: An Analytical Edition: Introduction and
  Commentary: Pennsylvania State University Press.
Chomsky, N. (1986). Knowledge of Language: Its Nature, Origin, and Use:
  Praeger.
Cinque, G. (1994). On the evidence for partial N-movement in the Romance DP.
  Paths towards universal grammar, 85-110.
Cinque, G. (2010). The Syntax of Adjectives: A Comparative Study: MIT Press.
Cyprien. (2011, May 21, 2016). Cyprien répond à Cortex. Retrieved from
  https://www.youtube.com/watch?v=dKwzZZKIbUs
Delbecque, N. (1990). Word Order as a Reflection of Alternate Conceptual
  Construals in French and Spanish. Similarities and Divergences in Adjective
  Position. Cognitive Linguistics, 1(4), 349-416.
Grevisse, M., & Goosse, A. (2011). Le bon usage : grammaire française : 75 ans.
  Bruxelles; [Paris]: De Boeck-Duculot.
Iov, C. (2015a). Presse. Retrieved 30 May 2015, from
  http://www.cyprien.fr/index.php/presse/
Iov, C. (2015b). YouTube Channel: Cyprien.fr. Retrieved 30 May 2015, from
  https://www.youtube.com/user/MonsieurDream
Laenzlinger, C. (2005). French adjective ordering: perspectives on DP-internal
  movement types. Lingua, 115(5), 645-689. doi: 10.1016/j.lingua.2003.11.003
Ledgeway, A. (2012). From Latin to romance: morphosyntactic typology and
  change. Oxford; New York, NY: Oxford University Press.
Lightfoot, D. (1979). Principles of Diachronic Syntax: Cambridge University
  Press.
Lomicka, L., & Lord, G. (2012). A tale of tweets: Analyzing microblogging among
  language learners. System, 40(1), 48-63.
Matthews, P. H. (2007). Syntactic relations: A critical survey (Vol. 114):
  Cambridge University Press.
Popper, B. (2014). Twitter Now Has 255 Million Users, but Activity Has Dropped
  Year over Year. Retrieved 20 Mar. 2015, from
  http://www.theverge.com/2014/4/29/5665752/twitter-q1-2014-earnings
Prévost, P. (2009). The Acquisition of French: The Development of Inflectional
  Morphology and Syntax in L1 Acquisition, Bilingualism, and L2 Acquisition:
  John Benjamins Publishing Company.
Roesslein, J. (2009). Tweepy Documentation. Retrieved from
  http://docs.tweepy.org/en/v3.2.0/
Sinclair, J., & Carter, R. (2004). Trust the text: language, corpus and
  discourse. London; New York, N.Y.: Routledge.
Sleeman, P., Van de Velde, F., & Perridon, H. (2014). Adjectives in Germanic
  and Romance: John Benjamins Publishing Company.
Smith, C. (2015, 20 May. 2015). By the Numbers: 150 Amazing Twitter Statistics
  (May 2015). Retrieved 30 May 2015, from
  http://expandedramblings.com/index.php/march-2013-by-the-numbers-a-few-amazing-twitter-stats/10/
Socialbakers. (2015). Most Popular Twitter Accounts in France. Twitter
  Statistics for France. Retrieved 30 May 2015, from
  http://www.socialbakers.com/statistics/twitter/profiles/france/
Song, J. J. (2012). Word order: Cambridge University Press.
Statistica. (2013). Facebook, Twitter and Google Penetration 2009-2013 |
  France. Retrieved 30 May 2015, from
  s/france/
Statistica. (2015). Social Networks: Global Sites Ranked by Users 2015. Leading
  Social Networks Worldwide as of March 2015, Ranked by Number of Active Users
  (in Millions). Retrieved 30 May 2015, from
  http://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/
Stavrou, M. (2012). The syntax of adjectives: A comparative study. Language,
  88(2), 419-423. doi: 10.1353/lan.2012.0024
Thuilier, J. (2013). Syntaxe du français parlé vs. écrit : le cas de la
  position de l’adjectif épithète par rapport au nom. TIPA.
Twitter. (2015a). About Twitter, Inc. Retrieved 30 May 2015, from
  https://about.twitter.com/company
Twitter. (2015b). Help Center: What's a Twitter timeline? Retrieved 04 Jun
  2015, from https://support.twitter.com/articles/164083-what-s-a-twitter-timeline
Twitter. (2015c). Le Monde.fr Twitter. Retrieved 30 May 2015, from
  https://twitter.com/lemondefr
Twitter. (2015d). Monsieur Dream Twitter. Retrieved 30 May 2015, from
  https://twitter.com/MonsieurDream
Twitter statistics for France. (2015). Retrieved 20 March 2015, from
Wilmet, M. (1980). Anteposition and Postposition of Qualificative Epithets in
  Contemporary French. Travaux de Linguistique, 7, 179-201.
Yanofsky. (2013, 1 Nov. 2013). A Script to Download All of a User's Tweets into
  a Csv. Yanofsky/tweet_dumper.py. Retrieved 30 May 2015, from
  https://gist.github.com/yanofsky/5436496