=Paper= {{Paper |id=Vol-2473/paper37 |storemode=property |title=Correlation of Perceived Fluency with Phonetic Measures of Speech Rate and Pausing |pdfUrl=https://ceur-ws.org/Vol-2473/paper37.pdf |volume=Vol-2473 |authors=Peter Kleman,Štefan Beňuš |dblpUrl=https://dblp.org/rec/conf/itat/KlemanB19 }} ==Correlation of Perceived Fluency with Phonetic Measures of Speech Rate and Pausing== https://ceur-ws.org/Vol-2473/paper37.pdf
       Correlation of perceived fluency with phonetic measures of speech rate and
                                        pausing
                                                             Peter Kleman
                                             Department of English and American studies
                                                        Faculty of Philosophy
                                                Constantine the Philosopher University
                                           Štefánikova trieda 38/67, Nitra, 949 10, Slovakia
                                                         peter.kleman@ukf.sk

                                                             Štefan Beňuš
                                             Department of English and American studies
                                                        Faculty of Philosophy
                                                Constantine the Philosopher University
                                           Štefánikova trieda 38/67, Nitra, 949 10, Slovakia

                                     Institute of Informatics of the Slovak Academy of Sciences
                                            Dúbravská cesta 9, 841 04 Bratislava, Slovakia
                                                            sbenus@ukf.sk

Abstract. The paper studies the relationship between perceived fluency of L2 semi-spontaneous utterances and phonetic measures such as speech rate and the number of pauses. The data for the correlation analysis comes from a word guessing experiment conducted with Slovaks speaking English. Subjects provided cues for target words intended to facilitate the correct guessing of those words. In the second phase, speakers were asked to guess the words to which the interlocutors were providing cues. The guessers were also asked to evaluate the fluency of the interlocutors for each of the words that they were guessing. The data from the recordings is analysed through a correlation analysis of the phonetic measures extracted from the acoustic signal and the level of perceived fluency that was elicited for each target word. The study found that phonetic measures do correlate with the levels of perceived fluency. The findings may be used for improvements in automated computer-assisted fluency assessment.

1     Introduction

   Studying the relationship between fluency and phonetic measures is useful for understanding how humans perceive the fluency of their peers, and it aids the development of algorithms and programs that measure fluency automatically. Such technological advances will be useful in the coming age of intelligent self-learning computers that will be able to understand, evaluate, and perhaps even study human languages.
   De Jong and Wempe conducted a study in 2009 [1] using PRAAT to automatically detect syllable nuclei in order to measure speech rate. The data used in the study came from experiments performed by 8 participants with tasks such as reading aloud syllable lists and informal storytelling. They conducted a correlation analysis comparing the automatically obtained syllable counts with human syllable counts for the same data. The study concluded that automatic syllable counts could reliably be used to assess and compare speech rates.
   Kallio, Suni, Virkkunen, and Šimko conducted a study in 2018 [2] on whether prosodic prominence levels of syllables could be used to predict the prosodic competence of L2 speakers of Swedish. They used a continuous wavelet transformation analysis of syllable prominence with combinations of f0, energy, and duration features. The data for the test was gathered from a larger corpus created during a computer-aided oral test. They manually annotated the data at the syllable level and measured f0 using PRAAT. This data was assessed using wavelet transformation analysis. A second set of assessments was gathered from expert raters. The results showed that the wavelet-based assessments correlated with the assessments of the expert raters. This provided strong support for the future use of wavelet-based prominence estimation in automatic assessment of L2 proficiency.
   Ramanarayanan, Lange, and Evanini studied the human and automated scoring of fluency, pronunciation, and intonation [3]. They collected interactions of L2 speakers of English and used both human raters and machine learning to create scores for each of the aspects. The study showed that trained scoring models were generally on par with human raters’ scores.
   Therefore, for such automated assessments we need two separate sets of data. The first set consists of subjective data gathered from evaluations of fluency provided by subjects [4]. The second set consists of phonetic measures that were previously studied and had their importance assessed [5]. Such an approach to data gathering was also used in the present study. With an increased volume of such data available, the algorithms can be improved to incorporate more measures that aid the computer in better assessing various aspects of human speech.
   The aim of the study was to search for a statistically significant correlation between perceived fluency and phonetic measures. This was first studied across the data from all the speakers in one group. Secondly, the assessors were also divided into groups of the same proficiency level. We expected that the correlation would be stronger with all subjects taken into account as assessors, as opposed to only assessors of the same proficiency group. The rationale behind this statement is that the more varied points of view we have on assessment, the better the correlation results will be. This was also meant to avoid the extremes that were predicted to come up in the analyses.

Copyright ©2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
1.1 Definitions

   For a number of L2 speakers of English, fluency seems to be an elusive language feature that they can never quite master. Various disfluencies can have an impact on the speech of a person, both native and non-native, as previously demonstrated in research [4]. Previous research in fluency provides several definitions of what fluency actually is [3, 7, 8, 9, 10, 11, 12], but there does not seem to be a definition that is accepted by all. In general, fluency is considered to be the overall proficiency of a speaker who uses a language at a high level [13, 14, 15]. The same general definition can be used for L2 fluency as well. Fluency has also been used as an umbrella term, divided into a broad sense and a narrow sense of fluency [16]. The broad sense shares a similar definition to the one previously mentioned, while the narrow sense of fluency refers only to the speed and smoothness of delivery.
   Perceived fluency is defined as “inferences listeners make about a speaker’s cognitive fluency based on their perception of utterance fluency” [17]. This aspect of fluency was important for the creation of the experiment, since it provided an understanding of how subjective fluency is perceived and what constitutes fluent speech in the narrow sense that can be used for analysis. The analysis of perceived fluency and phonetic measures is a new direction for the automated assessment of fluency.

2    Methodology

   Two previously mentioned ideas [16, 17] were joined in the creation of the current study. Smoothness was represented by the frequency and length of pauses, and speed by words per second and the overall wordcount. Perceived fluency [17] was used as a subjective measure that was collected from subjects in the experiment.
   The basis for the study was a semi-spontaneous word guessing experiment conducted on 13 L2 speakers of English with proficiency levels of C1, B2, and B1. The experiment was divided into two phases. In the first phase the subjects were tasked with creating cues for a set of provided words. These words were randomly chosen from the British National Corpus with the criteria of being at most three syllables long and being either a noun, a verb, or an adjective. Each speaker was given a set of ten words and was asked to create two cues for each word. They were asked not to use the words that they were hinting at. The cues that they provided were recorded and concatenated into a single recording for each of the speakers. These recordings always consisted of the first cue for the word, a three-second pause provided for the guessers as thinking space, then the second cue for the word, followed by another three-second pause.
   The recordings processed in this way were used in phase two, where the subjects were asked to try and guess the words to which the interlocutors were providing cues. Each subject listened to the recordings of all other subjects. They were asked to listen to the cues and try to guess the word that the interlocutor was providing the cues for. The success of guesses was recorded for future use. After the subjects listened to both of the cues for each word, they were asked to evaluate the fluency level of the interlocutors on a scale of 1 to 7. Since all subjects were naïve assessors, they were mainly asked to focus on guessing the words from cues. They were asked to provide spontaneous assessments of fluency. The experimenter marked the perceived fluency assessments for each of the words. Each of the subjects provided 10 assessments for each of the other 12 speakers, resulting in a data set of 120 assessments for each subject.

2.1 Data processing

   In the data processing, the recordings from phase one were labelled using PRAAT speech analysis software. Each recording was annotated in three tiers. The first was the cue tier, in which the cues were labelled from their beginning to their end. The second was the word tier, where each of the words was labelled from its beginning to its end. The third was the pause tier, where each of the pauses was labelled from its beginning to its end.
   A Praat script was then used to extract from these annotations the number of words in each cue and their length, and also the number of inside-cue pauses and their length. The data was transferred into an Excel sheet, where words per second was calculated as the sum of words in both cues divided by the sum of the word durations and the inside-cue pause durations in both cues. The overall wordcount was calculated as the sum of words in both cues. The overall pause count was calculated as the sum of inside-cue pauses in both cues. Lastly, the overall duration of pauses was calculated as the sum of inside-cue pause durations in both cues. The levels of perceived fluency, as evaluated by each of the subjects, were also added to each word.
   The first data set was created from the evaluations of fluency that were provided by subjects during the word guessing experiment. The second set of data consisted of four different phonetic measures that were chosen for the correlation analysis in relation to the evaluated levels of fluency. These measures are words per second, wordcount, length of pauses, and the number of pauses. Such a pair of data is referred to as an objective-subjective pair or subjective-objective approach [18]. The measures were used as an objective means of assessing fluency in relation to the subjective evaluations of perceived fluency that were provided by the participants while listening to cues from their peers.
   The research examined the correlation of perceived fluency and phonetic measures analysed in the recording data from phase one. The average level of perceived fluency was calculated for each of the words from the normalised fluency evaluations in the following way. The data was laid out as a table in which the perceived fluency evaluations from each assessor formed a column. Each of the cue pairs had an original evaluation value of one to seven and was represented as a row. In order to normalise the data, we took each of the evaluations and subtracted from it the minimum score that the assessor provided in their entire column. This number was divided by the difference between the maximum and the minimum of that column. The result was a number between 0 and 1, where 0 represented the lowest score provided by the assessor and 1 the highest score.
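
The measures and the normalisation described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Praat script or Excel workbook: the function names, the list-based data layout, and the handling of an assessor who gave the same score everywhere are all hypothetical.

```python
from statistics import mean


def words_per_second(word_counts, word_durations, pause_durations):
    """Speech rate for one target word, pooled over both cues.

    Each argument holds one entry per cue; durations are in seconds.
    Assumption: the denominator is word time plus inside-cue pause time.
    """
    total_time = sum(word_durations) + sum(pause_durations)
    return sum(word_counts) / total_time


def min_max_normalise(scores):
    """Rescale one assessor's 1-7 scores to [0, 1] using their own min and max."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption (not in the paper): a constant column is mapped to 0.0
        # to avoid dividing by zero.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def mean_perceived_fluency(ratings_by_assessor):
    """Average the normalised scores per item (row) across assessors (columns)."""
    normalised = [min_max_normalise(column) for column in ratings_by_assessor]
    return [mean(row) for row in zip(*normalised)]


# Hypothetical data: two cues for one word, then two assessors rating three items.
rate = words_per_second([5, 7], [2.0, 3.0], [0.5, 0.5])
fluency = mean_perceived_fluency([[1, 4, 7], [3, 5, 7]])
```

Normalising each column separately means that an assessor who only ever used the scores 3 to 7 and one who used the full 1 to 7 range contribute on the same 0 to 1 scale before averaging.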

2.2 Data analysis

   The correlation of the data was studied in four cases by calculating the Pearson correlation coefficient, and also with a multiple linear regression. Each pair for the calculation of the Pearson correlation coefficient consisted of the perceived fluency evaluation and a phonetic measure. The first pair used words per second as the independent variable, the second used wordcount, the third used the number of pauses, and the fourth used the total duration of inside-cue pauses.

3    Results
3.1 Results for all speakers

   As mentioned before, four pairs of data sets were created for the calculation of the Pearson correlation coefficient. In the first pair of data sets, which consisted of words per second and perceived fluency, a Pearson r was computed to assess the relationship between perceived fluency and words per second. We found a significant positive relationship (r = 0.574, p < 0.001). The relationship between the two variables is visualised in a scatterplot shown in Fig. 1.

      Fig. 1. Correlation data for words per second and perceived fluency

   In the second pair of data sets, which consisted of the wordcount in both cues per word and perceived fluency, a Pearson r was computed to assess the relationship between perceived fluency and wordcount. We found a significant positive relationship (r = 0.316, p < 0.001). The data sets are visualised in a scatterplot shown in Fig. 2.

      Fig. 2. Correlation data for wordcount and perceived fluency

   In the third pair of data sets, which consisted of the sum of the number of inside-cue pauses and perceived fluency, a Pearson r was computed to assess the relationship between perceived fluency and total pause count. We did not find a significant relationship, which suggests that the pair does not correlate (r = -0.098, p < 0.579). The data is visualised in a scatterplot shown in Fig. 3.

      Fig. 3. Correlation data for the number of pauses and perceived fluency

   In the fourth pair of data sets, which consisted of the total duration of pauses inside both cues per word and perceived fluency, a Pearson r was computed to assess the relationship between perceived fluency and total pause duration. We found a significant negative relationship (r = -0.479, p < 0.001). The data is visualised in Fig. 4.
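
As a reading aid for the correlations reported above, the following sketch computes a Pearson r from two paired lists and converts it into the t statistic that underlies its significance test. It is a minimal standard-library illustration, not the authors' analysis code; the function names are hypothetical, and the sample size of 130 used in the example is an assumption inferred from the degrees of freedom of the regression reported later in the paper (F(3,126)).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical toy data: a perfectly linear relationship.
r_toy = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# t value for the reported words-per-second correlation, assuming n = 130.
t_wps = t_statistic(0.574, 130)
```

The p-value is then read from a t distribution with n - 2 degrees of freedom; a statistics package such as SciPy (`scipy.stats.pearsonr`) returns r and the p-value together.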
      Fig. 4. Correlation data for the total inside-cue pause duration and perceived fluency

   A multiple linear regression was calculated to predict perceived fluency based on words per second, wordcount, and pause duration. Pause count was omitted, as it did not seem to have an effect on perceived fluency based on the correlation result above. A significant regression model was found (F(3,126) = 39.333, p < 0.001), with an R² of 0.484. The model is shown in Table 1. Predicted perceived fluency increased by 0.068 for each word per second and by 0.013 for each word, and decreased by 0.046 for each second of total pause duration. The coefficients in the table correspond to the phonetic measures that were used, and the intercept is the predicted perceived fluency when all three measures are zero. All three measures were significant predictors of perceived fluency.

        Table 1. Results of multiple linear regression calculations

        R Square    0.484

                    Coef     t Stat   P-value
        Intercept   0.243    3.076    0.003
        wps         0.068    2.195    0.030
        wordcount   0.013    5.921    < 0.001
        icp_dur    -0.046   -4.322    < 0.001

        (wps: words per second; icp_dur: total inside-cue pause duration)

3.2 Results for each proficiency group

   The data was then divided into three proficiency groups and was again analysed using the Pearson correlation coefficient and multiple linear regression. This was done in order to study which phonetic measures influence the relationship between produced and perceived fluency in each of the proficiency groups. Three groups were created, consisting of only C1 level speakers, only B2 level speakers, and only B1 level speakers. All the assessments made by the speakers in a group were taken into account and a new value for perceived fluency was calculated from their evaluations.

3.2.1 Level C1

   We first report the results for the group of C1 assessors. Four Pearson r values were computed to assess the relationship between the four data pairs. In this group, only the perceived fluency values of the C1 subjects were taken into account. The Pearson r values were also tested for statistical significance with a p-value. This data is shown in Table 2.

        Table 2. Pearson r results for group C1

        Pair         r        p-value
        R_wps_pf     0.500    p < 0.001
        R_wc_pf      0.339    p < 0.001
        R_icpc_pf   -0.069    p < 0.437
        R_icpd_pf   -0.410    p < 0.001

        (wps: words per second; wc: wordcount; icpc: inside-cue pause count; icpd: inside-cue pause duration; pf: perceived fluency)

   In the first pair of data sets, which consisted of words per second and perceived fluency, the Pearson r suggests a significant positive relationship (r = 0.500, p < 0.001).
   In the second pair of data sets, which consisted of the wordcount in both cues per word and perceived fluency, the Pearson r suggests a significant positive relationship (r = 0.339, p < 0.001).
   In the third pair of data sets, which consisted of the total number of inside-cue pauses and perceived fluency, the Pearson r suggests no significant relationship (r = -0.069, p < 0.437).
   In the fourth pair of data sets, which consisted of the total duration of pauses inside both cues per word and perceived fluency, the Pearson r suggests a significant negative relationship (r = -0.410, p < 0.001).
   A multiple linear regression was calculated to predict perceived fluency based on words per second, wordcount, and pause duration. Pause count was omitted, as it did not seem to have an effect on perceived fluency based on the correlation result above. A significant regression model was found (F(3,126) = 29.793, p < 0.001), with an R² of 0.415. The model is shown in Table 3. Predicted perceived fluency increased by 0.049 for each word per second and by 0.014 for each word, and decreased by 0.046 for each second of total pause duration. The coefficients in the table correspond to the phonetic measures that were used, and the intercept is the predicted perceived fluency when all three measures are zero. Wordcount and pause duration were significant predictors of perceived fluency, whereas words per second did not reach significance in this model (p = 0.152, Table 3).

        Table 3. Results of multiple linear regression calculation in group C1

        R Square    0.415

                    Coef     t Stat   P-value
        Intercept   0.238    2.772    0.006
        wps         0.049    1.441    0.152
        wordcount   0.014    5.856    < 0.001
        icp_dur    -0.046   -3.936    < 0.001

3.2.2 Level B2

   The second set of analyses was conducted on the B2 group. The results for the group are shown in the tables below and they consist of four Pearson r values, which were computed to assess the relationship between the data pairs. In this group, only the perceived fluency values of the B2 subjects were taken into account. The Pearson r values were also
measured for their statistical significance. This data is shown in Table 4.

        Table 4. Pearson r results for group B2

        Pair         r        p-value
        R_wps_pf     0.487    p < 0.001
        R_wc_pf      0.257    p < 0.003
        R_icpc_pf    0.019    p < 0.828
        R_icpd_pf   -0.408    p < 0.001

   In the first pair of data sets, which consisted of words per second and perceived fluency, the Pearson r suggests a significant positive relationship (r = 0.487, p < 0.001).
   In the second pair of data sets, which consisted of the wordcount in both cues per word and perceived fluency, the Pearson r suggests a significant positive relationship (r = 0.257, p < 0.003).
   In the third pair of data sets, which consisted of the total number of inside-cue pauses and perceived fluency, the Pearson r suggests no significant relationship (r = 0.019, p < 0.828).
   In the fourth pair of data sets, which consisted of the total duration of pauses inside both cues per word and perceived fluency, the Pearson r suggests a significant negative relationship (r = -0.408, p < 0.001).
   A multiple linear regression was calculated to predict perceived fluency based on words per second, wordcount, and pause duration. Pause count was omitted, as it did not seem to have an effect on perceived fluency based on the correlation result above. A significant regression model was found (F(3,126) = 21.742, p < 0.001), with an R² of 0.341. The model is shown in Table 5. Predicted perceived fluency increased by 0.071 for each word per second and by 0.013 for each word, and decreased by 0.046 for each second of total pause duration. The coefficients in the table correspond to the phonetic measures that were used, and the intercept is the predicted perceived fluency when all three measures are zero. Wordcount and pause duration were significant predictors of perceived fluency, whereas words per second did not reach significance at the 0.05 level in this model (p = 0.092, Table 5).

        Table 5. Results of multiple linear regression calculation in group B2

        R Square    0.341

                    Coef     t Stat   P-value
        Intercept   0.276    2.598    0.011
        wps         0.071    1.697    0.092
        wordcount   0.013    4.284    < 0.001
        icp_dur    -0.046   -3.192    0.002

3.2.3 Level B1

   The final group of assessors is the B1 group. The relationship between the four data pairs was assessed with four Pearson r values. These values were also tested for statistical significance with a p-value. All the data belonging to the B1 group is shown in Table 6.

        Table 6. Pearson r results for group B1

        Pair         r        p-value
        R_wps_pf     0.579    p < 0.001
        R_wc_pf      0.246    p < 0.003
        R_icpc_pf   -0.104    p < 0.309
        R_icpd_pf   -0.505    p < 0.001

   In the first pair of data sets, which consisted of words per second and perceived fluency, the Pearson r suggests a significant positive relationship (r = 0.579, p < 0.001).
   In the second pair of data sets, which consisted of the wordcount in both cues per word and perceived fluency, the Pearson r suggests a significant positive relationship (r = 0.246, p < 0.003).
   In the third pair of data sets, which consisted of the total number of inside-cue pauses and perceived fluency, the Pearson r suggests no significant relationship (r = -0.104, p < 0.309).
   In the fourth pair of data sets, which consisted of the total duration of pauses inside both cues per word and perceived fluency, the Pearson r suggests a significant negative relationship (r = -0.505, p < 0.001).
   A multiple linear regression was calculated to predict perceived fluency based on words per second, wordcount, and pause duration. Pause count was omitted, as it did not seem to have an effect on perceived fluency based on the correlation result above. A significant regression model was found (F(3,126) = 35.438, p < 0.001), with an R² of 0.458. The model is shown in Table 7. Predicted perceived fluency increased by 0.079 for each word per second and by 0.013 for each word, and decreased by 0.050 for each second of total pause duration. All three measures were significant predictors of perceived fluency.

        Table 7. Results of multiple linear regression calculation in group B1

        R Square    0.458

                    Coef     t Stat   P-value
        Intercept   0.239    2.697    0.008
        wps         0.079    2.257    0.026
        wordcount   0.013    5.032    < 0.001
        icp_dur    -0.050   -4.152    < 0.001
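
To illustrate how the regression tables in this section can be read, the sketch below plugs the Table 1 coefficients (all assessors) into the linear model and recomputes the overall F statistic from R². It is a hedged illustration rather than the authors' analysis: the function names and the example input values are hypothetical, and the sample size of 130 is an assumption inferred from the reported degrees of freedom, F(3,126).

```python
# Coefficients copied from Table 1 (all assessors).
TABLE_1 = {"intercept": 0.243, "wps": 0.068, "wordcount": 0.013, "icp_dur": -0.046}


def predict_fluency(wps, wordcount, icp_dur, coef=TABLE_1):
    """Predicted normalised perceived fluency for one target word."""
    return (
        coef["intercept"]
        + coef["wps"] * wps
        + coef["wordcount"] * wordcount
        + coef["icp_dur"] * icp_dur
    )


def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F value of a linear regression, computed from R squared."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return (r_squared / df_model) / ((1 - r_squared) / df_resid)


# Hypothetical cue pair: 2 words per second, 20 words, 1 second of pauses.
example = predict_fluency(wps=2.0, wordcount=20, icp_dur=1.0)

# Consistency check against the reported model, assuming n = 130 observations.
f_all = f_statistic(0.484, 130, 3)
```

Because R² is reported to three decimals, the recomputed F value only approximates the published 39.333; the same holds for the group-level models.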
4    Discussion                                                   speech”, grant No. 2/0161/18 and also by University Grant
                                                                  Agency UGA “Manipulation of acoustic signal of speech
   In this study we set out to search for a statistically         for improvement of fluency in a foreign language and
significant correlation between perceived fluency and             targeted reduction of mother tongue interference”, grant
phonetic measures that would be observable across the data        No. I-19-208-02.
from all the speakers and also in groups, which consist of
assessors of the same proficiency level. We expected that
the correlation should be better with all subjects taken into     References
account as assessors, as opposed to only using assessors of
certain proficiency groups. The rationale behind this             [1] N.H. De Jong and T. Wempe, “Praat script to detect
statement is that the more varied points of view we have on            syllable nuclei and measure speech rate
assessment, the more accurate the results will be.                     automatically,” Behavior Research Methods 41, pp.
   The study found some of the phonetic measures seemed                385-390, 2009.
to correlate with perceived fluency much more in simple           [2] H. Kallio, A. Suni, P. Virkkunen, and J. Simko,
pair tests. One such measure is words per second. If we                “Prominence-based evaluation of L2 prosody,”
                                                                       Interspeech 2018, pp. 1838-1842, 2018.
look purely at its relationship to perceived fluency, we see
                                                                  [3] V. Ramanarayanan, P. Lange, K. Evanini, H. Molloy,
a moderately high positive correlation. However, this did              and D. Suendermann-Oeft, “Human and automated
not seem right, since such analysis did not take into account          scoring of fluency, pronunciation and intonation
the relation with the other measures. The pause count                  during human–machine spoken dialog interactions,”
showed no significant relationship. This could probably be             Interspeech 2017, pp. 1711–1715, 2017.
caused, because the subjects were mainly tasked with              [4] H. R. Bosker, H. Quené, T. Sanders, and N. H. Jong, .
guessing a word from the cues. Since they were probably                “The perception of fluency in native and non-native
more focused on the message, the number of pauses did not              speech,” Language Learning, 64 (3), pp. 579-614,
seem to play a role. They started noticing the pauses only             2014.
                                                                  [5] J. Kormos and M. Dénes, “Exploring measures and
when their duration was too long.
                                                                       perceptions of fluency in the speech of second
   Even though the Pearson r showed a lesser correlation in            language learners,” System 32 (2), pp. 145-164, 2004.
the initial analyses, this changed after a linear regression      [6] T. Rasinski, “The Fluent Reader: Oral Reading
analysis was used. This analysis took into account all the             Strategies for Building Word Recognition, Fluency,
data necessary for the correlation analysis. This means that           and Comprehension,” Scholastic Inc., 2003.
it measured the significance of all the measures in relation      [7] A. Hasselgreen, “Testing the Spoken English of
to perceived fluency at the same time and not only in                  Young Norwegians: A Study of Testing Validity and
individual pairs. The results of this analysis showed a                the Role of Smallwords in Contributing to Pupils'
different picture of the measure significance. The most                Fluency,” Cambridge University Press, 2005.
prominent became the wordcount with its positive                  [8] Z. Breznitz, “Fluency in Reading: Synchronization of
relationship, the second was the duration of pauses with a             Processes,” Routledge, 2006.
                                                                  [9] J. B. Gilquin and S. De Cock, “Errors and
negative relationship, and words per second were third with
                                                                       Disfluencies in Spoken Corpora,” John Benjamins
a positive relationship.                                               Publishing, 2013.
   The same ordering of measures was also observed in the         [10] A. Khateb and I. Bar-Kochva, “Reading Fluency:
group phase of analyses. The speakers were divided into                Current Insights from Neurocognitive Research and
groups based on their proficiency levels. In these groups              Intervention Studies,” Springer, 2016.
only their fluency assessments were taken into account. We        [11] P. Kendale, “WORKBOOK for Spoken English
saw a change in the strength of correlation of all the pairs in        Fluency Development – 4,” Independently Published,
all the groups. This means that pair one, which is the words           2017.
per second and perceived fluency pair, had a completely           [12] L. Wang, J. Zhang, F. Pan, B. Dong, and Y. Yan,
                                                                       “Automatic Fluency Assessment of Non/native
different value in all the pairs. This difference is easily
                                                                       English Reading,” Journal of Convergence
observed between the B2 pair one r = 0.487 and B1 pair                 Information Technology 7, pp. 636-642, 2012.
one r = 0.579. Such differences were observed across all          [13] P. Lennon, “Investigating fluency in EFL: A
the pairs and suggest that each different proficiency level            quantitative approach,” Language Learning, vol. 40,
evaluates fluency based on different criteria.                         pp. 387-417, 1990.
   The study showed that the best correlating data was            [14] H. Riggenbach, “Toward an understanding of fluency:
observed, when all speaker were used as assessors. This                A      microanalysis     of     non-native     speaker
suggests that the before mentioned differences in pair                 conversations,” Discourse Processes, vol. 14, pp. 423-
correlations are equalized. This offers a better correlation           441, 1991.
                                                                  [15] J. Kormos, “Speech production and second language
analysis partially also because of the higher number of
                                                                       acquisition,” Lawrence Erlbaum Associates, 2006.
assessors.                                                        [16] P. Lennon, “The lexical element in spoken second
                                                                       language fluency,” In H. Riggenbach (Ed.),
Acknowledgment                                                         Perspectives on fluency Ann Arbor, University of
                                                                       Michigan Press, pp. 25-42, 2000.
                                                                  [17] N. Segalowitz, “Cognitive bases of second language
  This work was funded by the Slovak Scientific Grant                  fluency,” New York: Routledge, 2010.
Agency VEGA “Automatic assessment of acute stress from
[18] N. H. De Jong et al., “Facets of Speaking Proficiency,” Studies in
     Second Language Acquisition, vol. 34 (1), pp. 5-34, 2010.