=Paper=
{{Paper
|id=Vol-3224/Cpaper20
|storemode=property
|title=Plugin for automatisation of phonetic-phonological analysis and obtaining analytical feedback for Spanish learners
|pdfUrl=https://ceur-ws.org/Vol-3224/paper20.pdf
|volume=Vol-3224
|authors=Tamara Couto Fernández,Albina Sarymsakova,Nelly Condori-Fernández,Patricia Martín-Rodilla
|dblpUrl=https://dblp.org/rec/conf/sepln/FernandezSCM22
}}
==Plugin for automatisation of phonetic-phonological analysis and obtaining analytical feedback for Spanish learners==
<pdf width="1500px">https://ceur-ws.org/Vol-3224/paper20.pdf</pdf>
<pre>
Plugin for automatisation of phonetic-phonological analysis and
obtaining analytical feedback for Spanish learners
Plugin para la automatización del análisis fonético-fonológico y la obtención de
retroalimentación analítica para estudiantes de español

Tamara Couto-Fernández (1), Albina Sarymsakova (2), Nelly Condori-Fernández (3) and
Patricia Martín-Rodilla (4)
(1, 3, 4) University of A Coruña, Faculty of Computer Science, Camiño do Lagar de Castro, 6, A Coruña, 15008, Spain
(2) University of A Coruña, Faculty of Philology, Campus da Zapateira, A Coruña, 15008, Spain


                   Abstract
                   We present the Plugin for phonetic-phonological analysis in Spanish (PAFe),
                   which consists of a series of scripts written in Python that implement three
                   different algorithms for comparing the intonation of an ELE (Spanish as a
                   foreign language) student with that of a native speaker of Spanish. These
                   algorithms allow, in turn, three different types of analysis: global, tonal
                   tendency and intersyllabic. In addition, PAFe has a database to keep a
                   history of different types of data (user profile, pronunciation exercises and
                   audios) and a graphical interface to include reports on pronunciation
                   evolution in Praat, a tool for acoustic analysis. PAFe is a software solution
                   that adds new functionalities to Praat and allows the following: (i)
                   performing a comparative analysis between the intonational patterns of an ELE
                   student and a native speaker; (ii) reporting the evolution of the acquisition
                   of such patterns in Spanish thanks to the history of the stored data. In this
                   way, automated feedback is provided to both students and teachers.
                   Keywords
                   Praat, intonation analysis, ICT, Python.


1. Introduction

   The present work is framed in the area of natural language processing,
specifically in the comparative-contrastive analysis of intonation for
didactic purposes, as provided by our original tool PAFe. Some tools already
exist, such as the proposal by Oplustil and Toledo [11] or the study by Strik,
Truong, Wet and Cucchiarini [8], which offer results of phonetic-phonological
similarity or detect errors made in pronunciation. Nonetheless, no tool
provides both facilities at the same time, nor offers a way to monitor the
evolution of the students.
   For this reason, we have decided to develop a system that complements
language teaching, in particular, one that can be used remotely or in hybrid
modalities.
   Our tool offers the functionality to perform an instant comparative
analysis of a student's pronunciation, taking the speech of a native speaker
as a reference, and to observe the evolution of that pronunciation through the
data stored in the history.

SEPLN-PD 2022. Annual Conference of the Spanish Association for
Natural Language Processing 2022: Projects and Demonstrations,
September 21-23, 2022, A Coruña, Spain
EMAIL: albina.sarymsakova@udc.es (A. Sarymsakova);
tamara.cfernandez@udc.es (T. Couto-Fernández);
n.condori.fernandez@udc.es (N. Condori-Fernández);
patricia.martin.rodilla@udc.es (P. Martín-Rodilla)
ORCID: 0000-0003-0381-0239 (A. Sarymsakova);
0000-0002-1044-3871 (N. Condori-Fernández);
0000-0002-1540-883X (P. Martín-Rodilla)
               ©️ 2020 Copyright for this paper by its authors. Use permitted under Creative
               Commons License Attribution 4.0 International (CC BY 4.0).
               CEUR Workshop Proceedings (CEUR-WS.org)


   For the development of our plugin, several technologies have been used to
support the work done, such as Praat, Python and PostgreSQL.

2. Methodology

    We designed our work around the following essential principles of
intonation analysis:
    1. We annotate the syllables of each speech act in a Praat TextGrid
    (Boersma and Weenink [1]); we identify the pitch values of all vowels in
    the syllables (voiced consonants are measured as well), using the Praat
    script developed by Mateo Ruiz [9, 10], which extracts the absolute values
    in Hz, relativises them and draws the standardised melody graph;
    2. We discriminate relevant frequency values between tonal segments from
    irrelevant values; according to Cantero Serena [2, 3] and Font-Rotchés and
    Cantero Serena [6, 7], a difference of less than 10% between segments is
    considered imperceptible.
    Once we have obtained the relevant data from the intonation analysis, we
move on to the PAFe architecture.
    Our project develops an extension to an existing desktop application for
acoustic speech analysis: Praat. Therefore, we start from an already developed
architecture to which a new module (PAFe) is coupled (Figure 1), consisting of
Praat scripts, Python code and a PostgreSQL database.

Figure 1: Overall architecture of PAFe

    Praat, through its scripting, allows command line calls to other systems,
as described by Dragos-Paul Pop [5], thus making it possible to extend the
application through the use of other languages and technologies external to
Praat. This new module (PAFe) communicates with the original system by means
of new Praat scripts that are associated with the application's menu items
(see Figure 2), from which these files are executed. Sometimes, the new module
dispenses with calls to Praat and generates information windows directly from
Python code files. Python is the intermediary between Praat and the data
managed in the database.

Figure 2: Example of User Interface visualising the new functionalities added
in Praat

    We employ natural language processing and audio processing techniques in
our tool, taking as our main source the voice recordings of native speakers
and students. Praat allows us to extract quantitative information at the
prosodic level from the audios.
    Subsequently, the native/student comparative algorithms presented and
implemented by the tool offer comparative information on prosodic aspects
between a native audio and a student audio, in order to provide feedback in
Spanish language learning. These algorithms are an original contribution
implemented in the tool, since there was no algorithmic proposal of this type
for Spanish until now.
    We have developed the PAFe Plugin following an iterative and incremental
methodology based on agile techniques and the Scrum development methodology,
as described by Schwaber and Sutherland [12].
    In the following, we describe the development of our tool.
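
    Before turning to the tool itself, the two principles of intonation
analysis listed at the start of this section can be illustrated with a short
Python sketch. It is not the Praat script of Mateo Ruiz [9, 10] nor PAFe's own
code: the function names, the start value of 100 for the standardised curve
and the choice to set imperceptible changes to zero are assumptions made only
for illustration. The input is assumed to be one pitch value in Hz per tonal
segment.

```python
from typing import List


def percent_changes(pitches_hz: List[float]) -> List[float]:
    """Relative change (in %) from each tonal segment to the next one."""
    return [
        (curr - prev) / prev * 100.0
        for prev, curr in zip(pitches_hz, pitches_hz[1:])
    ]


def standardised_curve(pitches_hz: List[float], start: float = 100.0) -> List[float]:
    """Rebuild the melody from an arbitrary start value (assumed: 100).

    Only the relative changes are kept, so speakers with different voice
    ranges become comparable.
    """
    curve = [start]
    for change in percent_changes(pitches_hz):
        curve.append(curve[-1] * (1.0 + change / 100.0))
    return curve


def relevant_changes(pitches_hz: List[float], threshold: float = 10.0) -> List[float]:
    """Keep changes of at least `threshold` percent; treat the rest as 0.0.

    The 10% default follows the perceptibility threshold cited in the text;
    replacing smaller changes by 0.0 is an illustrative choice.
    """
    return [
        change if abs(change) >= threshold else 0.0
        for change in percent_changes(pitches_hz)
    ]


if __name__ == "__main__":
    melody = [200.0, 210.0, 250.0, 200.0]  # hypothetical values in Hz
    print(percent_changes(melody))
    print(standardised_curve(melody))
    print(relevant_changes(melody))
```

    For the hypothetical melody above, the successive changes are +5%, about
+19% and -20%, so only the last two would count as perceptible under the 10%
criterion.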


3. Solution: PAFe Plugin

   Our PAFe tool, in its final version, allows comparative analysis by
providing similarity results and intonation graphs based on pitch values (tone
frequency in Hz) and tonal tendency in each defined segment and, finally,
visualisation of a student's progress over time. We highlight the following
operations made possible by our plugin:
   1. The application allows the creation of different profiles to facilitate
   the process of managing the data uploaded by users.
   a) First of all, the teacher is registered.
   b) A pupil is then assigned to the previously registered teacher. This step
   avoids confusion if there is more than one user of the same computer or
   laptop.
   c) Finally, the profile of a native Spanish speaker is recorded to upload
   the data that will serve as a reference for the programme;
   2. PAFe enables the management of WAV and TextGrid files (a TextGrid is a
   file with tags segmenting the associated audio): our programme includes
   both storage and deletion of audio files and annotations;
   3. It also allows for different types of acoustic analysis (global
   analysis, tonal tendency analysis and intersyllabic analysis):
   a) The algorithm that performs the global analysis divides the previously
   saved audios of learners and native speakers of Spanish into about 1000
   intervals (discarding silences) to obtain very precise comparative values.
   However, this type of analysis does not provide feedback about possible
   deviations in tone; it only provides generic data on the percentage
   similarity between the native speaker's and the learner's audio.
   b) The tonal tendency analysis works with the .TextGrid annotations and the
   previously saved .WAV audio files. In this case, the utterances are divided
   by words and, to obtain the similarity locally, the programme indicates
   whether the pitch of each word has been reproduced correctly or not and, if
   it has not, the percentage of deviation; the percentage of pitch similarity
   and the average difference between the two audios are also obtained.
   c) Finally, the intersyllabic acoustic analysis is a comparative analysis,
   syllable by syllable, of the similarity between the tone realisation of a
   learner and that of a native speaker; in this case, the difference in
   pronunciation with respect to the reference audio is indicated for each
   syllable, and the percentage of tone similarity and the average difference
   between the two audios are obtained. With this last type of analysis, both
   the similarity and the difference between the reference audio and the
   learner's audio are shown more accurately.
   Finally, we can see the evolution of our students' results through the
   option to view the history.
   We also show a use case diagram (Figure 3) that provides information about
the behaviour of our plugin, exposing the functionalities and their
interrelation, as well as presenting the actors that interact with the
application and their restrictions.

Figure 3: Use case diagram (PAFe functionalities and main actors)

4. Illustrative example of intersyllabic analysis

   In this section, we show how one type of comparative analysis is carried
out. To perform the intersyllabic analysis, it is necessary to fill in a form
(Figure 4) with the data that characterise the audio of the learner we want to
compare.
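
   The kind of comparison that the intersyllabic analysis performs can be
sketched in Python as follows. This is a minimal sketch under stated
assumptions and not PAFe's actual algorithm, whose formulas are not given
here: the names `SyllableResult`, `intersyllabic_analysis` and
`average_difference` are hypothetical, the difference is assumed to be the
absolute deviation of the learner's value from the native value expressed as a
percentage of the native value, and similarity is assumed to be 100 minus that
difference (floored at 0). Both input curves are assumed to hold one
comparable pitch value per syllable, already aligned through the TextGrid
annotations.

```python
from typing import List, NamedTuple


class SyllableResult(NamedTuple):
    syllable: str
    native_value: float
    learner_value: float
    difference_pct: float
    similarity_pct: float


def intersyllabic_analysis(
    syllables: List[str],
    native_curve: List[float],
    learner_curve: List[float],
) -> List[SyllableResult]:
    """Compare two pitch curves syllable by syllable (illustrative formula)."""
    if not (len(syllables) == len(native_curve) == len(learner_curve)):
        raise ValueError("all inputs must have one entry per syllable")
    results = []
    for label, ref, obs in zip(syllables, native_curve, learner_curve):
        if ref <= 0:
            raise ValueError(f"reference pitch for {label!r} must be positive")
        # Assumed measure: deviation relative to the native reference value.
        difference = abs(obs - ref) / ref * 100.0
        similarity = max(0.0, 100.0 - difference)
        results.append(SyllableResult(label, ref, obs, difference, similarity))
    return results


def average_difference(results: List[SyllableResult]) -> float:
    """Mean percentage difference over all syllables of the utterance."""
    if not results:
        raise ValueError("no syllables to average")
    return sum(r.difference_pct for r in results) / len(results)


if __name__ == "__main__":
    # Hypothetical utterance with four syllables and made-up pitch values.
    rows = intersyllabic_analysis(
        ["bue", "nos", "dí", "as"],
        [100.0, 110.0, 130.0, 90.0],
        [100.0, 121.0, 117.0, 90.0],
    )
    for row in rows:
        print(f"{row.syllable}: difference {row.difference_pct:.1f}%, "
              f"similarity {row.similarity_pct:.1f}%")
    print(f"average difference: {average_difference(rows):.1f}%")
```

   Reporting one row per syllable mirrors the per-syllable feedback described
above, while the averaged value corresponds to the single summary figure for
the whole utterance.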


Figure 4: Form for conducting an intersyllabic analysis

   The audios of that student that meet these properties are then filtered,
and a window is displayed with a drop-down menu for selecting the audio to be
analysed. Once the audio is selected, the corresponding .TextGrid file is
selected in the same way.
   Each type of analysis returns different results. For the intersyllabic
analysis, we show a similarity result per syllable and the average percentage
difference (Figure 5). Finally, we obtain a graph with the tonal
differentiation curves in each syllable for each audio (Figure 6).

Figure 5: Intersyllabic analysis information

Figure 6: Graph showing the tonal curves of each audio for each syllable (the
X-axis represents the syllable division of an utterance and the Y-axis the
pitch values).

5. Conclusions

    In conclusion, we highlight the following key issues that we have
addressed in this paper:
    1. The PAFe tool allows different types of comparative-contrastive
    analysis of the intonation (global, tonal tendency and intersyllabic) of
    ELE learners and native speakers of Spanish. Among them, we consider the
    intersyllabic analysis the most accurate, since the results of tonal
    difference appear syllable by syllable and show the tonal deviations of
    the students, and the global analysis the most efficient in terms of
    response time, since it does not require the uploading of TextGrids and
    the segmentation is done in an automated way, as shown by the empirical
    data in Couto Fernández [4].
    2. This application has several functions: apart from performing the
    intonational analysis, it allows storing the audios, the .TextGrid files
    and the results of the analysis (the history) of each utterance according
    to the profile of the speaker (student or native speaker of Spanish).
    3. PAFe has been developed to achieve the following didactic objectives:
    to facilitate the work of teachers with regard to the identification and
    correction of intonation deviations (we have carried out an empirical
    analysis with teachers of Spanish as a foreign language, where we measured
    the degree of satisfaction with PAFe, with positive results, as indicated
    in Couto Fernández [4]); to store the results of the analyses carried out
    for future improvement; and to serve as a self-evaluation and
    self-correction tool for ELE students, given that the tool itself allows
    them to upload .WAV and .TextGrid files, run the analyses and obtain the
    results without constant help from teachers.
    As a future line of research, we highlight the need to measure
empirically the degree of feedback provided to students.
    As far as we know, it is the only existing solution, both under Praat and
outside Praat, that allows this type of analysis and offers feedback to the
student in the Spanish language. We highlight that, as feedback and
self-evaluation, our tool offers the percentage of similarity and difference
of pitch values so that students can correct their pronunciation. Also, as
future lines of work, we plan to improve the graphical environment of the
plugin and to open up to the student, as an end user, the possibility of using
it via the web.


References
[1] P. Boersma, D. Weenink, Praat: doing
     phonetics by computer, 2019. URL:
     http://www.praat.org/.
[2] F. J. Cantero Serena, Teoría y análisis de la
     entonación, volume 54, 2002.
[3] F. J. Cantero Serena, Análisis prosódico del
     habla: más allá de la melodía, Comunicación
     Social: Lingüística, Medios Masivos, Arte,
     Etnología, Folclor y otras ciencias afines 2
     (2019) 485-498.
[4] T. Couto Fernández, Una herramienta de
     análisis del habla de audio para proporcionar
     retroalimentación automática a los
     estudiantes en la pronunciación en español.
     UDC. A Coruña.
[5] Dragos-Paul Pop, Adam Altar, Designing an
     MVC Model for Rapid Web Application
     Development, Procedia Engineering 69
     (2014) 1172-1179.
     DOI: 10.1016/j.proeng.2014.03.106
[6] D. Font Rotchés, F. J. Cantero Serena, La
     melodía del habla: acento, ritmo y
     entonación, Eufonía: didáctica de la música
     (2008) 19-39.
[7] D. Font Rotchés, F. J. Cantero Serena,
     Melodic Analysis of Speech Method applied
     to Spanish and Catalan, Phonica 5 (2009)
     33-47.
[8] H. Strik, K. Truong, F. Wet, C. Cucchiarini,
     Comparing different approaches for
     automatic pronunciation error detection,
     Speech Communication 51 (2009) 845–852.
     DOI: 10.1016/j.specom.2009.05.007
[9] M. Mateo Ruiz, Protocolo para la extracción
     de los datos tonales y curva estándar en
     análisis melódico del habla, Phonica 6 (2010)
     49-90.
[10] M. Mateo Ruiz, Scripts en Praat para la
     extracción de datos tonales y curva estándar,
     Phonica 6 (2010) 91-111.
[11] P. Oplustil, G. Toledo, Uso de una
     herramienta didáctica para la práctica de la
     entonación en hablantes no nativos de
     español, Sintagma: Revista de lingüística 31
     (2019) 37–50.
[12] K. Schwaber, J. Sutherland, La guía
     definitiva de scrum: Las reglas del juego,
     2020.



</pre>