<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Acoustic Analysis of Monophthongs in Tibetan of Yushu Dialect</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lingzhen Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yonghong Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Northwest Minzu University, China National Information Technology Research Institute</institution>
          ,
          <addr-line>Lanzhou</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>107</fpage>
      <lpage>113</lpage>
      <abstract>
        <p>Drawing on experimental phonetics, this paper examines the acoustic characteristics of the monophthongs of the Yushu dialect of Tibetan using Adobe Audition 3.0, Praat and other speech analysis software. First, spectrograms are analyzed and vowel acoustic parameters are extracted. Second, the formant pattern diagram and acoustic vowel diagram of the Yushu dialect are drawn from the values of F1, F2 and F3; these diagrams clearly show the acoustic characteristics and spatial distribution of the Yushu monophthongs, as well as the relationship between F1, F2, F3 and vowel quality. We conclude that the lower the tongue position, the larger F1, and vice versa; the more anterior the tongue, the larger F2, and vice versa; and lip rounding lowers F2.</p>
      </abstract>
      <kwd-group>
        <kwd>Yushu dialect</kwd>
        <kwd>monophthong</kwd>
        <kwd>experimental phonetics</kwd>
        <kwd>acoustic analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Located in the southwest of Qinghai Province, the Yushu region has jurisdiction over six counties:
Yushu, Chengduo, Nangqian, Zaduo, Zhiduo and Qumalai. Tibetan is the main language of the region, which
lies at the junction of the Wei Zang, Kang and Anduo dialect areas, so its overall phonetic profile
is transitional. The Yushu dialect is traditionally classified as a Kang dialect [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Huang Bufan
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] argues that, under the influence of the three dialects, the Yushu dialect has the character of an intermediary
dialect or dialect chain. The Yushu dialect can therefore be regarded as a dialect on a par with the three. Its
phonetic features are as follows: the initial system is greatly simplified, and plosive, affricate and fricative
initials contrast voiceless (aspirated / unaspirated) with voiced [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]; tones first arose
to compensate for the ambiguity caused by the loss of many phonemes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]; and the vowel inventory is rich,
with 2 to 5 compound vowels [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        A monophthong is a vowel whose tongue position, lip shape and degree of opening stay constant throughout
its articulation. It can stand alone in a syllable without other vowels. Existing research on the vowels of
the Yushu dialect mainly includes the following. Huang Bufan's "Phonetic characteristics and historical
evolution of Yushu Tibetan language" [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] argues that the diversity of vowel evolution is especially prominent in the Yushu dialect.
Dengzhen Wengmu's phonological study of the Yushu dialect [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] notes that the Yushu dialect
has more simple vowels than other varieties of Tibetan, and has 2 to 5 compound vowels. Sangta's
phonological study of the Yushu dialect [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] shows that most simple vowels in the Yushu dialect are the
result of the loss and weakening of ancient Tibetan finals.
      </p>
      <p>In this experiment, Yushu is taken as the survey site. On the basis of auditory
discrimination, eight monophthongs are identified for the Yushu dialect: /a/, /i/, /e/, /o/,
/u/, /ʊ/, /y/ and /ə/. This paper first describes the eight monophthongs on the basis of three-dimensional
spectrograms, then extracts their acoustic parameters and draws formant pattern diagrams and acoustic
vowel diagrams. By exploring the linguistic value of the Yushu dialect of Tibetan, we hope to provide a
reference for the study of other individual dialects and to lay a foundation for phonological description.
Phonological description is an important part of the formal description of a language, so this work also
underpins speech synthesis and recognition.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Experimental method</title>
    </sec>
    <sec id="sec-3">
      <title>2.1. Experimental materials</title>
      <p>
        The word list used in this experiment was drawn from the Tibetan dialect
questionnaire [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. From its monosyllabic words, the pronunciation partner selected 557 words commonly used in
spoken language. See Table 1 below for examples of the pronunciation materials.
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Pronunciation partner</title>
      <p>The pronunciation partner is a female college student (22 years old) from Northwest University for
Nationalities who enunciates clearly and speaks authentic Yushu dialect without influence
from other dialects. To ensure the accuracy of the signal, the partner was asked to become familiar with
the materials beforehand and to read each word twice during signal acquisition.</p>
    </sec>
    <sec id="sec-5">
      <title>2.3. Voice signal acquisition</title>
      <p>The recording was conducted in the professional recording room of Northwest University for
Nationalities, which is well sealed and sound-insulated. The recording equipment includes a notebook
computer, an ECM-44B lavalier microphone, a Eurorack UB1204FX-PRO mixer and a Blaster X-Fi
Surround 5.1 Pro external sound card. The recording software is Adobe Audition 3.0, used for
single-channel recording with a sample depth of 16 bits and a sampling frequency of 22050 Hz. The software
makes it possible to record efficiently and with high quality, to control the recording process, to monitor
technical indicators such as speech rate, energy and signal-to-noise ratio, and to observe
the voice state of the speaker. The recorded samples are stored in (*.wav) format.</p>
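      <p>The recording parameters above (single channel, 16 bits, 22050 Hz, WAV format) can be checked programmatically. The following is a minimal sketch using only the Python standard library; it is not part of the original workflow, and the helper names and the synthetic test tone are assumptions introduced for illustration.</p>
      <preformat>
```python
import io
import math
import wave

# Recording parameters reported in the text.
CHANNELS = 1          # single-channel (mono) recording
SAMPLE_WIDTH = 2      # 16 bits = 2 bytes per sample
SAMPLE_RATE = 22050   # sampling frequency in Hz


def write_test_tone(target, seconds=0.1, freq=440.0):
    """Write a short sine tone with the reported recording parameters.

    The tone is a hypothetical stand-in for a recorded word.
    """
    n_frames = int(SAMPLE_RATE * seconds)
    frames = bytearray()
    for n in range(n_frames):
        value = int(0.5 * 32767 * math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
        # 16-bit signed little-endian PCM, as used by WAV files.
        frames += value.to_bytes(2, "little", signed=True)
    with wave.open(target, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))


def matches_recording_spec(source):
    """Return True if a WAV file is mono, 16-bit and sampled at 22050 Hz."""
    with wave.open(source, "rb") as wav:
        return (
            wav.getnchannels() == CHANNELS
            and wav.getsampwidth() == SAMPLE_WIDTH
            and wav.getframerate() == SAMPLE_RATE
        )


def nyquist_frequency(sample_rate=SAMPLE_RATE):
    """Highest frequency that can be represented at the given sampling rate."""
    return sample_rate / 2


# Write a tone to an in-memory buffer, then verify it against the spec.
buffer = io.BytesIO()
write_test_tone(buffer)
buffer.seek(0)
spec_ok = matches_recording_spec(buffer)
print("matches spec:", spec_ok)
print("Nyquist frequency (Hz):", nyquist_frequency())
```
      </preformat>
      <p>At a sampling frequency of 22050 Hz the representable band extends to 11025 Hz, which comfortably covers the formant frequencies discussed in this paper.</p>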
    </sec>
    <sec id="sec-6">
      <title>2.4. Experimental data processing and analysis</title>
      <p>After the original speech is preprocessed with Adobe Audition 3.0, MATLAB is used to cut it into
separate named speech files, one per utterance, and the Praat speech analysis software is used to annotate
them. In the annotation, syllables are marked on the first tier, and initials and finals are marked on
the second tier, as shown in Fig. 1. Praat was also used to extract and analyze all
acoustic parameters in this study.</p>
    </sec>
    <sec id="sec-7">
      <title>3. Analysis of experimental results</title>
    </sec>
    <sec id="sec-8">
      <title>3.1. Spectrogram analysis</title>
      <p>Vowels are the most important component of speech, and acoustically they are characterized mainly by their formants.
A formant is a resonant frequency of the vocal tract; it is generally written F, with a
number indicating which formant is meant. For vowels, F1 and F2 are closely
related to the height of the tongue, the frontness or backness of the tongue, and the
rounding or spreading of the lips. The values of F1 and F2 are therefore taken as an important basis for
describing the acoustic characteristics of vowels in phonetics. In what follows, a representative token of each of the
eight vowels is selected, a three-dimensional spectrogram is drawn, and the acoustic characteristics of each
vowel are shown by analyzing the spectrogram.</p>
      <p>process, while F3 rises and then stabilizes. Comparing the two diagrams,
the high-frequency energy of /a/ remains very strong, and its F1, F2 and F3 are relatively high. Comparing
the two spectrograms, F1 and F2 of the former are higher. Taken together with the difference in
mouth opening, this confirms that F1 is related to the degree of mouth opening (tongue position): the
larger the opening, the larger F1.</p>
      <p>In the spectrogram in Fig. 6, F1 of the vowel /e/ is about 400 Hz, and F2 lies far from F1, at about
2200 Hz. F2 is close to F3, the energy is strong, and its distribution is relatively uniform.
Compared with /i/, F2 and F3 are lower. Under the influence of the preceding initial, F2 and F3 start
low, then rise rapidly and move into the steady state. The lowest band in the figure is
the energy of the fundamental frequency. Fig. 7 is the spectrogram of the word for "knife". F1 of the vowel /ə/ is
relatively high. Under the influence of the preceding initial, F2 starts high and then drops
rapidly, coming very close to F3 at about 1500 Hz. F4 and F5 have high values and relatively little energy.</p>
    </sec>
    <sec id="sec-9">
      <title>3.2. Vowel formant pattern of Yushu dialect</title>
      <p>Plotting the formants of different vowels in a single formant pattern diagram makes it easier to observe
how the formants of the vowels correspond, and shows the position and relationship
of each vowel's formants more vividly. The acoustic parameters of the vowels in the speech samples were extracted and
averaged to obtain the frequencies of the first three formants, F1, F2 and F3, for each of the eight monophthongs.
With the vowels on the abscissa and the frequencies on the ordinate, the
formant pattern diagram of the Yushu dialect was drawn, as shown in Fig. 10.</p>
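      <p>The averaging step described above can be sketched as follows. This is an illustrative sketch only, not the authors' script: the function name is an assumption, and the token values are hypothetical numbers invented for the example, not measurements from this study.</p>
      <preformat>
```python
from collections import defaultdict
from statistics import mean


def average_formants(measurements):
    """Average F1, F2 and F3 per vowel.

    measurements: iterable of (vowel, f1, f2, f3) tuples, frequencies in Hz.
    Returns a dict mapping each vowel to (mean_f1, mean_f2, mean_f3).
    """
    grouped = defaultdict(list)
    for vowel, f1, f2, f3 in measurements:
        grouped[vowel].append((f1, f2, f3))
    return {
        # zip(*rows) turns a list of (f1, f2, f3) rows into three columns.
        vowel: tuple(mean(column) for column in zip(*rows))
        for vowel, rows in grouped.items()
    }


# Hypothetical token-level measurements (Hz), invented for illustration only;
# they are NOT the values measured in this study.
tokens = [
    ("i", 300.0, 2500.0, 3200.0),
    ("i", 320.0, 2400.0, 3100.0),
    ("a", 900.0, 1500.0, 2700.0),
    ("a", 880.0, 1450.0, 2650.0),
]

averages = average_formants(tokens)
for vowel, (f1, f2, f3) in averages.items():
    print(f"/{vowel}/  F1={f1:.0f}  F2={f2:.0f}  F3={f3:.0f}")
```
      </preformat>
      <p>The resulting per-vowel means are the kind of values that would be plotted with the vowel on the abscissa and the frequency on the ordinate.</p>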
      <p>The formant pattern of the Yushu dialect clearly shows that each monophthong has its own
formant distribution. The vowels differ mainly in the values of F1 and F2 and in the distance between
them. According to Fig. 10, the distance between F1 and F2 is largest for /i/, followed by /y/, and
smallest for /a/, followed by /ə/.</p>
      <p>F1 values in ascending order are:
/i/ &lt; /y/ &lt; /ʊ/ &lt; /u/ &lt; /e/ &lt; /o/ &lt; /ə/ &lt; /a/;
F2 values in descending order are: /i/ &gt; /e/ &gt; /y/ &gt; /a/ &gt; /ʊ/ &gt; /ə/ &gt; /o/ &gt; /u/.</p>
      <p>F1 and F2 values are thus roughly inversely related, although there are
exceptions. For example, for the two monophthongs /e/ and /y/, the F1 of /e/ is greater than that of /y/,
but its F2 is also greater, so the two do not stand in a strict inverse relationship.
This can be explained by the fact that F2 is also related to the rounding or spreading of the lips: lip rounding
lowers F2, because both lip rounding and a retracted tongue position enlarge the front
resonant cavity during articulation.</p>
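      <p>One way to quantify the "roughly inverse" relationship is a Spearman rank correlation computed from the two vowel orderings reported above. The sketch below is an added illustration, not part of the original analysis; it uses only the rank orders given in the text (no frequency values), and the function names are assumptions.</p>
      <preformat>
```python
# Orderings as reported in the text:
# F1 from smallest to largest, F2 from largest to smallest.
F1_ASCENDING = ["i", "y", "ʊ", "u", "e", "o", "ə", "a"]
F2_DESCENDING = ["i", "e", "y", "a", "ʊ", "ə", "o", "u"]


def ranks(ordered_ascending):
    """Map each vowel to its 1-based rank in an ascending ordering."""
    return {
        vowel: position
        for position, vowel in enumerate(ordered_ascending, start=1)
    }


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((rank_a[v] - rank_b[v]) ** 2 for v in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


f1_rank = ranks(F1_ASCENDING)
# Reverse the descending F2 list so that both rankings are ascending.
f2_rank = ranks(list(reversed(F2_DESCENDING)))
rho = spearman_rho(f1_rank, f2_rank)
print(f"Spearman rho between F1 and F2 ranks: {rho:.3f}")
```
      </preformat>
      <p>For these orderings the coefficient is about −0.43, a moderate negative correlation, which is consistent with a rough rather than strict inverse relationship between F1 and F2.</p>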
    </sec>
    <sec id="sec-10">
      <title>3.3. Acoustic vowel diagram of Yushu dialect</title>
      <p>
        The acoustic vowel diagram differs from the traditional vowel tongue-position chart in that it is plotted
from the measured values of F1 and F2, with F1 on the vertical axis and F2
on the horizontal axis. The coordinate origin is placed in the upper right corner, so that the relative
positions of the vowels roughly match those of the traditional tongue-position chart. Joos (1948) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] holds that,
although the formant frequencies of the same vowel differ from speaker to speaker, the relative
positions of the vowels on the acoustic vowel diagram are stable. The position of each vowel in Fig. 11
is obtained by averaging the formant frequencies of all samples of that vowel.
      </p>
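      <p>The axis convention described above (F1 vertical, F2 horizontal, origin in the upper right corner) can be expressed as a small coordinate mapping. The following is a sketch under stated assumptions: the function name and the plotted frequency ranges are hypothetical choices, and the only formant values used are the approximate F1 and F2 of /e/ quoted in Section 3.1.</p>
      <preformat>
```python
def chart_position(f1, f2, f1_range=(200.0, 1000.0), f2_range=(500.0, 3000.0)):
    """Map (F1, F2) in Hz to (x, y) fractions on an acoustic vowel diagram.

    x runs from 0.0 (left edge) to 1.0 (right edge) and y from 0.0 (top)
    to 1.0 (bottom). Because the origin sits in the upper right corner,
    F2 increases towards the left and F1 increases downwards, so front
    vowels land on the left and high vowels at the top.

    The default ranges are assumed axis limits, not values from the paper.
    """
    f1_lo, f1_hi = f1_range
    f2_lo, f2_hi = f2_range
    if f1_lo > f1 or f1 > f1_hi or f2_lo > f2 or f2 > f2_hi:
        raise ValueError("formant value outside the plotted range")
    x = (f2_hi - f2) / (f2_hi - f2_lo)   # larger F2 gives smaller x (further left)
    y = (f1 - f1_lo) / (f1_hi - f1_lo)   # larger F1 gives larger y (further down)
    return x, y


# Approximate values for /e/ as quoted in Section 3.1 (F1 about 400 Hz,
# F2 about 2200 Hz).
x, y = chart_position(400.0, 2200.0)
print(f"/e/ is plotted at x={x:.2f}, y={y:.2f}")
```
      </preformat>
      <p>With the assumed axis limits, /e/ falls in the upper left part of the chart, matching its place in the traditional tongue-position chart as a fairly high front vowel.</p>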
    </sec>
    <sec id="sec-11">
      <title>4. Summary</title>
      <p>The eight monophthongs of the Yushu dialect are /a/, /i/, /e/, /o/, /u/, /ʊ/, /y/ and /ə/. /a/ is a low
unrounded vowel between central and back, /i/ is a high front unrounded vowel, /u/ is a high back rounded vowel, /y/ is a
high front rounded vowel, /o/ is a mid-high back rounded vowel, /e/ is a mid-high front
unrounded vowel, /ʊ/ is a mid-high rounded vowel slightly behind the center, and /ə/ is a
mid-high unrounded vowel. The formant distribution follows a consistent pattern: the higher the tongue
position, the smaller F1, and the lower the tongue position, the larger F1;
the more anterior the tongue, the larger F2, and the more posterior the tongue, the
smaller F2; other things being equal, lip rounding lowers F2.</p>
      <p>The tones of the Yushu dialect are highly distinctive. Analysis of its pronunciation
can help fill gaps in research on the other three major Tibetan dialects and on the world's tone languages. At the
same time, acoustic analysis of the Yushu dialect by means of experimental phonetics can promote the
development of phonetic information processing and visualization. It is hoped that, as science
and technology advance, computer technology and digital signal analysis will be applied more and more
widely in phonetics, furthering the development of the field and making up for the shortcomings of
traditional phonetics.</p>
    </sec>
    <sec id="sec-12">
      <title>5. Acknowledgements</title>
      <p>This work was financially supported by an NSFC grant (No.11964034) and Research and
Innovation Projects (No.2021CXZX-674).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Peng</given-names>
            <surname>Jin</surname>
          </string-name>
          , Tibetan Jianzhi, 2nd. ed.,
          <source>Ethnic Publishing House</source>
          ,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Bufan</given-names>
            <surname>Huang</surname>
          </string-name>
          , Suonan jiangcai, Minghui Zhang,
          <article-title>Phonetic characteristics and historical evolution of Yushu Tibetan language</article-title>
          , Chinese Tibetology, (
          <year>1994</year>
          )
          <article-title>(2) 24</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Anseraga</surname>
          </string-name>
          ,
          <article-title>A survey of Tibetan Yushu dialect (Labu) phonology</article-title>
          , Tibet studies, (
          <year>2018</year>
          )
          <article-title>(1) 7</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Dengzhen</given-names>
            <surname>Wengmu</surname>
          </string-name>
          ,
          <article-title>Phonological study of Yushu dialect in Tibetan</article-title>
          , Henan science and technology, (
          <year>2015</year>
          )
          <article-title>(22) 1</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Sangta</surname>
          </string-name>
          ,
          <article-title>Phonological study of Yushu dialect in Tibetan</article-title>
          , Master's thesis, Northwest University for Nationalities,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Jiangping</given-names>
            <surname>Kong</surname>
          </string-name>
          , Tibetan dialect questionnaire, 2nd. ed., Commercial Press,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Yasheng</given-names>
            <surname>Jin</surname>
          </string-name>
          , Ruishan Zhang,
          <article-title>A study on the unit sound acoustics of Dongxiang language</article-title>
          , Northwest ethnic studies, (
          <year>2010</year>
          )
          <article-title>(4)10</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Joos</surname>
          </string-name>
          ,
          <article-title>Acoustic Phonetics</article-title>
          , Language, 2nd. ed.,
          <source>No.24</source>
          , (
          <issue>suppl</issue>
          .2).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Gesang</given-names>
            <surname>Jumian</surname>
          </string-name>
          , Gesang Yangjing, Introduction to Tibetan dialect, 2nd. ed.,
          <source>Ethnic Publishing House</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Jiangping</given-names>
            <surname>Kong</surname>
          </string-name>
          ,
          <article-title>Basic course of experimental phonetics</article-title>
          , 2nd. ed., Peking University Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>