<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">SwissText 2021 Task 3: Swiss German Speech to Standard German Text</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Michel</forename><surname>Plüss</surname></persName>
							<email>michel.pluess@fhnw.ch</email>
							<affiliation key="aff0">
								<orgName type="department">Institute for Data Science</orgName>
								<orgName type="institution">University of Applied Sciences and Arts Northwestern Switzerland</orgName>
								<address>
									<settlement>Windisch</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Lukas</forename><surname>Neukom</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute for Data Science</orgName>
								<orgName type="institution">University of Applied Sciences and Arts Northwestern Switzerland</orgName>
								<address>
									<settlement>Windisch</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Manfred</forename><surname>Vogel</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Institute for Data Science</orgName>
								<orgName type="institution">University of Applied Sciences and Arts Northwestern Switzerland</orgName>
								<address>
									<settlement>Windisch</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">SwissText 2021 Task 3: Swiss German Speech to Standard German Text</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">82814C9D80934198697A25A3F3ED5F02</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T19:36+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We present the results and findings of SwissText 2021 Task 3 on Swiss German Speech to Standard German Text. Participants were asked to build a system translating Swiss German speech to Standard German text. The objective was to maximize the BLEU score on a new test set covering a large part of the Swiss German dialect landscape. Four teams participated, with the winning contribution achieving a BLEU score of 46.0.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Swiss German is a family of dialects spoken by around five million people in Switzerland. It differs from Standard German in phonetics, vocabulary, morphology, and syntax. Swiss German is mostly a spoken language. While it is also used in writing, particularly in informal text messages, it lacks a standardized writing system. This causes difficulties for automated text processing, such as spelling ambiguities and a huge vocabulary size. Therefore, most use cases for a Swiss German speech-to-text (STT) system require Standard German text as output. This can be viewed as a speech translation problem with similar source and target languages. For example, the Swiss German sentence "Ide Abfahrt hetter de sächsti Platz beleit" can be translated to the Standard German sentence "In der Abfahrt belegte er den sechsten Platz". Here, the sentence structure is very similar, but the tense differs: the perfect tense used in Swiss German ("hetter ... beleit") is rendered as the preterite ("belegte") in Standard German.</p><p>Speech-to-text systems for well-resourced languages like English or Standard German work very well. <ref type="bibr" target="#b12">Zhang et al. (2020)</ref> set the current state of the art on the popular LibriSpeech test-other benchmark <ref type="bibr" target="#b6">(Panayotov et al., 2015)</ref> with a word error rate (WER) of 2.6 %. In comparison, the 2020 shared task on Swiss German STT <ref type="bibr" target="#b10">(Plüss et al., 2020)</ref>, this task's predecessor, was won by <ref type="bibr" target="#b4">Büchi et al. (2020)</ref> with a WER of 40.3 %.</p><p>The goal of this task is to spur further progress in the field of Swiss German STT by providing a larger labeled training set, an additional unlabeled training set, and a test set with a dialect distribution similar to the real distribution of Swiss German dialects in Switzerland.</p><p>The remainder of this paper is structured as follows: section 2 describes the task, the data, and the evaluation of submissions. Section 3 gives an overview of the submissions and results. Section 4 wraps up the paper and gives directions for future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Task Description</head><p>The objective of the task is to build a sentence-level speech translation system from Swiss German speech to Standard German text. The submission with the best BLEU score <ref type="bibr" target="#b7">(Papineni et al., 2002)</ref> wins. Participants were encouraged to explore and combine suitable supervised, semi-supervised, and unsupervised learning approaches.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Data</head><p>We provide two training datasets. The first one is the Swiss Parliaments Corpus <ref type="bibr" target="#b9">(Plüss et al., 2021)</ref>, a labeled 293-hour dataset of Swiss German debates from the Grosser Rat Kanton Bern parliament with corresponding Standard German sentence-level transcriptions<ref type="foot" target="#foot_0">1</ref>. The second one is an unlabeled collection of 1208 hours of Swiss German debates from the Gemeinderat Zürich parliament<ref type="foot" target="#foot_1">2</ref>. The use of additional datasets is allowed, but has to be declared in the system description.</p><p>The test set created for this task, the All Swiss German Dialects Test Set, contains 13 hours of sentence-level Swiss German speech and Standard German text pairs<ref type="foot" target="#foot_2">3</ref>. The set is divided into two equally sized parts: a public part, whose score was displayed in the public ranking while the task was running, and a private part, which was not available while the task was running and on which the final ranking is based. The texts are from the Common Voice project<ref type="foot" target="#foot_3">4</ref> and were spoken by 178 speakers from all over Switzerland. The test set covers a large part of the Swiss German dialect landscape. Figure <ref type="figure" target="#fig_0">1</ref> compares the test set dialect distribution with the real distribution of Swiss German dialects in Switzerland. The comparison shows that the two distributions match well, with some exceptions. There is no data from the cantons AI, AR, and OW due to their small size. In addition, BE and SG speakers are overrepresented, whereas ZH speakers are underrepresented. No distinction was made between BL and BS during the collection of the dialect metadata for the test set. BS speakers are therefore included in BL.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Evaluation</head><p>The submissions are evaluated using BLEU score <ref type="bibr" target="#b7">(Papineni et al., 2002)</ref>. Our evaluation script, which uses the NLTK <ref type="bibr" target="#b3">(Bird et al., 2009</ref>) BLEU implementation, is open-source<ref type="foot" target="#foot_4">5</ref> . The private part of the test set is used for the final ranking. The test set contains the characters a-z, ä, ö, ü, and spaces, and the participants' models should support exactly these. Punctuation and casing are ignored for the evaluation. Numbers are spelled out. All other characters are removed from the submission (see evaluation script for details). Participants were therefore advised to replace each additional character in their training set with a sensible replacement.</p></div>
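<div xmlns="http://www.tei-c.org/ns/1.0"><p>The evaluation steps described above can be illustrated with a minimal sketch. It is not the official evaluation script: the function names normalize, ngrams, and corpus_bleu are hypothetical, BLEU is computed with a plain, unsmoothed corpus-level implementation following Papineni et al. (2002) instead of the NLTK implementation, one reference per hypothesis is assumed, and the spelling out of numbers is omitted. Scores from this sketch may therefore differ from the official ones.</p>
<p><code lang="python"><![CDATA[
```python
import math
import re
from collections import Counter

# Assumption: the allowed alphabet is a-z, ä, ö, ü and the space character,
# as stated in the task description. Every other character is dropped.
_DISALLOWED = re.compile(r"[^a-zäöü ]")


def normalize(text):
    """Lowercase, drop characters outside the allowed set, collapse spaces."""
    text = _DISALLOWED.sub("", text.lower())
    return " ".join(text.split())


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Unsmoothed corpus-level BLEU with one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = normalize(ref).split(), normalize(hyp).split()
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            ref_counts, hyp_counts = ngrams(ref_tok, n), ngrams(hyp_tok, n)
            # Clipped counts: an n-gram is credited at most as often as it
            # occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, a missing n-gram order yields a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty only applies when the hypotheses are not longer
    # than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    reference = ["In der Abfahrt belegte er den sechsten Platz."]
    hypothesis = ["in der abfahrt belegte er den sechsten platz"]
    print(corpus_bleu(reference, hypothesis))
```
]]></code></p>
<p>Because punctuation and casing are removed before scoring, the two sentences in the usage example are treated as identical. The sketch simply deletes unsupported characters, which is why replacing them with a sensible equivalent in the training data, as advised above, matters for the final score.</p></div>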
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Results</head><p>Four teams participated in the shared task. Table <ref type="table" target="#tab_0">1</ref> shows the final ranking.</p><p>The team in first place, <ref type="bibr" target="#b0">Arabskyy et al. (2021)</ref>, achieved a BLEU score of 46.0. They use a hybrid system with a lexicon that incorporates translations, a first-pass language model that deals with Swiss German particularities, an acoustic model transfer-learned from a large Standard German dataset, and a strong neural language model for second-pass rescoring.</p><p>Our baseline ranks second with a BLEU score of 41.0. The system is described in section 5 of <ref type="bibr" target="#b9">Plüss et al. (2021)</ref>. We train an end-to-end Conformer <ref type="bibr" target="#b5">(Gulati et al., 2020)</ref> model using a hybrid CTC / attention encoder-decoder framework. The training data consists of the Swiss Parliaments Corpus <ref type="bibr" target="#b9">(Plüss et al., 2021)</ref>, an additional 250-hour corpus of automatically aligned Swiss German parliament debates, and the Standard German Common Voice corpus <ref type="bibr" target="#b1">(Ardila et al., 2019)</ref>.</p><p>The team in third place, <ref type="bibr" target="#b11">Ulasik et al. (2021)</ref>, achieved a BLEU score of 39.4. Their approach combines three models trained on multilingual, Standard German, and Swiss German data using ensembling.</p><p>The team called DeJa ranked fourth and achieved a BLEU score of 17.1. We have not received a system description for this submission.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion</head><p>We have described SwissText 2021 Task 3 on Swiss German Speech to Standard German Text. Submissions were evaluated on the All Swiss German Dialects Test Set, which we introduced in this work. It covers a large part of the Swiss German dialect landscape. Four teams participated in the task, with the winning team reaching a BLEU score of 46.0. The results are hard to compare to the results of this task's predecessor, GermEval 2020 Task 4 <ref type="bibr" target="#b10">(Plüss et al., 2020)</ref>, due to the different test set and metric. Last year's winning contribution achieved a WER of 40.3 %. In the experiments reported in <ref type="bibr" target="#b9">(Plüss et al., 2021)</ref>, our system, which ranks second in this year's task, achieved a WER of 27.8 % on a test set comparable to that of GermEval 2020 Task 4. The relative improvement of 31 %, i.e. (40.3 - 27.8) / 40.3, indicates that a lot of progress has been made in the field of Swiss German STT over the past year.</p><p>Despite recent advances in semi-supervised and unsupervised learning for STT, see e.g. <ref type="bibr" target="#b8">(Park et al., 2020)</ref> and <ref type="bibr" target="#b2">(Baevski et al., 2020)</ref>, none of the participants made use of the provided unlabeled training set. Using it seems to be a promising direction for further improvements of Swiss German STT, given that the amount of available labeled training data is still comparatively small.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Comparison of the dialect prevalence in Switzerland's German-speaking population with the All Swiss German Dialects Test Set. To make this comparison possible, a dialect is defined as the average dialect spoken in a canton.</figDesc><graphic coords="2,72.00,62.81,453.55,224.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 :</head><label>1</label><figDesc>Final ranking of the shared task. The BLEU column shows the BLEU score on the private 50 % of the All Swiss German Dialects Test Set.</figDesc><table><row><cell>Rank</cell><cell>Team</cell><cell>BLEU</cell></row><row><cell>1</cell><cell>Arabskyy et al.</cell><cell>46.0</cell></row><row><cell>2</cell><cell>Plüss et al.</cell><cell>41.0</cell></row><row><cell>3</cell><cell>Ulasik et al.</cell><cell>39.4</cell></row><row><cell>4</cell><cell>DeJa</cell><cell>17.1</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://www.cs.technik.fhnw.ch/i4ds-datasets</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://www.cs.technik.fhnw.ch/i4ds-datasets</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://www.cs.technik.fhnw.ch/i4ds-datasets</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://github.com/common-voice/common-voice/tree/main/server/data/de</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://github.com/i4Ds/swisstext-2021-task-3</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>We thank our participants for their interest in the shared task, for their participation, and for their timely feedback, which have helped us make this task a success.</p><p>We also thank Elias Schorr for his great work on the submission and evaluation website.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Dialectal speech recognition and translation of swiss german speech to standard german text: Microsoft&apos;s submission to swisstext 2021</title>
		<author>
			<persName><forename type="first">Yuriy</forename><surname>Arabskyy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aashish</forename><surname>Agarwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Subhadeep</forename><surname>Dey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Oscar</forename><surname>Koller</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>In preparation</note>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Common voice: A massively-multilingual speech corpus</title>
		<author>
			<persName><forename type="first">Rosana</forename><surname>Ardila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Megan</forename><surname>Branson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kelly</forename><surname>Davis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Henretty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Kohler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Josh</forename><surname>Meyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Reuben</forename><surname>Morais</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lindsay</forename><surname>Saunders</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Francis</forename><forename type="middle">M</forename><surname>Tyers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gregor</forename><surname>Weber</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1912.06670</idno>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">wav2vec 2.0: A framework for self-supervised learning of speech representations</title>
		<author>
			<persName><forename type="first">Alexei</forename><surname>Baevski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Henry</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Abdelrahman</forename><surname>Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><surname>Auli</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">Steven</forename><surname>Bird</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ewan</forename><surname>Klein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Edward</forename><surname>Loper</surname></persName>
		</author>
		<title level="m">Natural language processing with Python: analyzing text with the natural language toolkit</title>
				<imprint>
			<publisher>O&apos;Reilly Media, Inc</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Zhaw-init at germeval 2020 task 4: Low-resource speech-to-text</title>
		<author>
			<persName><forename type="first">Matthias</forename><surname>Büchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Malgorzata</forename><forename type="middle">Anna</forename><surname>Ulasik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Manuela</forename><surname>Hürlimann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Fernando</forename><surname>Benites</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pius</forename><surname>von Däniken</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mark</forename><surname>Cieliebak</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th Swiss Text Analytics Conference (SwissText) &amp; 16th Conference on Natural Language Processing</title>
				<meeting>the 5th Swiss Text Analytics Conference (SwissText) &amp; 16th Conference on Natural Language Processing</meeting>
		<imprint>
			<publisher>KONVENS</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note>SWISSTEXT &amp; KONVENS 2020</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Conformer: Convolution-augmented Transformer for Speech Recognition</title>
		<author>
			<persName><forename type="first">Anmol</forename><surname>Gulati</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><surname>Qin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chung-Cheng</forename><surname>Chiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Niki</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yu</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jiahui</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wei</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shibo</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhengdong</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yonghui</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ruoming</forename><surname>Pang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of Interspeech</title>
				<meeting>Interspeech</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="5036" to="5040" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Librispeech: An asr corpus based on public domain audio books</title>
		<author>
			<persName><forename type="first">V</forename><surname>Panayotov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Povey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Khudanpur</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="5206" to="5210" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Bleu: a method for automatic evaluation of machine translation</title>
		<author>
			<persName><forename type="first">Kishore</forename><surname>Papineni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Salim</forename><surname>Roukos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Todd</forename><surname>Ward</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wei-Jing</forename><surname>Zhu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 40th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="311" to="318" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Improved noisy student training for automatic speech recognition</title>
		<author>
			<persName><forename type="first">Daniel</forename><forename type="middle">S</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yu</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ye</forename><surname>Jia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wei</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chung-Cheng</forename><surname>Chiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bo</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yonghui</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Quoc</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<idno type="DOI">10.21437/interspeech.2020-1470</idno>
	</analytic>
	<monogr>
		<title level="m">Interspeech</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">Michel</forename><surname>Plüss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lukas</forename><surname>Neukom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christian</forename><surname>Scheller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Manfred</forename><surname>Vogel</surname></persName>
		</author>
		<title level="m">Swiss parliaments corpus, an automatically aligned swiss german speech to standard german text corpus</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Germeval 2020 task 4: Low-resource speech-to-text</title>
		<author>
			<persName><forename type="first">Michel</forename><surname>Plüss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lukas</forename><surname>Neukom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Manfred</forename><surname>Vogel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th Swiss Text Analytics Conference (SwissText) &amp; 16th Conference on Natural Language Processing (KONVENS)</title>
				<meeting>the 5th Swiss Text Analytics Conference (SwissText) &amp; 16th Conference on Natural Language Processing (KONVENS)</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note>SWISSTEXT &amp; KONVENS</note>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Zhaw-cai: Ensemble method for swiss speech to standard german text</title>
		<author>
			<persName><forename type="first">Malgorzata</forename><forename type="middle">Anna</forename><surname>Ulasik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Manuela</forename><surname>Hurlimann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bogumila</forename><surname>Dubel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yves</forename><surname>Kaufmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Silas</forename><surname>Rudolf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jan</forename><surname>Deriu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Katsiaryna</forename><surname>Mlynchyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hans-Peter</forename><surname>Hutter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mark</forename><surname>Cieliebak</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>In preparation</note>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Pushing the limits of semisupervised learning for automatic speech recognition</title>
		<author>
			<persName><forename type="first">Yu</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">James</forename><surname>Qin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniel</forename><forename type="middle">S</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wei</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chung-Cheng</forename><surname>Chiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ruoming</forename><surname>Pang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Quoc</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yonghui</forename><surname>Wu</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
