Perceptual evaluation of early versus late F0 peaks in the intonation structure of Czech question-word questions

Pavel Šturm, Jan Volín
Institute of Phonetics, Charles University, Prague, Czech Republic
pavel.sturm@ff.cuni.cz, jan.volin@ff.cuni.cz

Abstract

Question-word questions in Czech mark their interrogative function lexically in the initial position: in their standard form, they begin with an interrogative lexeme. For many linguists, this is a sufficient reason for dispensing with intonational marking, and they claim that the speech melody in these questions is identical to the melody of statements. Careful observation of current Czech speech suggests otherwise. This paper presents a perceptual experiment in which Czech speakers evaluated two contrastive forms of the interrogative melody: one with a late peak modelled after statements (as suggested by some authors), and one with an early peak modelled after our previously collected empirical data. Thirty-two listeners expressed a statistically significant preference for the early peak in a perception test. This outcome is consistent with our sample of speech production of the questions. However, the late peak is also possible and acceptable: we assume that it might be a signal of contrastive emphasis or an implicational cue.

Index Terms: Czech, intonation, perception, wh-questions, speech melody, nucleus placement

1. Introduction

Grammarians identify various classes of interrogative sentences, but two major types are recognized quasi-universally: polar (yes-no) questions and specific (question-word) questions. In English, the latter are also known as wh-questions since the spelling of interrogative lexemes like when, where, who, why or what starts with the letters ‘wh’. In Czech, the language of our concern, an analogous term would perhaps be k-type, since most interrogative expressions contain the velar plosive spelled as ‘k’ (e.g., kdy, kde, kdo, který, kolik, kam).
In this study, we will use the term question-word questions (or QWQ).

Czech is a West-Slavic language (related to Slovak or Polish) spoken primarily in Central Europe by about ten million people. With regard to question-word questions, traditional descriptions of Czech intonation suggest two possibilities, often discussed in a rather polemical manner. Some authors claim that the nuclear pitch accent, or nucleus (represented by a phrasal melodic peak), comes late in the sentence, thus making the melody identical to that of a statement (e.g., the authoritative and often cited [1], [2], [3], [4], [5]). Others argue that the nucleus is typically anchored to the sentence-initial question word, which results in an early peak (e.g., [6], [7]). This solution would be consistent with the views of some cross-language researchers (e.g., [8], [9], [10]) that the default nucleus position in QWQs is on the interrogative pronoun. Both early- and late-peak proponents nevertheless agree that the melodic pattern is falling, or in other words, that the boundary tone is low (cf. Figure 1 below).

According to the former solution, the question Kde je moje svačina? (Where is my snack?) would have a peak linked to the word svačina (snack), whereas the latter solution links the peak to the word kde (where). Similarly, Kolik máte bodů? (How many points do you have?) would have the late peak associated with bodů (points), while the early-peak solution would anchor the peak to the word kolik (how many).

Unfortunately, this controversy in Czech intonology is based solely on introspection and private diary entries of casual observations. Romportl [2] carried out some instrumental analyses of F0 tracks, but his sample was small and collected under unclear circumstances. To the best of our knowledge, no other material-based study has been published. It is fair to note, however, that both opposing camps admit not only the existence, but also the phonological legality of the alternative forms.
The core of the argument is then the typicality claim, i.e., the answer to the question of which form is more frequent or archetypal (also felt as ‘normal’ or, in some approaches, unmarked).

Unlike the previous efforts to decide the matter, we chose to set our own opinion aside and instead carried out an empirical study of the problem. Initially (i.e., before this study), we prepared a recording session for 28 native speakers of Czech, who were asked to act out short dialogues containing, among other things, various QWQs. The QWQs differed in context, number of syllables and, naturally, lexical content. The results [11] showed that QWQs without a contrastive context are read with a melodic peak in the early part of the phrase. The peak was usually associated with the second or third post-stress syllable due to the prevalence of the L*+H pitch accent in current Czech. The resulting QWQ melodies thus differed from those of statements, which are also falling but without an early peak in their typical rendering. In the group of 28 speakers, only 4 used a late peak in QWQs, the pattern predicted by some of the traditional authoritative descriptions.

The question remained, however, about the perceptual aspect of the problem. Would listeners evaluate the early peak as more ‘normal’ and the late peak as less ‘normal’, or vice versa? Which of the two forms is more at the forefront of their linguistic inventories, or in other words, less in need of special contextual justification?

Our null hypothesis would predict that there is no difference between early and late melodic peaks in terms of their evaluation by listeners. Yet the long-lasting dispute in Czech linguistics and our own informal observations suggest that the two forms do not have the same use and the same impact. Therefore, alternative hypotheses would state that subjects will favour one of the solutions proposed in the above-cited literature, preferring either the early or the late peak in QWQs.
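As a purely illustrative aside, the logic of a "no difference" null hypothesis can be sketched with the production counts mentioned above (4 of 28 speakers with a late peak). The sketch below is not the analysis used in this study. It assumes that each speaker contributes a single binary outcome, that the remaining 24 speakers all count as early-peak speakers, and that both peak positions are equally likely under the null hypothesis. The function name binomial_two_sided_p is hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null hypothesis.
    """
    # Probability of each possible count k = 0..trials under the null.
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # A small relative tolerance guards against floating-point ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Assumption: 28 speakers, 4 with a late peak, the other 24 with an early peak.
early, late = 24, 4
p_value = binomial_two_sided_p(early, early + late)
print(f"{early}/{early + late} early peaks, two-sided p = {p_value:.6f}")
```

Under these assumptions, a small p-value would indicate that such an uneven split is unlikely if speakers had no preference for either peak position. The same reasoning carries over to listener judgements, although their evaluation may call for a different test.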
Copyright © 2019 ISCA. INTERSPEECH 2019, September 15–19, 2019, Graz, Austria. http://dx.doi.org/10.21437/Interspeech.2019-2082