Abstract:
Numerous metrics have been developed in an attempt to characterize cross-linguistic differences in speech rhythm, particularly with regard to the rhythm class hypothesis, which holds that languages differ in whether they privilege regularity in the timing of stress, syllables, or moras. This paper tests for consistency in the performance of different types of rhythm metrics, using speech corpora of English, German, Greek, Italian, Korean, and Spanish obtained from eight speakers of each language with three elicitation methods: read sentences, read running text, and spontaneous speech. Three types of rhythm metric were tested: interval-based metrics derived from durations of consonantal and vocalic intervals, low-frequency spectral analysis of the vocalic energy envelope, and a recently developed metric based upon the power ratio of foot and syllable intrinsic mode functions obtained from empirical mode decomposition of the vocalic envelope. For all languages, the metrics indicate that spontaneous speech exhibits more stress-timing-like characteristics than read speech, having higher interval variability and more dominant stress-timescale periodicity in the envelope. Cross-linguistic differences emerged in some cases, but these were not entirely consistent across metrics and were affected by the elicitation method. Overall, the data suggest that the elicitation effects (i.e., read versus spontaneous speech) are larger than differences between languages.