Speech tempo (speaking rate) varies both between and within speakers. These variations are important in speech communication, because they signal emphasis and emotional involvement, and because they modulate other phonetic distinctions (e.g. between long and short vowels). But how large must a change in speech tempo be in order to be noticed by a listener? That is, what is the just noticeable difference (JND) for tempo in speech? Previous research on musical tempo indicates that the JND for musical tempo is about 5 to 10 per cent, depending on direction of change, base tempo, and other factors. In speech, however, articulatory gestures vary considerably in their intrinsic duration, yielding more ‘sloppy’ timing than in music. Hence, one would expect a larger JND for tempo in speech than in music.
The JND for tempo in speech was first assessed by means of tempo drift detection, using dynamic stimuli. This method turned out to overestimate the JND, as an artifact of natural pauses in the speech stimuli. The JND was then assessed in a two-interval forced-choice (2IFC) comparison experiment, using constant stimuli. Results suggest a JND of about 5 per cent of the natural base tempo. Although speech timing is more variable than musical timing, owing to the intrinsic variability of articulatory gestures, tempo discriminability is nevertheless similar for speech and for music. The similar JNDs for speech and music suggest that listeners may compensate for speakers' articulatory perturbations of speech tempo, to retrieve the underlying base tempo.
If tempo variations between speakers and within speakers are easily noticeable, as indeed they seem to be, then these variations must exceed the observed JND of 5 per cent. This prediction was verified by means of a Dutch speech corpus, containing 15-minute interviews with 80 speakers. Between-speaker differences in average speech tempo typically exceed the observed JND. Tempo variations within speakers (for a subset of the 80 speakers) also exceed this JND. These findings underline the phonetic and communicative importance of tempo variations in speech.
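The comparison of a tempo difference against the JND can be illustrated with a minimal sketch. It assumes that tempo is expressed as a rate (here syllables per second, a unit chosen for illustration only) and that the difference is taken relative to the first tempo. The function names and the example rates are hypothetical and are not taken from the corpus; only the 5 per cent threshold comes from the results reported above.

```python
# Relative JND of about 5 per cent of the base tempo, as reported above.
JND = 0.05


def relative_tempo_difference(base_tempo, other_tempo):
    """Absolute tempo difference, as a fraction of the base tempo."""
    return abs(other_tempo - base_tempo) / base_tempo


def is_noticeable(base_tempo, other_tempo, jnd=JND):
    """True if the relative tempo difference exceeds the JND."""
    return relative_tempo_difference(base_tempo, other_tempo) > jnd


# Hypothetical speaking rates in syllables per second (not corpus data).
print(is_noticeable(5.0, 5.5))  # 10 per cent difference, above the JND
print(is_noticeable(5.0, 5.1))  # 2 per cent difference, below the JND
```

Treating the JND as a fixed proportion of the base tempo is a simplification; the JND may well depend on the direction of change and on the base tempo itself, as noted above for musical tempo.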