Two partly overlapping frequency glides can be perceived as a long glide tone whose duration corresponds to that of the whole pattern, together with a short tone whose duration corresponds to that of the overlap (Nakajima et al., 2000; Remijn & Nakajima, 2005). For example, one glide moves for 1400 ms from 367.2 to 965.7 Hz, and another glide of the same duration begins 1200 ms later and moves from 1035.5 to 2723.7 Hz. The two glides overlap for 200 ms, keeping a distance of 0.3 octave. A typical percept of this pattern is a long ascending tone spanning the whole pattern, accompanied by a short tone spanning the overlap. The event construction model explains the perception of the short tone in a simple manner: because the onset of the second glide and the offset of the first glide are close to each other in time and frequency, they are connected perceptually to construct an illusory auditory event, i.e., the short tone. We are trying to relate this phenomenon to the perception of speech syllables. Wang and Nakajima (2004) succeeded in generating auditory stimulus patterns in which illusory conjunctions of onsets and offsets seemed to cause the perception of Chinese syllables. For example, Chinese-speaking listeners perceived the syllable /yao/ in a pattern where the onset that began /y/ and the offset that ended /o/ belonged to physically different harmonic glides. In some cases, listeners perceived lexical tones that could arise only when the onset and the offset were connected to each other perceptually.
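
The timing and frequency figures in the two-glide example can be checked with a short calculation. The sketch below is illustrative only: it assumes that each glide moves linearly in log frequency (an assumption consistent with the stated constant 0.3-octave distance, but not spelled out in the text), and the names `glide_freq` and `octaves` are hypothetical helpers introduced here.

```python
import math

# Values taken from the example in the text.
GLIDE_MS = 1400.0                   # duration of each glide (ms)
ONSET_LAG_MS = 1200.0               # second glide starts this long after the first (ms)
F1_START, F1_END = 367.2, 965.7     # first glide (Hz)
F2_START, F2_END = 1035.5, 2723.7   # second glide (Hz)


def octaves(f_low, f_high):
    """Distance from f_low up to f_high, in octaves."""
    return math.log2(f_high / f_low)


def glide_freq(f_start, f_end, t_ms, dur_ms=GLIDE_MS):
    """Frequency t_ms after glide onset.

    ASSUMPTION: the glide is linear in log frequency (exponential in Hz).
    """
    return f_start * (f_end / f_start) ** (t_ms / dur_ms)


# Temporal structure of the pattern.
overlap_ms = GLIDE_MS - ONSET_LAG_MS   # both glides sounding together
total_ms = ONSET_LAG_MS + GLIDE_MS     # onset of glide 1 to offset of glide 2

# Frequency distance between the glides at the two edges of the overlap.
gap_at_overlap_start = octaves(
    glide_freq(F1_START, F1_END, ONSET_LAG_MS), F2_START
)
gap_at_overlap_end = octaves(
    F1_END, glide_freq(F2_START, F2_END, overlap_ms)
)

print(f"overlap: {overlap_ms:.0f} ms of {total_ms:.0f} ms")
print(f"gap at start of overlap: {gap_at_overlap_start:.3f} octave")
print(f"gap at end of overlap:   {gap_at_overlap_end:.3f} octave")
```

Under the log-linear assumption, the distance comes out close to 0.3 octave at both edges of the 200-ms overlap, in agreement with the values stated in the example. The same small distance in time and frequency separates the onset of the second glide from the offset of the first, which is the proximity that the event construction model appeals to.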