Speech sound and audio processing


I don't remember when my fascination with the sounds of human speech and vocals (such as singing) started. It clearly contributed to my pursuing a graduate degree in conference interpretation rather than written translation. Over the last decade, as I dabbled in amateur singing and became an interpreter, this interest only grew stronger.

The downside of this ever-present attention to the speaking voice is a growing intolerance of online conference speakers who are obviously oblivious to the sound problems they impose on their listeners. One salient example is speaking into a headset microphone while constantly being too loud or producing lots of plosives (consonants that pop). A speaker who causes these audio issues has crossed the line from merely being indecorous to harming the hearing health of their audience or causing them mental distress.

Can you relate to these issues? If not, here is a short sound clip from a YouTube video that I happen to be listening to as I write this, because a friend shared the link. Disclaimer: I don't know this Mandarin speaker; the choice is not personal, only illustrative of a generic speech problem that many people cause or have to suffer.

!_attachments/梵古的聯想 Henry Chang, 07162022 (catured 20230521).mp3

This clip is full of plosives, and some of them create the unpleasant perception of audio clipping, even though physically (as shown in the picture) there is none.
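For readers who would rather check than trust their ears, here is a minimal sketch of one way to test a recording for physical clipping. It assumes the audio has already been decoded into a list of float samples normalized to the range -1.0 to 1.0; the function name `find_clipped_runs` and its parameters are my own illustrative choices, not part of any audio library. The idea is that real clipping leaves flat tops in the waveform, that is, several consecutive samples stuck at full scale, while a loud plosive that stays below full scale leaves none.

```python
import math


def find_clipped_runs(samples, full_scale=1.0, tolerance=0.001, min_run=3):
    """Return (start_index, length) for each run of consecutive samples
    that sit at or very near full scale (a flat top in the waveform)."""
    limit = full_scale * (1.0 - tolerance)
    runs = []
    run_start = None
    for i, s in enumerate(samples):
        if abs(s) >= limit:
            if run_start is None:
                run_start = i  # a possible flat top begins here
        else:
            if run_start is not None and i - run_start >= min_run:
                runs.append((run_start, i - run_start))
            run_start = None
    # Handle a run that lasts until the very end of the signal.
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples) - run_start))
    return runs


# Synthetic demo (assumed values): 0.1 s of a 100 Hz tone sampled at 8 kHz.
rate = 8000
n_samples = rate // 10

# Loud but clean: the peaks reach 0.9, below full scale.
loud = [0.9 * math.sin(2 * math.pi * 100 * n / rate) for n in range(n_samples)]

# Overdriven: the tone would peak at 1.5, so it is cut off at +/-1.0.
clipped = [
    max(-1.0, min(1.0, 1.5 * math.sin(2 * math.pi * 100 * n / rate)))
    for n in range(n_samples)
]

print(find_clipped_runs(loud))          # no flat tops in the clean tone
print(len(find_clipped_runs(clipped)))  # one flat top per overdriven half-cycle
```

A clip like the one above, which sounds clipped but is not, would behave like the first signal: the check finds no flat tops, so the harshness comes from the plosives themselves rather than from the recording chain running out of headroom.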

Listen to the series of plosives from 00:25 and watch the corresponding waveform.

!_attachments/plosives (sounds like clipping).mp4

I bet that in the future you will be able to recognize bad plosives (not all plosives are bad) just by looking at the waveform, without hearing it.

!_attachments/Screen Shot 2023-05-21 at 12.30.32.png

My audio equipment