ChatGPT voice chat: surprise, delight, and caution｜A bilingual Mandarin-English conversation with ChatGPT: dazzled yet wary

I just had my first experience with the newly announced voice chat capability of OpenAI's ChatGPT. At first I shrugged it off as just another gimmicky feature, because my experience talking to bots by voice had been ... meh: each was limited to a single language, either English or Mandarin in my case, but never both.

Well, not with ChatGPT. We already know that it gladly takes input text in any language, even mixed languages in the same paragraph or sentence. So I was surprised and delighted to find that the voice interface carries this mixed-language capability over from text, without imposing a single-language limitation.

In my default text prompt, I asked ChatGPT to respond in Traditional Chinese for Taiwan (my preferred flavor of Chinese) if my input is in English (effectively, translate English to Chinese), and in English if my input is in anything but English (effectively, translate any non-English language to English).

With that default setting, I had a short voice interaction with ChatGPT midway through watching a YouTube video in Mandarin, in which a physics professor talked about something called "Hilbert Space" (希爾伯特空間), which has to do with quantum physics and was an alien concept to me. Curious, I asked ChatGPT "What is Hilbert Space?" in Mandarin.

It first responded in English, as my default prompt requested, but peppered the English response with technical terms in Mandarin. This was a delight, even though it briefly botched the pronunciation of one Mandarin term (but soon recovered).

Next, I asked it to "restate the response in Traditional Chinese for Taiwan." Now it gave me, by voice, the same response but in Mandarin, just as I requested.

In summary, ChatGPT is able to communicate in multiple languages in the same voice. The voice in this instance is basically a native-English voice that can speak Mandarin with a slight and charming American-English accent. It also has an uncannily human-like prosody that made me feel I was talking to a real person. I feel AI will more easily win human users' trust by mimicking human speech patterns (hesitation, pauses, etc.), and this is where we should be cautious. With that charming voice, we may be caught off guard and take whatever the AI bot says at face value, forgetting that hallucination is still an issue.

Watch my short voice interaction with ChatGPT.

Here is the video of my short conversation with ChatGPT, mixing Mandarin and English, followed by my thoughts on the experience:

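Voice chatting with a bot: this ChatGPT 3.5 bot, which mixes Mandarin and English and speaks with intonation and rhythm almost like a real person's, is simply stunning.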

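Its Mandarin carries an American-English accent, and although it mispronounced 機率論 (probability theory) the first time, it carried me straight across the "uncanny valley": I was drawn in at once and felt I was already talking to a real person. On many other media platforms we have long been listening to bots read scripts, with too many giveaways; they sound fake. All Chinese-language bots still give themselves away by misreading too many characters that have more than one pronunciation.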

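Among English-language bots, there is one I could not tell apart from a human: the audio book-summary app Blinkist. I have heard at least four AI voices they have developed. They read English convincingly, with no mistakes at all and no misplaced pauses. Had they not revealed at the end that they were bots, I would not have suspected a thing. Impressive!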

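ChatGPT, for its part, beats all the other bots hands down with its approachable style. I wonder: will tone of voice be the way AI wins people over and gets real humans to lower their guard?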