Tags: Guides & Tutorials, Beginner, Cantonese, Sichuan Dialect, Wu Chinese, Hokkien

Dialect Voice Cloning in 10 Seconds: Step-by-Step Guide

From recording samples to generating a custom voice, ideal for short videos, audiobooks, and customer service.

XiangYinGe Team

1/27/2025 | Reading time: 5 min

What is Voice Clone?

Voice Clone is an AI voice cloning technology that learns a speaker's voice characteristics from a short audio clip and generates a personalized AI voice model. You can then use this voice to convert any text into speech that sounds like you.

Unlike traditional TTS voices, cloned voices offer:

  • Personalization: Preserves your unique voice characteristics and speaking style
  • Multi-dialect: Supports voice cloning for 17 Chinese dialects
  • Low barrier: Only requires 10-20 seconds of clear audio
  • Instant availability: Ready to generate speech immediately after cloning

Supported Dialects

XiangYinGe's Voice Clone feature supports the following 17 dialects:

Region      Supported Dialects
Mandarin    Standard Mandarin, Northeastern, Henan, Shaanxi, Shandong, Tianjin, Gansu, Ningxia, Yunnan, Guizhou
Cantonese   Cantonese (Guangdong)
Wu          Shanghainese
Min         Minnan (Hokkien)
Gan         Jiangxi
Xiang       Hubei
Hakka       Shanxi

Three Steps to Clone Your Voice

Step 1: Prepare Your Audio

Record a clear voice clip of 10-20 seconds. Keep the following in mind:

Environment Requirements

  • Choose a quiet room with no background noise
  • Stay away from air conditioners, fans, and other continuous noise sources
  • Turn off phone notifications and other distractions

Recording Content

  • Use continuous, natural sentences
  • Avoid long pauses or hesitations
  • Maintain normal speaking speed
  • You can read a passage or speak freely
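If you want to check a recording for long pauses before uploading, the idea can be sketched in a few lines of Python. This is only an illustration and not part of XiangYinGe: the function name, the 20 ms window, and the 2% silence threshold are assumptions, and it only handles 16-bit mono WAV files. What counts as "too long" a pause is left to you.

```python
import array
import sys
import wave


def longest_pause_seconds(path, window_ms=20, silence_ratio=0.02):
    """Return the longest near-silent stretch in a 16-bit mono WAV, in seconds."""
    with wave.open(str(path), "rb") as clip:
        if clip.getsampwidth() != 2 or clip.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        rate = clip.getframerate()
        samples = array.array("h", clip.readframes(clip.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV samples are stored little-endian

    window = max(1, rate * window_ms // 1000)  # samples per analysis window
    threshold = silence_ratio * 32767  # assumed fraction of full scale treated as silence
    longest = current = 0
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        peak = max(abs(s) for s in chunk)
        if peak < threshold:
            # Still inside a quiet stretch: extend the current run.
            current += len(chunk)
            longest = max(longest, current)
        else:
            current = 0
    return longest / rate
```

A noisy room raises the quiet parts above the threshold, so a result of zero on a real recording may simply mean the threshold is too low for that clip.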

Technical Parameters

  • Supported formats: WAV, MP3, M4A
  • Recommended sample rate: 16kHz or higher
  • Recommended duration: 10-20 seconds (shorter clips reduce quality; longer clips provide no extra benefit)
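The parameters above can be checked locally before you upload. The following is a minimal sketch using only Python's standard library; `check_clip` is a hypothetical helper, not a XiangYinGe tool, and because the standard library cannot decode MP3 or M4A it only inspects the sample rate and duration of WAV files.

```python
import wave
from pathlib import Path

# Values taken from the guidelines above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}
MIN_SAMPLE_RATE_HZ = 16_000
MIN_SECONDS, MAX_SECONDS = 10.0, 20.0


def check_clip(path):
    """Return a list of problems found in the clip (an empty list means none found)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        return [f"unsupported format: {suffix or 'no extension'}"]
    if suffix != ".wav":
        # MP3 and M4A need a third-party decoder, so this sketch stops here.
        return []

    with wave.open(str(path), "rb") as clip:
        sample_rate = clip.getframerate()
        seconds = clip.getnframes() / sample_rate

    problems = []
    if sample_rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {sample_rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if seconds < MIN_SECONDS:
        problems.append(f"clip is {seconds:.1f} s, shorter than {MIN_SECONDS:.0f} s")
    elif seconds > MAX_SECONDS:
        problems.append(f"clip is {seconds:.1f} s, longer than {MAX_SECONDS:.0f} s")
    return problems
```

Calling `check_clip("my_cantonese.wav")` on a 12-second, 16 kHz recording would return an empty list; the file name here is only an example.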

Step 2: Upload and Name

  1. Open the TTS tool on the XiangYinGe homepage
  2. In the dialect selection area, click the "My Voices" tab
  3. Click the "Clone My Voice" button
  4. Select or drag your audio file
  5. Give your voice an identifiable name (e.g., "My Cantonese")
  6. Check the authorization confirmation box, then click "Create Voice"

Step 3: Wait for Deployment and Use

After submission, the system will automatically process your audio:

  • Deploying: The voice is being trained and deployed, which usually takes a few minutes
  • Ready: The voice is ready and you can start generating speech
  • Unavailable: The audio quality doesn't meet the requirements, so you need to upload a new clip

Once ready, select your voice in "My Voices", enter text, and generate speech in your own dialect voice.
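The three statuses behave like a small state machine: a voice stays in "Deploying" until it becomes either "Ready" or "Unavailable". The sketch below models that in Python purely for illustration. XiangYinGe is used through the web page, and this guide does not describe a programmatic API, so `fetch_status` is a hypothetical function you would have to supply; the timeout and polling interval are also assumptions.

```python
import time
from enum import Enum


class VoiceStatus(Enum):
    DEPLOYING = "Deploying"
    READY = "Ready"
    UNAVAILABLE = "Unavailable"


def wait_for_voice(fetch_status, timeout_s=600.0, interval_s=15.0, sleep=time.sleep):
    """Poll until the voice leaves DEPLOYING, or raise TimeoutError.

    fetch_status is a hypothetical caller-supplied function that returns
    the current VoiceStatus; it is not a real XiangYinGe API.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status is not VoiceStatus.DEPLOYING:
            return status  # READY or UNAVAILABLE
        if waited >= timeout_s:
            raise TimeoutError(f"still deploying after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the function easy to try out without actually waiting.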

How to Record High-Quality Audio

Audio quality directly affects cloning results. Here are some practical tips:

Phone Recording

  • Use the built-in voice recorder app
  • Keep the phone 15-20 cm from your mouth
  • Avoid holding the phone while recording (placing it on a table is more stable)

Computer Recording

  • An external microphone gives better results
  • Use audio editing software such as Audacity
  • Export as WAV for the best quality

Suggested Recording Content

Here are some sample texts suitable for voice cloning:

Mandarin Example

The weather is really nice today, sunny and breezy. I'm planning to take a walk in the park and buy some fruit on the way back. I'm meeting friends for dinner tonight, and I heard there's a new Sichuan restaurant that's really authentic.

Cantonese Example

今日天气好好,阳光普照,凉风阵阵。我谂住去公园行下,顺便买啲生果返嚟。夜晚约咗朋友一齐食饭,听讲新开咗间川菜馆,味道好正。

Sichuan Dialect Example

今天天气巴适得很,太阳晒起暖和和的。我想去公园头逛哈,顺便买点水果回来。晚上约起朋友切吃饭,听说新开了家川菜馆,味道安逸。

FAQ

How long does cloning take?

Usually 2-5 minutes. If it shows "Deploying" for a long time, refresh the page to check the latest status.

Why does my voice show "Unavailable"?

Possible reasons include:

  • Audio duration less than 10 seconds
  • Too much background noise
  • Multiple people speaking in the audio
  • Audio quality too low (insufficient sample rate)

Record a clearer clip and upload it again.

How long will my cloned voice be saved?

Voice information is stored in your browser's local storage, so clearing your browser data will remove your cloned voices. Check that the voice is still available before any important use.

How many voices can I clone?

There is currently no limit. You can clone as many different voices as you need.

Is voice cloning free?

Cloning itself is free. When using cloned voices to generate speech, charges are based on character count, same as system voices.
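Because billing follows character count, you can estimate the cost of a script before generating it. The sketch below is only an illustration: this guide does not state the per-character price or how characters are counted, so `price_per_char` is a value you supply, and counting every character (including punctuation and spaces) is an assumption.

```python
from decimal import Decimal


def estimate_cost(text, price_per_char):
    """Estimate the generation cost, assuming every character is billed."""
    # Decimal avoids floating-point rounding surprises with small prices.
    return len(text) * Decimal(price_per_char)
```

For example, `estimate_cost(script, "0.01")` multiplies the length of `script` by a placeholder price of 0.01 per character.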

Is my audio data safe?

  • Audio is only used to train your personal voice model
  • Will not be used for other purposes or shared with third parties
  • You can delete created voices at any time

Use Cases

Content Creation

  • Short video dubbing: Dub videos in your own voice to keep your persona consistent
  • Podcast production: Generate spoken content quickly and speed up production
  • Audiobooks: Read stories in a familiar voice for a more personal feel

Business Applications

  • Brand voice: Create a consistent voice identity for your brand
  • Customer service: Use a real human voice to improve the customer experience
  • Navigation: Create a personalized navigation voice

Personal Use

  • Voice messages: Generate personalized voice messages
  • Learning aids: Record learning materials in your own voice
  • Fun projects: Have AI speak various dialects in your voice

Summary

XiangYinGe's Voice Clone feature lets anyone create their own AI dialect voice in three simple steps: prepare the audio, upload and name it, and wait for deployment.

Whether you're a content creator, business user, or dialect enthusiast, Voice Clone brings more possibilities to your creations.

Visit the XiangYinGe homepage now and experience the magic of voice cloning!
