Dialect Voice Cloning in 10 Seconds: Step-by-Step Guide
From recording samples to generating a custom voice, ideal for short videos, audiobooks, and customer service.
XiangYinGe Team
What is Voice Clone?
Voice Clone is an AI voice cloning technology that learns a speaker's voice characteristics from a short audio clip and generates a personalized AI voice model. You can then use this voice to convert any text into speech that sounds like you.
Unlike traditional TTS voices, cloned voices offer:
- Personalization: Preserves your unique voice characteristics and speaking style
- Multi-dialect: Supports voice cloning for 17 Chinese dialects
- Low barrier: Only requires 10-20 seconds of clear audio
- Fast turnaround: Ready to generate speech as soon as deployment finishes, usually within a few minutes
Supported Dialects
XiangYinGe's Voice Clone feature supports the following 17 dialects:
| Dialect Group | Supported Dialects |
|---|---|
| Mandarin | Standard Mandarin, Northeastern, Henan, Shaanxi, Shandong, Tianjin, Gansu, Ningxia, Yunnan, Guizhou, Sichuan, Hubei |
| Cantonese | Cantonese (Guangdong) |
| Wu | Shanghainese |
| Min | Minnan (Hokkien) |
| Gan | Jiangxi |
| Jin | Shanxi |
Three Steps to Clone Your Voice
Step 1: Prepare Your Audio
Record a 10-20 second clear voice clip. Keep in mind:
Environment Requirements
- Choose a quiet environment with no background noise
- Stay away from air conditioners, fans, and other continuous noise sources
- Turn off phone notifications and other distractions
Recording Content
- Use continuous, natural sentences
- Avoid long pauses or hesitations
- Maintain normal speaking speed
- You can read a passage or speak freely
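If you want to check a recording for long pauses before uploading, the sketch below scans a WAV file in 20 ms frames and reports stretches that stay quiet for too long. It is only an illustration: the function name `find_long_pauses`, the -40 dBFS silence threshold, and the 0.7 second pause limit are assumptions chosen for this example, not values published by XiangYinGe, and the code handles 16-bit PCM WAV only.

```python
import array
import math
import sys
import wave


def find_long_pauses(path, min_pause_s=0.7, frame_ms=20, silence_dbfs=-40.0):
    """Return (start_s, end_s) spans whose level stays below silence_dbfs.

    All thresholds are illustrative assumptions; tune them for your microphone.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        samples = array.array("h", wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    per_frame = max(1, int(rate * frame_ms / 1000))  # sample frames per window
    window = per_frame * channels                    # values per window
    n_windows = len(samples) // window

    pauses, pause_start = [], None
    for i in range(n_windows):
        chunk = samples[i * window:(i + 1) * window]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        level = 20 * math.log10(rms / 32768) if rms > 0 else float("-inf")
        t = i * per_frame / rate  # start time of this window in seconds
        if level < silence_dbfs:
            if pause_start is None:
                pause_start = t
        elif pause_start is not None:
            if t - pause_start >= min_pause_s:
                pauses.append((pause_start, t))
            pause_start = None

    # Close a pause that runs to the end of the file.
    end = n_windows * per_frame / rate
    if pause_start is not None and end - pause_start >= min_pause_s:
        pauses.append((pause_start, end))
    return pauses
```

Calling `find_long_pauses("my_clip.wav")` returns an empty list when the speech is continuous. Any spans it reports are candidates for re-recording or trimming.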
Technical Parameters
- Supported formats: WAV, MP3, M4A
- Recommended sample rate: 16kHz or higher
- Recommended duration: 10-20 seconds (shorter clips reduce cloning quality; longer clips add no extra benefit)
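The parameters above can be checked locally before you upload. The following sketch uses only Python's standard library, so it can read the sample rate and duration of WAV files but checks MP3 and M4A files by extension only. The function name `check_clip` and the constants are hypothetical helpers written for this guide. They mirror the recommendations listed above and are not the checks XiangYinGe runs on its servers.

```python
import os
import wave

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}  # formats listed in this guide
MIN_SAMPLE_RATE_HZ = 16_000                      # recommended minimum
MIN_SECONDS, MAX_SECONDS = 10.0, 20.0            # recommended duration range


def check_clip(path):
    """Return a list of problems; an empty list means nothing was flagged."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return [f"unsupported format: {ext or '(no extension)'}"]
    if ext != ".wav":
        # The standard library cannot decode MP3 or M4A, so only the
        # extension is checked for those formats in this sketch.
        return []

    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        seconds = wf.getnframes() / rate

    problems = []
    if rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if seconds < MIN_SECONDS:
        problems.append(f"clip is {seconds:.1f} s, shorter than {MIN_SECONDS:.0f} s")
    elif seconds > MAX_SECONDS:
        problems.append(f"clip is {seconds:.1f} s, longer than {MAX_SECONDS:.0f} s")
    return problems
```

For example, `check_clip("my_cantonese.wav")` returns `[]` for a 12-second clip recorded at 16 kHz, and a list of messages otherwise.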
Step 2: Upload and Name
- Open the TTS tool on the XiangYinGe homepage
- In the dialect selection area, click the "My Voices" tab
- Click the "Clone My Voice" button
- Select or drag your audio file
- Give your voice an identifiable name (e.g., "My Cantonese")
- Check the authorization confirmation box, then click "Create Voice"
Step 3: Wait for Deployment and Use
After submission, the system will automatically process your audio:
- Deploying: The voice is being trained and deployed, which usually takes 2-5 minutes
- Ready: The voice is ready and you can start generating speech
- Unavailable: The audio did not meet quality requirements; record a new clip and upload it again
Once ready, select your voice in "My Voices", enter text, and generate speech in your own dialect voice.
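The three statuses form a small lifecycle: a voice starts in "Deploying" and settles into either "Ready" or "Unavailable". The sketch below models that lifecycle and shows one way to wait for a voice to settle. This guide only describes checking status on the web page, so `fetch_status` is a hypothetical callable that you would supply yourself. The `VoiceStatus` enum, the 5-second polling interval, and the 5-minute timeout are likewise assumptions made for illustration.

```python
import time
from enum import Enum


class VoiceStatus(Enum):
    DEPLOYING = "Deploying"
    READY = "Ready"
    UNAVAILABLE = "Unavailable"


def wait_until_settled(fetch_status, timeout_s=300.0, interval_s=5.0, sleep=time.sleep):
    """Poll fetch_status() until the voice leaves DEPLOYING.

    fetch_status is a hypothetical function returning a VoiceStatus.
    Raises TimeoutError if the voice is still deploying after timeout_s.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status is not VoiceStatus.DEPLOYING:
            return status  # READY or UNAVAILABLE
        if waited >= timeout_s:
            raise TimeoutError(f"still deploying after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the function easy to test, because a test can substitute a function that returns immediately.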
How to Record High-Quality Audio
Audio quality directly affects cloning results. Here are some practical tips:
Recommended Recording Methods
Phone Recording
- Use the built-in voice recorder app
- Keep the phone 15-20 cm from your mouth
- Avoid holding the phone while recording; resting it on a table keeps it steady
Computer Recording
- Use an external microphone if you have one; it gives better results
- Record with audio software such as Audacity
- Export as WAV for the best quality
Suggested Recording Content
Here are some sample texts suitable for voice cloning:
Mandarin Example
The weather is really nice today, sunny and breezy. I'm planning to take a walk in the park and buy some fruit on the way back. I'm meeting friends for dinner tonight, and I heard there's a new Sichuan restaurant that's really authentic.
Cantonese Example
今日天气好好,阳光普照,凉风阵阵。我谂住去公园行下,顺便买啲生果返嚟。夜晚约咗朋友一齐食饭,听讲新开咗间川菜馆,味道好正。
Sichuan Dialect Example
今天天气巴适得很,太阳晒起暖和和的。我想去公园头逛哈,顺便买点水果回来。晚上约起朋友切吃饭,听说新开了家川菜馆,味道安逸。
FAQ
How long does cloning take?
Usually 2-5 minutes. If it shows "Deploying" for a long time, refresh the page to check the latest status.
Why does my voice show "Unavailable"?
Possible reasons include:
- Audio duration less than 10 seconds
- Too much background noise
- Multiple people speaking in the audio
- Audio quality too low (insufficient sample rate)
Record a clearer clip and upload it again.
How long will my cloned voice be saved?
Voice information is stored in your browser's local storage, so clearing browser data will delete your cloned voices. Check that a voice still shows as ready before you rely on it for important work.
How many voices can I clone?
There is currently no limit. You can clone as many different voices as you need.
Is voice cloning free?
Cloning itself is free. Generating speech with a cloned voice is charged by character count, the same as for system voices.
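Because billing is by character count, you can estimate the cost of a script before generating it. The sketch below is a rough illustration only. This guide does not state a price, so `price_per_char` is a placeholder you must fill in from the current price list. The sketch also assumes that every character counts, including spaces and punctuation, which may differ from how XiangYinGe actually counts.

```python
from decimal import Decimal


def estimate_cost(text, price_per_char):
    """Estimate the cost of synthesizing text under per-character billing.

    price_per_char is a placeholder rate supplied by the caller.
    Assumes every character is billable, including spaces and punctuation.
    """
    return len(text) * Decimal(str(price_per_char))
```

For example, `estimate_cost(script, "0.001")` multiplies the length of `script` by a made-up rate of 0.001 per character. `Decimal` is used so that small per-character rates do not pick up floating-point rounding errors.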
Is my audio data safe?
- Your audio is used only to train your personal voice model
- It will not be used for other purposes or shared with third parties
- You can delete created voices at any time
Use Cases
Content Creation
- Short video dubbing: Dub videos in your own voice and keep a consistent persona
- Podcast production: Generate spoken content quickly and speed up production
- Audiobooks: Read stories in a familiar voice for a more personal feel
Business Applications
- Brand voice: Create a unified voice image for your brand
- Customer service: Use a real human voice to improve the customer experience
- Navigation: Create a personalized navigation voice
Personal Use
- Voice messages: Generate personalized voice messages
- Learning aids: Record learning materials in your own voice
- Fun creation: Have AI speak various dialects in your voice
Summary
XiangYinGe's Voice Clone feature lets anyone create their own AI dialect voice. It takes just three steps: prepare your audio, upload and name it, and wait for deployment. After that, your voice is ready to use.
Whether you're a content creator, business user, or dialect enthusiast, Voice Clone brings more possibilities to your creations.
Visit the XiangYinGe homepage now and experience the magic of voice cloning!