
[Voice Cloning] IndexTTS: High-Performance Text-to-Speech System

JackSee | 2025-06-16 | AI

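# Article Summary

This guide introduces **IndexTTS**, a text-to-speech (TTS) system developed by Bilibili and optimized for natural, high-fidelity speech synthesis in Chinese and English. It covers how to set up IndexTTS, where it can be applied, and how to tune its performance step by step. It is aimed at content creators and developers who want to use IndexTTS effectively.

PS: I tried cloning Stephen Chow's voice. The result is acceptable, but nothing special.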

Original voice: <audio controls=""><source src="https://img.cmc.cm/yuansheng.wav" type="audio/mpeg"></source></audio>

Cloned voice: <audio controls=""><source src="https://img.cmc.cm/kelong.wav" type="audio/mpeg"></source></audio>

---

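## 1. Project Introduction

**IndexTTS** is an industrial-grade text-to-speech (TTS) system with a GPT-style architecture, built on XTTS and Tortoise. It is particularly strong at Chinese: it supports pinyin-based pronunciation correction, polyphonic character disambiguation, improved pronunciation of rare (long-tail) characters, and precise pause control through punctuation. It integrates BigVGAN2 to improve audio quality and was trained on 25,000 hours of Chinese audio and 9,000 hours of English audio. On word error rate (WER: 1.3%), speaker similarity (SS: 0.776), and subjective quality score (MOS: 4.01), it outperforms competitors such as XTTS, CosyVoice2, Fish-Speech, and F5-TTS.

**Note**:   
IndexTTS supports voice cloning, reproduces a speaker's voice realistically, and handles emotional intonation reasonably well. Test results are given at the end of this article.

**Key Features**:

- Accurate Chinese pronunciation, with pinyin correction and polyphonic character disambiguation.
- Fine-grained pause control through punctuation.
- High-fidelity audio with voice cloning support.
- Bilingual support for Chinese and English.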

---

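## 2. Application Scenarios

IndexTTS supports a range of application scenarios:

- **Content Creation & Video Dubbing**:
    Generate natural-sounding speech for short videos, vlogs, or documentaries to speed up production.
- **Audiobooks & Online Education**:
    Provide high-quality Chinese and English narration for e-books, children's picture books, or educational videos.
- **Smart Customer Service & Voice Assistants**:
    Adapt to multiple voice styles so that customer-service responses sound more natural and varied.
- **Entertainment & Virtual Character Voices**:
    Create immersive voices for game dubbing, virtual streamers, or AI singers.
- **Accessibility Technology**:
    Provide high-quality speech assistance for visually impaired users, for example in screen readers or navigation services.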

![IndexTTS Application Scenarios](https://img.cmc.cm/2025-06-16_200436.png)

---

## 3. Testing Environment

**System**: Microsoft Windows 11 Professional  
**CPU**: Intel® Core™ i7-9700K (8 cores, 3.6–4.9 GHz) [link](https://www.intel.cn/content/www/cn/zh/products/sku/186604/intel-core-i79700k-processor-12m-cache-up-to-4-90-ghz/specifications.html?-425671941.1607548799)  
**GPU**: GeForce RTX 2070 (8GB VRAM) [link](https://www.nvidia.cn/geforce/graphics-cards/rtx-2070/)  
**Memory**: 32GB  
**Storage**: 512GB SSD  
**Note**: Testing was done on an older Alienware Area-51m R1 with limited cooling. Newer systems may manage with somewhat lower specs; otherwise, match or exceed the configuration above for smooth performance.

---

## 4. Project Resources

- **Official Website**: [link](https://index-tts.github.io/)
- **GitHub Repository**: [link](https://github.com/index-tts/index-tts?tab=readme-ov-file)
- **Research Paper**: [link](https://arxiv.org/abs/2502.05512)

### Trial Platforms

- **ModelScope Community** (requires registration, offers 100 hours of free GPU compute for new users): [link](https://modelscope.cn/studios/IndexTeam/IndexTTS-Demo)
- **Hugging Face** (requires VPN for access): [link](https://huggingface.co/spaces/IndexTeam/IndexTTS)

---

## 5. Installation Methods

### Option 1: One-Click Installation (Recommended for Beginners)

Download the one-click installation package (4.5GB) provided by Bilibili creator “十个骑士”:

- **Bilibili Profile**: [link](https://space.bilibili.com/24851376)
- **YouTube Profile**: [link](https://www.youtube.com/@yhqqxq)
- **Quark Drive**: [link](https://pan.quark.cn/s/b68a7e489f39#/list/share) (Extraction Code: pwFs)
- **Baidu Drive**: [link](https://pan.baidu.com/share/init?surl=ABMMtSD5_iYcMQ89_MG6SQ&pwd=d5eg) (Extraction Code: d5eg)
- **OneDrive**: [link](https://wx4ns-my.sharepoint.com/personal/ai_wx4ns_onmicrosoft_com/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fai%5Fwx4ns%5Fonmicrosoft%5Fcom%2FDocuments%2Findex%2Dtts&ga=1)

**Steps**:

1. Download and extract the package.
2. Run the `.exe` file (avoid Chinese characters in file paths).
3. Open `http://127.0.0.1:7860/` in your browser to access the interface.

![One-Click Installation Interface](https://img.cmc.cm/2025-06-16_204002.png)  
![IndexTTS Web Interface](https://img.cmc.cm/2025-06-16_204337.png)
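
Because the package is sensitive to non-English paths, you may want to check the extraction folder before running it. The sketch below is a hypothetical helper (`path_problems` is not part of the package) and is deliberately stricter than the advice above: it flags any non-ASCII character and any whitespace, not only Chinese characters.

```python
def path_problems(path):
    """Return a list of reasons why a path may cause trouble (empty if none)."""
    problems = []
    # Stricter than "no Chinese characters": anything outside ASCII is flagged.
    if any(ord(ch) > 127 for ch in path):
        problems.append("non-ASCII characters")
    # Spaces and other whitespace are flagged as well.
    if any(ch.isspace() for ch in path):
        problems.append("whitespace")
    return problems


if __name__ == "__main__":
    # Example paths are made up for illustration.
    for candidate in ["D:/tools/index-tts", "D:/工具/index tts"]:
        print(candidate, "->", path_problems(candidate) or "looks fine")
```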

### Option 2: Official Installation (Recommended for Advanced Users)

#### Environment Setup

1. **Clone the Repository**:
    
```bash
git clone https://github.com/index-tts/index-tts.git
```
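2. **Install Dependencies**:  
Create a new Conda environment and install the dependencies:
```bash
conda create -n index-tts python=3.10
conda activate index-tts
# Debian/Ubuntu command; on other systems install ffmpeg with your own package manager
apt-get install ffmpeg
```
Install PyTorch: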
    
```bash
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118
```
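After installing PyTorch, it can help to confirm that the CUDA build actually sees your GPU before downloading several gigabytes of models. The following is a minimal sketch, not part of IndexTTS: the helper name `describe_gpu` is made up for illustration, and it relies only on standard PyTorch calls (`torch.cuda.is_available` and `torch.cuda.get_device_properties`).

```python
def describe_gpu():
    """Return a one-line summary of the GPU that PyTorch can see."""
    try:
        import torch  # imported lazily so the check also works before installation
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is available"
    props = torch.cuda.get_device_properties(0)  # first visible GPU
    vram_gib = props.total_memory / 1024**3      # bytes -> GiB
    return f"{props.name}: {vram_gib:.1f} GiB VRAM (CUDA {torch.version.cuda})"


if __name__ == "__main__":
    print(describe_gpu())
```

Run it inside the `index-tts` Conda environment. If it reports that no CUDA device is available, the install likely picked up a CPU-only wheel or the GPU driver is missing.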
    
**Note for Windows Users**:
If `pynini` fails to install, use:

```bash
conda install -c conda-forge pynini==2.1.6
pip install WeTextProcessing --no-deps
```
Install IndexTTS as a package:
```bash
cd index-tts
pip install -e .
```
3. **Download Models**:  
Download with `huggingface-cli`:

```bash
huggingface-cli download IndexTeam/IndexTTS-1.5 \
  config.yaml bigvgan_discriminator.pth bigvgan_generator.pth bpe.model dvae.pth gpt.pth unigram_12000.vocab \
  --local-dir checkpoints
```
Users in China can speed up downloads through a mirror by setting this variable before running the command above:
```bash
export HF_ENDPOINT="https://hf-mirror.com"
```
Alternatively, use `wget`:
```bash
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/bigvgan_discriminator.pth -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/bigvgan_generator.pth -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/bpe.model -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/dvae.pth -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/gpt.pth -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/unigram_12000.vocab -P checkpoints
wget https://huggingface.co/IndexTeam/IndexTTS-1.5/resolve/main/config.yaml -P checkpoints
```
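Whichever download method you use, all seven files listed above need to end up in `checkpoints`. The following sketch checks for missing or empty files and builds the matching download URL, honoring `HF_ENDPOINT` if it is set. The helper names are illustrative, and it assumes the mirror serves files under the same `resolve/main` path as huggingface.co.

```python
import os
from pathlib import Path

REPO_ID = "IndexTeam/IndexTTS-1.5"
# File names taken from the download commands above.
CHECKPOINT_FILES = [
    "config.yaml",
    "bigvgan_discriminator.pth",
    "bigvgan_generator.pth",
    "bpe.model",
    "dvae.pth",
    "gpt.pth",
    "unigram_12000.vocab",
]


def download_url(name):
    """Build the download URL for one file, using HF_ENDPOINT when it is set."""
    endpoint = os.environ.get("HF_ENDPOINT", "https://huggingface.co").rstrip("/")
    return f"{endpoint}/{REPO_ID}/resolve/main/{name}"


def missing_checkpoints(model_dir="checkpoints"):
    """Return the expected files that are absent or empty in model_dir."""
    root = Path(model_dir)
    missing = []
    for name in CHECKPOINT_FILES:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Print the URL of every file that still needs to be downloaded.
    for name in missing_checkpoints():
        print("missing:", name, "->", download_url(name))
```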
4. **Run Test Script**:
```bash
# Place your prompt audio in 'test_data' and rename it to 'input.wav'
python indextts/infer.py
```
5. **Use as a Command-Line Tool**:
```bash
indextts "大家好，我现在正在bilibili 体验 ai 科技，说实话，来之前我绝对想不到！AI技术已经发展到这样匪夷所思的地步了！" \
      --voice reference_voice.wav \
      --model_dir checkpoints \
      --config checkpoints/config.yaml \
      --output output.wav
```
See more options:
```bash
indextts --help
```
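If you want to drive the command-line tool from a script, one option is to assemble the argument list in Python and hand it to `subprocess`. This sketch reuses only the flags shown in the command above (`--voice`, `--model_dir`, `--config`, `--output`). The helper names `build_indextts_command` and `run_indextts` are hypothetical, and it assumes `indextts` is on your `PATH`.

```python
import shlex
import subprocess


def build_indextts_command(text, voice, output, model_dir="checkpoints"):
    """Assemble the argument list for the indextts command-line tool."""
    return [
        "indextts", text,
        "--voice", voice,
        "--model_dir", model_dir,
        "--config", f"{model_dir}/config.yaml",
        "--output", output,
    ]


def run_indextts(text, voice, output, model_dir="checkpoints"):
    """Run the CLI once; raises CalledProcessError if it exits with an error."""
    cmd = build_indextts_command(text, voice, output, model_dir)
    # Passing a list (no shell) avoids quoting problems with punctuation in text.
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the command; call run_indextts(...) to actually synthesize.
    print(shlex.join(build_indextts_command("大家好", "reference_voice.wav", "output.wav")))
```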
6. **Web Demo**:
```bash
pip install -e ".[webui]" --no-build-isolation
python webui.py
```
Open `http://127.0.0.1:7860` in your browser to view the demo.
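
If the page does not load, a quick way to tell whether the demo server is up is to test the port directly. This is a generic standard-library sketch, not something shipped with IndexTTS. `is_listening` is a made-up helper name, and the default port 7860 is taken from the address above.

```python
import socket


def is_listening(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "up" if is_listening() else "not reachable"
    print(f"Web demo on 127.0.0.1:7860 is {state}")
```

A `False` result while `webui.py` is still starting usually just means the model has not finished loading yet.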

#### Sample Code

```python
from indextts.infer import IndexTTS

# Load the model from the downloaded checkpoints
tts = IndexTTS(model_dir="checkpoints", cfg_path="checkpoints/config.yaml")
voice = "reference_voice.wav"  # reference audio of the voice to clone
# Chinese sample text to synthesize
text = "大家好，我现在正在bilibili 体验 ai 科技，说实话，来之前我绝对想不到！AI技术已经发展到这样匪夷所思的地步了！比如说，现在正在说话的其实是B站为我现场复刻的数字分身，简直就是平行宇宙的另一个我了。如果大家也想体验更多深入的AIGC功能，可以访问 bilibili studio，相信我，你们也会吃惊的。"
# Write the synthesized speech to output.wav
tts.infer(voice, text, "output.wav")
```
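To voice several lines with the same reference clip, the `infer` call above can be wrapped in a small loop. This is a minimal sketch, assuming `tts.infer(voice, text, output_path)` behaves as in the sample. The helper `synthesize_lines` and the `line_NNN.wav` naming are illustrative, not part of IndexTTS. The synthesis function is passed in as an argument so the loop can be tried without loading the model.

```python
from pathlib import Path


def synthesize_lines(infer, voice, lines, out_dir="outputs"):
    """Call infer(voice, text, path) once per non-empty line; return the paths."""
    out_root = Path(out_dir)
    out_root.mkdir(parents=True, exist_ok=True)
    written = []
    for index, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # skip blank lines but keep numbering aligned with the input
        target = out_root / f"line_{index:03d}.wav"
        infer(voice, text, str(target))
        written.append(target)
    return written
```

With the model loaded as in the sample, this would be called as `synthesize_lines(tts.infer, "reference_voice.wav", lines)`.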

---

## 6. Notes and Tips

1. **Performance Considerations**:  
    On the test configuration, generating 10 minutes of audio takes about 40 minutes (using 3–4 GB of RAM, 3–7 GB of VRAM, and about 20% CPU). For better performance, use Linux or a more powerful PC with good cooling.
![Performance Metrics](https://img.cmc.cm/2025-06-16_195902.png) 
![Resource Usage](https://img.cmc.cm/2025-06-16_195656.png)
2. **File Path Restrictions**:  
    Avoid Chinese characters and spaces in file paths to prevent problems with the Python environment.
3. **Disable Download Managers**:  
    Disable browser download plug-ins (such as IDM) during setup to avoid problems launching the GUI.
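
The timing in the performance note above (about 40 minutes for 10 minutes of audio) corresponds to a real-time factor of roughly 4 on the test machine. As a rough planning aid, the sketch below turns that figure into an estimate for other audio lengths. The function names are illustrative, and the linear scaling is an assumption; actual speed will vary with hardware and text.

```python
def real_time_factor(processing_minutes, audio_minutes):
    """Processing time divided by audio length; values above 1 are slower than real time."""
    if audio_minutes <= 0:
        raise ValueError("audio_minutes must be positive")
    return processing_minutes / audio_minutes


def estimate_processing_minutes(audio_minutes, rtf):
    """Estimate generation time, assuming it scales linearly with audio length."""
    return audio_minutes * rtf


if __name__ == "__main__":
    rtf = real_time_factor(40, 10)  # figures reported for the test machine
    for minutes in (1, 5, 30):
        estimate = estimate_processing_minutes(minutes, rtf)
        print(f"{minutes} min of audio: about {estimate:.0f} min to generate")
```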
---

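## 7. Evaluation Summary

1. **Local Deployment**: Local deployment is moderately recommended (8.2/10). The one-click package simplifies setup, but a high-performance PC gives the best results.
2. **Comparison**: Compared with similar TTS systems, IndexTTS stands out for local deployment and Chinese text handling.
3. **Remote Access**: For remote access, set `share=True` in `launch()` to create a public share link. With a dedicated IP, use port forwarding; without one, set up an intranet tunneling (NAT traversal) tool.
4. **Documentation**: See the official documentation for detailed guidance.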


Copyright @ CMC.CM 2022-2025
