tts-skill

MiniMax TTS API: text-to-speech, voice cloning, and voice design

Install this agent skill in your project

npx add-skill https://github.com/notedit/happy-skills/tree/main/skills/utils/tts-skill

Metadata

Additional technical details for this skill

tags
minimax, tts, voice, audio, speech

SKILL.md

MiniMax TTS Skill

This skill wraps the MiniMax TTS API and supports text-to-speech, voice cloning, and voice design.

Quick start

1. Environment setup

Make sure the environment variable is set:

bash
export MINIMAX_API_KEY="your-api-key"

See setup.md for detailed configuration instructions.
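
Before calling any of the skill's functions, it can help to fail early when the key is missing. The following is a minimal sketch, not part of the skill itself: the helper name `get_api_key` is hypothetical, and only the variable name `MINIMAX_API_KEY` comes from the setup step above.

```python
import os


def get_api_key(env=os.environ):
    """Return the MiniMax API key from the environment.

    Hypothetical helper: the skill itself may read the variable differently.
    """
    key = env.get("MINIMAX_API_KEY", "").strip()
    if not key:
        # Raise a clear error instead of letting a later API call fail obscurely.
        raise RuntimeError(
            "MINIMAX_API_KEY is not set; export it before using the skill"
        )
    return key
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.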

2. Using the Python module

python
import sys
import os

# Resolve the skill directory path (requires running as a script, since __file__ must be defined)
skill_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(skill_dir, "assets"))

from minimax_tts import text_to_audio, list_voices, voice_clone, voice_design, play_audio

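Feature overview

| Feature | Function | Description |
| Text-to-speech | text_to_audio() | Converts text into a speech audio file |
| List voices | list_voices() | Returns the list of available voices |
| Voice cloning | voice_clone() | Clones a voice from an audio file |
| Voice design | voice_design() | Generates a voice from a text description |
| Play audio | play_audio() | Plays an audio file |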

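Detailed documentation

  • Environment setup - API key and dependency installation
  • Text-to-speech - TTS features in detail
  • Voice list - available voices and filtering
  • Voice cloning - cloning a custom voice
  • Voice design - generating a voice from a description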

Quick examples

Text-to-speech

python
text_to_audio(
    # Chinese sample text: "Hello, welcome to the MiniMax TTS service!"
    text="你好,欢迎使用 MiniMax TTS 服务!",
    voice_id="female-shaonv",
    output_path="./hello.mp3"
)

List available voices

python
voices = list_voices(voice_type="system")
for voice in voices:
    print(f"{voice['voice_id']}: {voice['name']}")

Voice cloning

python
voice_clone(
    voice_id="my-custom-voice",
    audio_file="./sample.mp3",
    voice_name="我的声音"  # display name, "My voice"
)

Voice design

python
voice_design(
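    # Prompt: "A gentle young female voice with a slight southern accent"
    prompt="一个温柔的年轻女性声音,带有轻微的南方口音",
    # Preview text: "Hello, this is my voice"
    preview_text="你好,这是我的声音"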
)

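Supported models

| Model | Description |
| speech-02-hd | HD version, best audio quality |
| speech-02-turbo | Fast version, low latency |
| speech-01-hd | Legacy HD |
| speech-01-turbo | Legacy fast |
| speech-2.6-hd | Version 2.6 HD |
| speech-2.6-turbo | Version 2.6 fast |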
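
The model names follow a `speech-<generation>-<tier>` pattern, so choosing between quality and latency can be reduced to a small lookup. This is only an illustrative sketch: `pick_model` is a hypothetical helper, and the source does not show how (or whether) a model name is passed to `text_to_audio()`, so the helper only returns the name.

```python
# Model names copied from the supported-models table; the helper is hypothetical.
MODELS = {
    "speech-02-hd": "HD version, best audio quality",
    "speech-02-turbo": "Fast version, low latency",
    "speech-01-hd": "Legacy HD",
    "speech-01-turbo": "Legacy fast",
    "speech-2.6-hd": "Version 2.6 HD",
    "speech-2.6-turbo": "Version 2.6 fast",
}


def pick_model(low_latency: bool, generation: str = "02") -> str:
    """Build a model name from a generation and a quality/latency preference."""
    tier = "turbo" if low_latency else "hd"
    name = f"speech-{generation}-{tier}"
    if name not in MODELS:
        # Reject generations that are not in the table rather than guessing.
        raise ValueError(f"unknown model: {name}")
    return name
```

For example, `pick_model(low_latency=True)` yields the turbo model of the default generation, while `pick_model(False, "2.6")` yields the 2.6 HD model.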

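Common voice IDs

System preset voices

  • female-shaonv - young girl voice
  • female-yujie - mature, confident young woman voice
  • female-chengshu - mature female voice
  • male-qingnian - young adult male voice
  • male-chengshu - mature male voice

For more voices, query list_voices().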
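
The preset IDs share a `female-` or `male-` prefix, which makes simple client-side filtering possible. The sketch below assumes each voice is a dict with `voice_id` and `name` keys, as in the list_voices() example; `SAMPLE_VOICES` is hand-written stand-in data rather than real API output, and `filter_by_prefix` is a hypothetical helper.

```python
# Stand-in data shaped like the entries printed in the list_voices() example.
SAMPLE_VOICES = [
    {"voice_id": "female-shaonv", "name": "young girl voice"},
    {"voice_id": "female-yujie", "name": "mature, confident young woman voice"},
    {"voice_id": "female-chengshu", "name": "mature female voice"},
    {"voice_id": "male-qingnian", "name": "young adult male voice"},
    {"voice_id": "male-chengshu", "name": "mature male voice"},
]


def filter_by_prefix(voices, prefix):
    """Keep only the voices whose voice_id starts with the given prefix."""
    return [v for v in voices if v["voice_id"].startswith(prefix)]


if __name__ == "__main__":
    for voice in filter_by_prefix(SAMPLE_VOICES, "male-"):
        print(f"{voice['voice_id']}: {voice['name']}")
```

In real use, the result of `list_voices()` would replace `SAMPLE_VOICES`, provided it returns entries of the same shape.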
