Speech.sh

Command-line text-to-speech utility with MCP integration using OpenAI's API.

Stars: 5 | Forks: 1 | Watchers: 5 | Issues: 0

Speech.sh converts text to speech from the command line using OpenAI's API and exposes the same functionality to AI assistants through the Model Context Protocol (MCP). It supports six voices, adjustable speech speed, and both the tts-1 and tts-1-hd models, and plays the resulting audio through ffmpeg or mplayer. The API key can be supplied on the command line, through an environment variable, or from a file. Generated audio is cached to avoid duplicate API calls, and failed requests are retried with configurable retry counts and timeouts. The script also validates its dependencies before running and passes parameters in a way intended to prevent shell injection.

Key Features

Command-line interface for text-to-speech
Multiple voice options (onyx, alloy, echo, fable, nova, shimmer)
Adjustable speech speed
Support for tts-1 and tts-1-hd models
Flexible API key management
Automatic audio file caching
Robust retry and timeout logic for network reliability
Compatibility with ffmpeg and mplayer
Model Context Protocol (MCP) integration
Security-focused processing and dependency validation

Use Cases

Automating text-to-speech tasks in shell scripts
Integrating speech synthesis in AI assistant pipelines
Batch converting large volumes of text to speech
Developing custom voicebots or notification systems
Creating accessible workflows for visually impaired users
Generating audio for announcements or alerts
Enhancing developer tools with speech functionality
Supporting language learning with auditory feedback
Building voice-enabled command-line interfaces
Reducing redundant API calls through automatic caching

README

A powerful command-line utility for text-to-speech conversion using OpenAI's API.

Features

  • Convert text to speech with a simple command
  • Multiple voice options (onyx, alloy, echo, fable, nova, shimmer)
  • Adjustable speech speed (0.25 to 4.0)
  • Support for both tts-1 and tts-1-hd models
  • Flexible API key management (command-line, environment variable, or file)
  • Automatic caching to avoid duplicate API calls
  • Robust retry mechanism for handling network issues
  • Support for both ffmpeg and mplayer for audio playback
  • MCP (Model Context Protocol) compatibility for integration with AI assistants

Installation

  1. Clone this repository:

    bash
    git clone https://github.com/j3k0/speech.sh.git
    cd speech.sh
    
  2. Make the scripts executable:

    bash
    chmod +x speech.sh mcp.sh launch
    
  3. Ensure you have the required dependencies:

    • curl
    • jq
    • Either ffmpeg or mplayer (ffmpeg preferred)
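
Before the first run, you can check these dependencies yourself. The snippet below is a minimal sketch and is not taken from speech.sh; the helper names `missing_commands` and `have_any` are hypothetical.

```shell
# Print each listed command that is not found on PATH (one per line).
# Prints nothing when every command is available.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Succeed if at least one of the listed commands is available.
have_any() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 && return 0
  done
  return 1
}

# Example check for the dependencies listed above:
#   missing_commands curl jq
#   have_any ffmpeg mplayer || echo "install ffmpeg or mplayer" >&2
```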

Usage

Basic usage:

bash
./speech.sh --text "Hello, world!"

With more options:

bash
./speech.sh --text "Hello, world!" --voice nova --speed 1.2 --model tts-1-hd

Options

-h, --help          Show help message and exit
-t, --text TEXT     Text to convert to speech (required)
-v, --voice VOICE   Voice model to use (default: onyx)
-s, --speed SPEED   Speech speed (default: 1.0)
-o, --output FILE   Output file path (default: auto-generated)
-a, --api_key KEY   OpenAI API key
-m, --model MODEL   TTS model to use (default: tts-1)
-p, --player PLAYER Audio player to use: auto, ffmpeg, or mplayer (default: auto)
-V, --verbose       Enable verbose logging
-r, --retries N     Number of retry attempts for API calls (default: 3)
-T, --timeout N     Timeout in seconds for API calls (default: 30)
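
If you call the script from your own automation, it can help to check values before spending an API call. The sketch below validates a voice against the six documented names and a speed against the documented 0.25 to 4.0 range. Whether speech.sh performs the same checks internally is not stated here, and the function names are hypothetical.

```shell
# Succeed only for one of the documented voice names.
valid_voice() {
  case "$1" in
    onyx|alloy|echo|fable|nova|shimmer) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed only for a plain decimal number between 0.25 and 4.0 inclusive.
valid_speed() {
  # Reject empty input, non-numeric characters, and more than one dot.
  case "$1" in
    ''|*[!0-9.]*|*.*.*) return 1 ;;
  esac
  # The shell has no floating-point arithmetic, so compare with awk.
  awk -v s="$1" 'BEGIN { exit !(s >= 0.25 && s <= 4.0) }'
}

# Example:
#   valid_voice "$voice" && valid_speed "$speed" &&
#     ./speech.sh --text "Hello, world!" --voice "$voice" --speed "$speed"
```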

API Key Configuration

The script accepts an OpenAI API key in three ways (in order of precedence):

  1. Command-line argument: --api_key "your-api-key"
  2. Environment variable: export OPENAI_API_KEY="your-api-key"
  3. A file named API_KEY in the script's directory
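
The precedence order above can be sketched as a small lookup function. This is an illustration, not the script's actual code: `resolve_api_key` is a hypothetical name, and reading only the first line of the API_KEY file is an assumption.

```shell
# resolve_api_key CLI_VALUE SCRIPT_DIR
# Print the first key found, checking the sources in the documented order.
resolve_api_key() {
  local cli_value="$1" script_dir="$2"
  if [ -n "$cli_value" ]; then
    printf '%s\n' "$cli_value"          # 1. --api_key argument
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    printf '%s\n' "$OPENAI_API_KEY"     # 2. environment variable
  elif [ -r "$script_dir/API_KEY" ]; then
    head -n 1 "$script_dir/API_KEY"     # 3. API_KEY file (first line assumed)
  else
    echo "no API key found" >&2
    return 1
  fi
}
```

Keeping the command-line value first means a one-off key can override whatever is configured in the environment or on disk.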

Advanced Features

Auto-caching

The script caches audio files by default to avoid unnecessary API calls. If you request the same text with the same voice and speed, it will reuse the previously generated audio file.
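
One common way to implement this kind of cache is to derive the file name from a hash of the inputs. The sketch below assumes that approach; the real script's naming scheme, cache location, and any additional inputs to the key are not documented here, and `cache_path` is a hypothetical helper.

```shell
# cache_path TEXT VOICE SPEED [CACHE_DIR]
# Print a file path that is identical for identical text, voice, and speed.
cache_path() {
  local text="$1" voice="$2" speed="$3"
  local cache_dir="${4:-${TMPDIR:-/tmp}/speech-cache}"   # assumed location
  local digest
  if command -v sha256sum >/dev/null 2>&1; then
    digest=$(printf '%s\n%s\n%s' "$voice" "$speed" "$text" | sha256sum | cut -d' ' -f1)
  else
    # Fallback for systems that ship shasum instead of sha256sum.
    digest=$(printf '%s\n%s\n%s' "$voice" "$speed" "$text" | shasum -a 256 | cut -d' ' -f1)
  fi
  printf '%s/%s.mp3\n' "$cache_dir" "$digest"
}

# Example:
#   out=$(cache_path "Hello, world!" nova 1.0)
#   [ -f "$out" ] && echo "cache hit: $out"
```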

Retry Logic

The script retries failed API calls as follows:

  • Automatically retries failed API calls (default: 3 attempts)
  • Implements exponential backoff for reliability
  • Uses native curl retry mechanism when available
  • Configurable timeout and retry values
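
The exponential backoff described above can be sketched as a generic wrapper. This is an illustration under assumptions rather than the script's implementation: `retry_with_backoff` is a hypothetical name, and doubling the delay after each failure is one common backoff schedule.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the delay (in seconds) after each failed attempt.
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while true; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example: three attempts with delays of 1s and 2s between them.
#   retry_with_backoff 3 1 curl --fail --max-time 30 "$url"
```

curl also has built-in `--retry` and `--max-time` options, which is presumably what "native curl retry mechanism" refers to.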

Audio Player Options

You can choose your preferred audio player:

  • --player auto: Use ffmpeg if available, fall back to mplayer (default)
  • --player ffmpeg: Force using ffmpeg
  • --player mplayer: Force using mplayer
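
The selection rule can be sketched as follows. The function only decides which player to use; how speech.sh actually invokes the player is not shown here, and `choose_player` is a hypothetical name.

```shell
# choose_player [auto|ffmpeg|mplayer]
# Print the name of the player to use, or fail with a message on stderr.
choose_player() {
  local requested="${1:-auto}"
  case "$requested" in
    auto)
      # Prefer ffmpeg, fall back to mplayer.
      if command -v ffmpeg >/dev/null 2>&1; then
        echo ffmpeg
      elif command -v mplayer >/dev/null 2>&1; then
        echo mplayer
      else
        echo "no supported audio player found" >&2
        return 1
      fi
      ;;
    ffmpeg|mplayer)
      # A forced player must actually be installed.
      if command -v "$requested" >/dev/null 2>&1; then
        echo "$requested"
      else
        echo "$requested is not installed" >&2
        return 1
      fi
      ;;
    *)
      echo "unknown player: $requested" >&2
      return 1
      ;;
  esac
}
```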

MCP Integration

The mcp.sh script provides Model Context Protocol compatibility, allowing the text-to-speech functionality to be used by MCP-compatible AI assistants like Claude.

To use the MCP server:

bash
# Start the MCP server using the launch script
./launch

For detailed instructions on using the MCP integration, see MCP_README.md.

Security Considerations

The script takes several steps to ensure security:

  • Uses proper JSON handling with jq for parameter processing
  • Implements proper array-based parameter passing to prevent shell injection
  • Validates needed dependencies before execution
  • Uses error handling throughout the execution process
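
The first two points can be illustrated with a short sketch. It is not the script's actual code: `build_payload` and `request_speech` are hypothetical helpers, and the field names and endpoint are those of OpenAI's speech API. To stay runnable under plain sh, the sketch uses the positional-parameter list as its argument array; in bash, a named array expanded as `"${args[@]}"` behaves the same way.

```shell
# Build the JSON request body with jq, so quotes, backslashes, and shell
# metacharacters in the text end up as escaped JSON string content.
build_payload() {
  local text="$1" voice="$2" speed="$3" model="$4"
  jq -nc \
    --arg input "$text" \
    --arg voice "$voice" \
    --arg model "$model" \
    --argjson speed "$speed" \
    '{model: $model, input: $input, voice: $voice, speed: $speed}'
}

# Send the request. Every argument is one element of the argument list,
# so values containing spaces or metacharacters are never re-parsed
# by the shell (no eval, no unquoted expansion).
request_speech() {
  local payload="$1" output_file="$2"
  set -- --silent --show-error --fail --max-time 30 \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    --data "$payload" \
    --output "$output_file"
  curl "$@" https://api.openai.com/v1/audio/speech
}

# Example:
#   request_speech "$(build_payload 'Hello, world!' nova 1.0 tts-1)" hello.mp3
```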

Examples

Convert text to speech with default settings:

bash
./speech.sh --text "Hello, world!"

Use a different voice:

bash
./speech.sh --text "Hello, world!" --voice nova

Adjust the speech speed:

bash
./speech.sh --text "Hello, world!" --speed 1.5

Save to a specific file:

bash
./speech.sh --text "Hello, world!" --output hello.mp3

Use environment variable for API key:

bash
export OPENAI_API_KEY="your-api-key"
./speech.sh --text "Hello, world!"
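
Convert a text file line by line (a sketch built only on the --text and --output options shown above; `batch_speak`, the `SPEECH_CMD` override, and the line-N.mp3 naming are hypothetical, and whether the script also plays each file when --output is given is not covered here):

```shell
# batch_speak INPUT_FILE OUTPUT_DIR
# Convert each non-empty line of INPUT_FILE to its own audio file
# and print the number of lines converted.
batch_speak() {
  local input_file="$1" output_dir="$2"
  local speech_cmd="${SPEECH_CMD:-./speech.sh}"   # override is for testing
  local n=0 line
  mkdir -p "$output_dir" || return 1
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue                     # skip blank lines
    n=$((n + 1))
    # Redirect stdin so the command cannot consume the rest of the input file.
    "$speech_cmd" --text "$line" --output "$output_dir/line-$n.mp3" </dev/null || return 1
  done < "$input_file"
  echo "$n"
}

# Example:
#   batch_speak announcements.txt audio/
```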

Troubleshooting

If you encounter issues:

  1. Enable verbose logging with the --verbose flag
  2. Check that your OpenAI API key is valid
  3. Verify that all dependencies are installed
  4. Ensure you have internet connectivity
  5. Check the permissions of the output directory

Contributors

  • Jean-Christophe Hoelt
  • Claude AI (Anthropic)

License

GPL (GNU General Public License v3.0)

Repository Owner

j3k0 (User)

Repository Details

Language Shell
Default Branch main
Size 67 KB
Contributors 2
License GNU General Public License v3.0
MCP Verified Nov 12, 2025

Programming Languages

Shell 100%
