Minimax

By ropon GitHub

minimax text-to-speech

Overview

What is MiniMax?

MiniMax is an official Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech and video/image generation APIs. It allows clients to generate speech, clone voices, and create videos and images.

How to use MiniMax?

To use MiniMax, obtain an API key from the MiniMax platform, install the required packages, and configure your MCP client (like Claude Desktop or Cursor) to connect to the MiniMax server.

Key features of MiniMax

Text to audio conversion with various voice options.
Voice cloning using provided audio files.
Video generation from prompts.
Image generation from prompts.

Use cases of MiniMax

Broadcasting segments of news with generated audio.
Cloning voices for personalized audio content.
Generating videos for educational or entertainment purposes.
Creating images based on textual descriptions.

FAQ

Can MiniMax generate audio in multiple languages?

Yes! MiniMax supports multiple languages for text-to-speech conversion.

Is there a cost associated with using MiniMax?

Yes, using certain features may incur costs based on usage.

How do I troubleshoot API key errors?

Ensure that your API key matches the host you are connecting to and check for any typos.

Content


The official MiniMax Model Context Protocol (MCP) server enables interaction with powerful Text to Speech and video/image generation APIs. It allows MCP clients such as Claude Desktop, Cursor, Windsurf, and OpenAI Agents to generate speech, clone voices, generate videos and images, and more.

Documentation

Chinese documentation (中文文档)
MiniMax-MCP-JS - Official JavaScript implementation of MiniMax MCP

Quickstart with MCP Client

Get your API key from MiniMax.
Install uv (a Python package manager) with curl -LsSf https://astral.sh/uv/install.sh | sh, or see the uv repo for additional install methods.

Claude Desktop

Go to Claude > Settings > Developer > Edit Config and edit claude_desktop_config.json to include the following:

{
  "mcpServers": {
    "MiniMax": {
      "command": "uvx",
      "args": [
        "minimax-mcp"
      ],
      "env": {
        "MINIMAX_API_KEY": "<insert-your-api-key-here>",
        "MINIMAX_MCP_BASE_PATH": "<local-output-dir-path>",
        "MINIMAX_API_HOST": "https://api.minimaxi.chat",
        "MINIMAX_API_RESOURCE_MODE": "<optional, [url|local], url is default, audio/image/video are downloaded locally or provided in URL format>"
      }
    }
  }
}
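If the config file already lists other servers, the MiniMax entry can be merged in rather than pasted over them. The following is a minimal sketch of that idea, not part of the MiniMax project: the function names `build_minimax_entry` and `merge_into_config` are hypothetical, and the path to claude_desktop_config.json is left to the caller because it differs by platform.

```python
import json
from pathlib import Path


def build_minimax_entry(api_key: str, base_path: str, api_host: str,
                        resource_mode: str = "url") -> dict:
    """Return a "MiniMax" server entry in the shape shown above."""
    # The config above documents only two resource modes: url (default) and local.
    if resource_mode not in ("url", "local"):
        raise ValueError("resource_mode must be 'url' or 'local'")
    return {
        "command": "uvx",
        "args": ["minimax-mcp"],
        "env": {
            "MINIMAX_API_KEY": api_key,
            "MINIMAX_MCP_BASE_PATH": base_path,
            "MINIMAX_API_HOST": api_host,
            "MINIMAX_API_RESOURCE_MODE": resource_mode,
        },
    }


def merge_into_config(config_path: Path, entry: dict) -> dict:
    """Add or replace the "MiniMax" entry, keeping any other configured servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})["MiniMax"] = entry
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Calling `merge_into_config(Path("claude_desktop_config.json"), build_minimax_entry(...))` rewrites the file with the new entry; back up the original first if it contains settings you care about.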

⚠️ Warning: The API key needs to match the host. If the error "API Error: invalid api key" occurs, check which API host you are using:

Global Host: https://api.minimaxi.chat (note the extra "i")
Mainland Host: https://api.minimax.chat
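Because the two hosts differ by a single character, a quick check of the configured value can catch a mismatch before it surfaces as an invalid-key error. This is an illustrative sketch only: `KNOWN_HOSTS` and `describe_host` are hypothetical names, and the check cannot tell which host a given key was actually issued for.

```python
# The two hosts listed above; the region labels are illustrative.
KNOWN_HOSTS = {
    "global": "https://api.minimaxi.chat",   # note the extra "i"
    "mainland": "https://api.minimax.chat",
}


def describe_host(api_host: str) -> str:
    """Return "global", "mainland", or "unknown" for a MINIMAX_API_HOST value."""
    # Ignore surrounding whitespace and a trailing slash before comparing.
    normalized = api_host.strip().rstrip("/")
    for region, host in KNOWN_HOSTS.items():
        if normalized == host:
            return region
    return "unknown"
```

If the function returns "unknown", the host value probably contains a typo; if it returns the region you did not expect, switch to the other host.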

If you're using Windows, you will have to enable "Developer Mode" in Claude Desktop to use the MCP server. Click "Help" in the hamburger menu in the top left and select "Enable Developer Mode".

Cursor

Go to Cursor -> Preferences -> Cursor Settings -> MCP -> Add new global MCP Server and add the config above.

That's it. Your MCP client can now interact with MiniMax through the tools listed under Available Tools.

Transport

We support two transport types: stdio and sse.

stdio	SSE
Run locally	Can be deployed locally or in the cloud
Communication through `stdout`	Communication through `network`
Input: Supports processing `local files` or valid `URL` resources	Input: When deployed in the cloud, it is recommended to use `URL` for input
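The input guidance in the table can be expressed as a small pre-flight check. The sketch below is an assumption-laden illustration, not server behavior: `classify_input` and `input_warning` are hypothetical helpers, and treating only http(s) strings as URLs is a simplification.

```python
from typing import Optional
from urllib.parse import urlparse


def classify_input(resource: str) -> str:
    """Return "url" for http(s) URLs and "local" for anything else."""
    parsed = urlparse(resource)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "local"


def input_warning(resource: str, transport: str,
                  in_cloud: bool = False) -> Optional[str]:
    """Return a warning when a local file is passed to a cloud SSE deployment."""
    if transport not in ("stdio", "sse"):
        raise ValueError("transport must be 'stdio' or 'sse'")
    if transport == "sse" and in_cloud and classify_input(resource) == "local":
        return "Cloud SSE deployments should receive URL inputs, not local paths."
    return None
```

With stdio, both local paths and URLs pass without a warning, which matches the table's left column.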

Available Tools

tool	description
`text_to_audio`	Convert text to audio with a given voice
`list_voices`	List all voices available
`voice_clone`	Clone a voice using provided audio files
`generate_video`	Generate a video from a prompt
`text_to_image`	Generate an image from a prompt
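MCP clients invoke these tools by sending a JSON-RPC `tools/call` request, which the client normally builds for you. As a rough sketch of what such a message looks like, the helper below (a hypothetical `tools_call_request`) assembles one; the `text` argument in the usage note is an assumption, and the actual parameter names should be taken from the schema the server returns via `tools/list`.

```python
import json
from itertools import count

# Tool names taken from the table above.
TOOLS = {"text_to_audio", "list_voices", "voice_clone",
         "generate_video", "text_to_image"}

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def tools_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for one of the listed tools."""
    if tool_name not in TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)
```

For example, `tools_call_request("text_to_audio", {"text": "Good evening."})` produces a request string that a client would write to the server over the chosen transport.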

Example usage

⚠️ Warning: Using these tools may incur costs.

1. Broadcast a segment of the evening news

2. Clone a voice

3. Generate a video

4. Generate images

