MLX Whisper MCP Server

By kachiO

Local MCP server for MLX Whisper transcription

Overview

What is MLX Whisper MCP Server?

MLX Whisper MCP Server is a local Model Context Protocol (MCP) server designed for audio transcription using the MLX Whisper model on Apple Silicon Macs.

How to use MLX Whisper MCP Server?

To use the server, run uv run mlx_whisper_mcp.py in your terminal. Ensure you have Python 3.12 or higher and uv installed; the script manages its remaining dependencies automatically.

What are the key features of MLX Whisper MCP Server?

  • Transcribes audio files directly from disk.
  • Supports transcription from base64-encoded audio data.
  • Utilizes the high-quality mlx-community/whisper-large-v3-turbo model.
  • Automatic dependency management with uv.
  • Provides rich console output for debugging.

What are the use cases of MLX Whisper MCP Server?

  1. Transcribing meetings or lectures recorded as audio files.
  2. Translating audio recordings from one language to another.
  3. Integrating with applications like Claude Desktop for enhanced audio processing capabilities.

FAQ about MLX Whisper MCP Server

  • What are the system requirements?

You need an Apple Silicon Mac and Python 3.12 or higher.

  • Can I use it on non-Apple Silicon Macs?

No, this server is specifically designed for Apple Silicon Macs.

  • How do I troubleshoot import errors?

Ensure you are running on an Apple Silicon Mac and that all dependencies are correctly installed.

MLX Whisper MCP Server

A simple Model Context Protocol (MCP) server that provides audio transcription capabilities using MLX Whisper on Apple Silicon Macs.

Features

  • Transcribe audio files directly from disk
  • Transcribe audio from base64-encoded data
  • Download and transcribe YouTube videos
  • Uses the high-quality mlx-community/whisper-large-v3-turbo model
  • Self-contained script with automatic dependency management via uv run
  • Rich console output for easy debugging
  • Saves transcription text files alongside audio files

Requirements

  • Python 3.12 or higher
  • Apple Silicon Mac (M-series)
  • uv installed (pip install uv or curl -sS https://astral.sh/uv/install.sh | bash)

Quick Start

Run directly with uv run:

uv run mlx_whisper_mcp.py

That's it! The script will automatically install its own dependencies and start the MCP server.
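
The automatic dependency handling relies on uv's support for inline script metadata (PEP 723): the script declares its requirements in a comment header that uv run reads before execution. A sketch of what such a header looks like; the exact dependency list below is an assumption, not copied from the actual script:

# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "mcp",          # MCP Python SDK
#     "mlx-whisper",  # MLX Whisper for Apple Silicon
#     "yt-dlp",       # assumed YouTube downloader
#     "rich",         # console output
# ]
# ///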

Using with Claude Desktop

  1. Edit your Claude Desktop configuration file:
# On macOS:
code ~/Library/Application\ Support/Claude/claude_desktop_config.json

# On Windows:
code %APPDATA%\Claude\claude_desktop_config.json
  2. Add the MLX Whisper MCP server configuration:
{
  "mcpServers": {
    "mlx-whisper": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/mlx_whisper_mcp/",
        "run",
        "mlx_whisper_mcp.py"
      ]
    }
  }
}
  3. Restart Claude Desktop

Available Tools

The server provides the following tools:

1. transcribe_file

Transcribes an audio file from a path on disk.

Parameters:

  • file_path: Path to the audio file
  • language: (Optional) Language code to force a specific language
  • task: "transcribe" or "translate" (translates to English)
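
A minimal sketch of how a tool like this can be registered with the MCP Python SDK's FastMCP interface and wired to mlx_whisper.transcribe. The function body is illustrative; the actual script may structure things differently:

from mcp.server.fastmcp import FastMCP

import mlx_whisper

MODEL = "mlx-community/whisper-large-v3-turbo"
mcp = FastMCP("mlx-whisper")

@mcp.tool()
def transcribe_file(file_path: str, language: str | None = None,
                    task: str = "transcribe") -> str:
    """Transcribe an audio file from disk using MLX Whisper."""
    result = mlx_whisper.transcribe(
        file_path,
        path_or_hf_repo=MODEL,  # model listed in the Features section
        language=language,      # None lets Whisper auto-detect the language
        task=task,              # "transcribe" or "translate"
    )
    return result["text"]

if __name__ == "__main__":
    mcp.run()  # serve over stdio for clients such as Claude Desktop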

2. transcribe_audio

Transcribes audio from base64-encoded data.

Parameters:

  • audio_data: Base64-encoded audio data
  • language: (Optional) Language code to force a specific language
  • file_format: Audio file format (wav, mp3, etc.)
  • task: "transcribe" or "translate" (translates to English)
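
One plausible implementation of the base64 path, assuming the data is written to a temporary file that is removed after transcription (names and details are illustrative):

import base64
import os
import tempfile

import mlx_whisper

MODEL = "mlx-community/whisper-large-v3-turbo"

def transcribe_audio(audio_data: str, language: str | None = None,
                     file_format: str = "wav", task: str = "transcribe") -> str:
    """Decode base64 audio to a temporary file, transcribe it, then clean up."""
    raw = base64.b64decode(audio_data)
    with tempfile.NamedTemporaryFile(suffix=f".{file_format}", delete=False) as tmp:
        tmp.write(raw)
        tmp_path = tmp.name
    try:
        result = mlx_whisper.transcribe(tmp_path, path_or_hf_repo=MODEL,
                                        language=language, task=task)
        return result["text"]
    finally:
        os.unlink(tmp_path)  # temporary file is cleaned up (see How It Works, step 7)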

3. download_youtube

Downloads a YouTube video.

Parameters:

  • url: YouTube video URL
  • keep_file: If True, keeps the downloaded file (default: True)
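
The README does not say which downloader it uses; a sketch assuming yt-dlp, writing into the ~/.mlx-whisper-mcp/downloads directory mentioned under How It Works, might look like this:

from pathlib import Path

import yt_dlp

DOWNLOAD_DIR = Path.home() / ".mlx-whisper-mcp" / "downloads"

def download_youtube(url: str) -> str:
    """Download the audio track of a YouTube video and return its local path."""
    DOWNLOAD_DIR.mkdir(parents=True, exist_ok=True)
    opts = {
        "format": "bestaudio/best",                       # audio is enough for transcription
        "outtmpl": str(DOWNLOAD_DIR / "%(id)s.%(ext)s"),  # e.g. downloads/VIDEOID.webm
        "quiet": True,
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)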

4. transcribe_youtube

Downloads and transcribes a YouTube video.

Parameters:

  • url: YouTube video URL
  • language: (Optional) Language code to force a specific language
  • task: "transcribe" or "translate" (translates to English)
  • keep_file: If True, keeps the downloaded file (default: True)
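
Tool 4 is essentially a composition of downloading (tool 3) and transcribing; a sketch that reuses the hypothetical download_youtube helper above:

import os

import mlx_whisper

MODEL = "mlx-community/whisper-large-v3-turbo"

def transcribe_youtube(url: str, language: str | None = None,
                       task: str = "transcribe", keep_file: bool = True) -> str:
    """Download a YouTube video, transcribe its audio, optionally delete the file."""
    audio_path = download_youtube(url)  # hypothetical helper from the tool 3 sketch
    try:
        result = mlx_whisper.transcribe(audio_path, path_or_hf_repo=MODEL,
                                        language=language, task=task)
        return result["text"]
    finally:
        if not keep_file:
            os.unlink(audio_path)  # keep_file=True (the default) preserves the download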

Example Prompts for Claude Desktop

How It Works

This server uses the MCP Python SDK to expose MLX Whisper's transcription capabilities to clients like Claude. When a transcription is requested:

  1. The audio data is received (either as a file path, base64-encoded data, or YouTube URL)
  2. For YouTube URLs, the video is downloaded to ~/.mlx-whisper-mcp/downloads
  3. For base64 data, a temporary file is created
  4. MLX Whisper is used to perform the transcription
  5. The transcription text is saved to a .txt file alongside the audio file
  6. The transcription text is returned to the client
  7. Temporary files are cleaned up (unless keep_file=True)
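
Step 5, saving the transcript next to the source audio, can be as small as this illustrative helper:

from pathlib import Path

def save_transcript(audio_path: str, text: str) -> str:
    """Write the transcription next to the audio file, e.g. talk.mp3 -> talk.txt."""
    txt_path = Path(audio_path).with_suffix(".txt")
    txt_path.write_text(text, encoding="utf-8")
    return str(txt_path)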

Troubleshooting

  • Import Error: If you see an error about MLX Whisper not being found, make sure you're running on an Apple Silicon Mac
  • File Not Found: Make sure you're using absolute paths when referencing audio files
  • Memory Issues: Very long audio files may cause memory pressure with the large model
  • YouTube Download Errors: Some videos may be restricted or require authentication
  • JSON Errors: If you see "not valid JSON" errors in logs, make sure server logging output is properly directed to stderr
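
The last point matters because MCP's stdio transport reserves stdout for protocol JSON, so any console output has to go to stderr. With Rich (which the Features section mentions) that looks like:

from rich.console import Console

# Printing to stdout would corrupt the MCP JSON-RPC stream, so log to stderr instead.
console = Console(stderr=True)
console.log("Loading mlx-community/whisper-large-v3-turbo ...")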

License

Apache License 2.0. See LICENSE for details.

