joinly.ai

Enable AI agents to join and participate in your meetings.

Stars: 393 · Forks: 48 · Watchers: 393 · Issues: 4
joinly.ai is open-source middleware that allows AI agents to join, interact in, and execute tasks during live video calls across platforms like Zoom, Google Meet, and Microsoft Teams. Leveraging an MCP (Model Context Protocol) server, it provides essential meeting tools, modular TTS/STT integrations, and support for any LLM provider. It enables real-time conversational flow for natural interactions, and it supports privacy-focused, self-hosted deployment.

Key Features

Live AI agent interaction in meetings
Cross-platform support for Google Meet, Zoom, Microsoft Teams
Bring-your-own-LLM capability
Modular support for TTS/STT providers
Conversational flow with interruption handling
Real-time task execution by agents
Open-source and privacy-first design
MCP server to provide meeting tools and resources
Browser-based meeting participation
Self-hosted deployment option

Use Cases

Allowing AI agents to take meeting notes and transcripts
Automating actions and tasks during calls
Providing real-time information to participants
Facilitating multi-speaker conversations
Integrating with productivity tools like Notion
Assisting with real-time translation and summaries
Enabling automated issue tracking in project management tools
Supporting hands-free meeting assistance
Enhancing team collaboration with AI-driven insights
Testing AI conversational skills in live scenarios

README


joinly.ai is a connector middleware designed to enable AI agents to join and actively participate in video calls. Through its MCP server, joinly.ai provides essential meeting tools and resources that can equip any AI agent with the skills to perform tasks and interact with you in real time during your meetings.

Want to dive right in? Jump to the Quickstart! Want to know more? Visit our website!

[!IMPORTANT]
Don't want the hassle of setting everything up? Try our cloud first! ☁️🚀

:sparkles: Features

  • Live Interaction: Lets your agents execute tasks and respond in real time by voice or chat within your meetings
  • Conversational flow: Built-in logic that ensures natural conversations by handling interruptions and multi-speaker interactions
  • Cross-platform: Join Google Meet, Zoom, and Microsoft Teams (or any platform that runs in the browser)
  • Bring-your-own-LLM: Works with all LLM providers (also locally with Ollama)
  • Choose-your-preferred-TTS/STT: Modular design supports multiple services - Whisper/Deepgram for STT and Kokoro/ElevenLabs/Deepgram for TTS (and more to come...)
  • 100% open-source, self-hosted and privacy-first :rocket:

:video_camera: Demos

GitHub

GitHub Demo

In this demo video, joinly answers the question 'What is Joinly?' by accessing the latest news from the web. It then creates an issue in a GitHub demo repository.

Notion

Notion Demo

In this demo video, we connect joinly to our Notion via MCP and let it edit the content of a page live in the meeting.

Any ideas what we should build next? Write us! :rocket:

:zap: Quickstart

Run joinly via Docker with a basic conversational agent client.

[!IMPORTANT] Prerequisites: Docker installation

Create a new folder joinly or clone this repository (cloning is not required for the following steps). In this directory, create a new .env file with a valid API key for the LLM provider you want to use, e.g., OpenAI:

[!TIP] You can find the OpenAI API key here

Dotenv
# .env
# for OpenAI LLM
# change key and model to your desired one
JOINLY_LLM_MODEL=gpt-4o
JOINLY_LLM_PROVIDER=openai
OPENAI_API_KEY=your-openai-api-key

[!NOTE] See .env.example for complete configuration options including Anthropic (Claude) and Ollama setups. Replace the placeholder values with your actual API keys and adjust the model name as needed. Delete the placeholder values of the providers you don't use.
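For example, a local Ollama setup could look like the following sketch; the provider value and model name here are assumptions, so treat .env.example as the authoritative reference:

Dotenv
# .env
# for a local Ollama LLM (values are illustrative)
JOINLY_LLM_MODEL=llama3.1
JOINLY_LLM_PROVIDER=ollama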

Pull the Docker image (~2.3 GB, since it packages the browser and models):

bash
docker pull ghcr.io/joinly-ai/joinly:latest

Launch your meeting in Zoom, Google Meet, or Teams, and use the meeting link as <MeetingURL>. Then, run the following command from the folder where you created the .env file:

bash
docker run --env-file .env ghcr.io/joinly-ai/joinly:latest --client <MeetingURL>

:red_circle: Having trouble getting started? Let's figure it out together on our Discord!

:technologist: Run an external client

In Quickstart, we ran the Docker container directly as a client using --client. But we can also run it as a server and connect to it from outside the container, which allows us to connect additional MCP servers. Here, we run an external client using the joinly-client package and connect it to the joinly MCP server.

[!IMPORTANT] Prerequisites: do the Quickstart (except the last command), install uv, and open two terminals

Start the joinly server in the first terminal (note that we are not using --client here, and we forward port 8000):

bash
docker run -p 8000:8000 ghcr.io/joinly-ai/joinly:latest

While the server is running, start the example client implementation in the second terminal window to connect to it and join a meeting:

bash
uvx joinly-client --env-file .env <MeetingUrl>

Add MCP servers to the client

Add the tools of any MCP server to the agent by providing a JSON configuration. The configuration file can contain multiple entries under "mcpServers" which will all be available as tools in the meeting (see fastmcp client docs for config syntax):

json
{
    "mcpServers": {
        "localServer": {
            "command": "npx",
            "args": ["-y", "package@0.1.0"]
        },
        "remoteServer": {
            "url": "http://mcp.example.com",
            "auth": "oauth"
        }
    }
}

Add, for example, a Tavily config for web search.
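A minimal sketch of such a configuration, assuming the tavily-mcp npm package (the package name and the TAVILY_API_KEY variable come from Tavily, not from joinly itself):

json
{
    "mcpServers": {
        "tavily": {
            "command": "npx",
            "args": ["-y", "tavily-mcp"],
            "env": { "TAVILY_API_KEY": "your-tavily-api-key" }
        }
    }
}

Then run the client using the config file, here named config.json: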

bash
uvx joinly-client --env-file .env --mcp-config config.json <MeetingUrl>

:wrench: Configurations

Configurations can be given via environment variables and/or command line arguments. Here is a list of common configuration options, which can be used when starting the Docker container:

bash
docker run --env-file .env -p 8000:8000 ghcr.io/joinly-ai/joinly:latest <MyOptionArgs>

Alternatively, you can pass --name, --lang, and provider settings as command line arguments to joinly-client, which will override the server's settings:

bash
uvx joinly-client <MyOptionArgs> <MeetingUrl>

Basic Settings

In general, the Docker image provides an MCP server, which is started by default. But to quickly get started, we also include a client implementation that can be used via --client. Note that in this case no server is started, so no other client can connect.

bash
# Start directly as client; default is as server, to which an external client can connect
--client <MeetingUrl>

# Change participant name (default: joinly)
--name "AI Assistant"

# Change language of TTS/STT (default: en)
# Note, availability depends on the TTS/STT provider
--lang de

# Change host & port of the joinly MCP server
--host 0.0.0.0 --port 8000

Providers

Text-to-Speech

bash
# Kokoro (local) TTS (default)
--tts kokoro
--tts-arg voice=<VoiceName>  # optionally, set different voice

# ElevenLabs TTS, include ELEVENLABS_API_KEY in .env
--tts elevenlabs
--tts-arg voice_id=<VoiceID>  # optionally, set different voice

# Deepgram TTS, include DEEPGRAM_API_KEY in .env
--tts deepgram
--tts-arg model_name=<ModelName>  # optionally, set different model (voice)

Transcription

bash
# Whisper (local) STT (default)
--stt whisper
--stt-arg model_name=<ModelName>  # optionally, set different model (default: base), for GPU support see below

# Deepgram STT, include DEEPGRAM_API_KEY in .env
--stt deepgram
--stt-arg model_name=<ModelName>  # optionally, set different model

Debugging

bash
# Start browser with a VNC server for debugging;
# forward the port and connect to it using a VNC client
--vnc-server --vnc-server-port 5900

# Logging
-v  # or -vv, -vvv

# Help
--help

GPU Support

We provide a Docker image with CUDA GPU support for running the transcription and TTS models on a GPU. To use it, you need to have the NVIDIA Container Toolkit installed and CUDA >= 12.6. Then pull the CUDA-enabled image:

bash
docker pull ghcr.io/joinly-ai/joinly:latest-cuda

Run as client or server with the same commands as above, but use the joinly:{version}-cuda image and set --gpus all:

bash
# Run as server
docker run --gpus all --env-file .env -p 8000:8000 ghcr.io/joinly-ai/joinly:latest-cuda -v
# Run as client
docker run --gpus all --env-file .env ghcr.io/joinly-ai/joinly:latest-cuda -v --client <MeetingURL>

By default, the joinly image uses the Whisper model base for transcription, since it still runs reasonably fast on CPU. For CUDA, it automatically defaults to distil-large-v3 for significantly better transcription quality. You can change the model by setting --stt-arg model_name=<model_name> (e.g., --stt-arg model_name=large-v3). However, only the respective default models are packaged in the Docker image, so other models are downloaded when the container starts.
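For example, to start the CUDA image as a server with the large-v3 Whisper model (the weights are not packaged in the image, so they are downloaded on first start):

bash
docker run --gpus all --env-file .env -p 8000:8000 \
  ghcr.io/joinly-ai/joinly:latest-cuda --stt-arg model_name=large-v3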

:test_tube: Create your own agent

You can also write your own agent and connect it to our joinly MCP server. See the code examples for the joinly-client package or the client_example.py if you want a starting point that doesn't depend on our framework.

The joinly MCP server provides the following tools and resources:

Tools

  • join_meeting - Join meeting with URL, participant name, and optional passcode
  • leave_meeting - Leave the current meeting
  • speak_text - Speak text using TTS (requires text parameter)
  • send_chat_message - Send chat message (requires message parameter)
  • mute_yourself - Mute microphone
  • unmute_yourself - Unmute microphone
  • get_chat_history - Get current meeting chat history in JSON format
  • get_participants - Get current meeting participants in JSON format
  • get_transcript - Get current meeting transcript in JSON format, optionally filtered by minutes
  • get_video_snapshot - Get an image from the current meeting, e.g., view a current screenshare

Resources

  • transcript://live - Live meeting transcript in JSON format, including timestamps and speaker information. Subscribable for real-time updates when new utterances are added.
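As a starting point, here is a minimal sketch of a custom agent that connects to a locally running joinly server using the fastmcp Python client and calls a few of the tools above. The /mcp endpoint path and the tool argument names are assumptions; the joinly-client code examples are the authoritative reference.

python
# custom_agent.py - a sketch, assuming a joinly server started via
# `docker run -p 8000:8000 ghcr.io/joinly-ai/joinly:latest` and fastmcp 2.x.
# Endpoint path and argument names below are assumptions, not verified API.
import asyncio

from fastmcp import Client


async def main() -> None:
    async with Client("http://localhost:8000/mcp") as client:
        # join the meeting (argument names are illustrative)
        await client.call_tool(
            "join_meeting",
            {"meeting_url": "<MeetingUrl>", "participant_name": "AI Assistant"},
        )
        # greet participants via the configured TTS provider
        await client.call_tool("speak_text", {"text": "Hello, I just joined!"})
        # read the live transcript resource and hand it to your own agent logic
        transcript = await client.read_resource("transcript://live")
        print(transcript)
        await client.call_tool("leave_meeting", {})


asyncio.run(main())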

:building_construction: Developing joinly.ai

For development we recommend using the development container, which installs all necessary dependencies. To get started, install the DevContainer Extension for Visual Studio Code, open the repository and choose Reopen in Container.

The installation can take some time, since it downloads all packages as well as models for Whisper/Kokoro and the Chromium browser. At the end, it automatically invokes the download_assets.py script. If you see errors like Missing kokoro-v1.0.onnx, run this script manually using:

bash
uv run scripts/download_assets.py

We'd love to see what you are using it for or building with it. Showcase your work on our Discord!

:pencil2: Roadmap

Meeting

  • Meeting chat access
  • Camera in video call with status updates
  • Enable screen share during video conferences
  • Participant metadata and joining/leaving
  • Improve browser agent capabilities

Conversation

  • Speaker attribution for transcription
  • Improve client memory: reduce token usage, allow persistence across meetings
  • Improve End-of-Utterance/turn-taking detection
  • Human approval mechanism from inside the meeting

Integrations

  • Showcase how to add agents using the A2A protocol
  • Add more provider integrations (STT, TTS)
  • Integrate meeting platform SDKs
  • Add alternative open-source meeting provider
  • Add support for Speech2Speech models

:busts_in_silhouette: Contributing

Contributions are always welcome! Feel free to open issues for bugs or submit a feature request. We'll do our best to review all contributions promptly and help merge your changes.

Please check our Roadmap and don't hesitate to reach out to us!

:memo: License

This project is licensed under the MIT License ‒ see the LICENSE file for details.

:speech_balloon: Getting help

If you have questions or feedback, or if you would like to chat with the maintainers or other community members, join us on Discord or GitHub Discussions.


Repository Owner

joinly-ai (Organization)

Repository Details

Language: Python
Default Branch: main
Size: 1,617 KB
Contributors: 4
License: MIT License
MCP Verified: Nov 11, 2025

Programming Languages

Python: 98.89%
Dockerfile: 0.97%
Shell: 0.14%

Topics

agentic-ai ai-agent ai-tool conversational-ai llm mcp meeting-agent meeting-assistant meeting-notes productivity python transcription voice-ai


Related MCPs

Discover similar Model Context Protocol servers

  • Klavis

    One MCP server for AI agents to handle thousands of tools.

    Klavis provides an MCP (Model Context Protocol) server with over 100 prebuilt integrations for AI agents, enabling seamless connectivity with various tools and services. It offers both cloud-hosted and self-hosted deployment options and includes out-of-the-box OAuth support for secure authentication. Klavis is designed to act as an intelligent connector, streamlining workflow automation and enhancing agent capability through standardized context management.

    • 5,447
    • MCP
    • Klavis-AI/klavis
  • Linked API MCP

    Connect LinkedIn to AI assistants for automated engagement and research.

    Linked API MCP enables AI assistants like Claude, Cursor, and VS Code to interact with LinkedIn through a secure cloud browser. It allows automated searching for leads, profile analysis, messaging, and market research on LinkedIn. The solution is designed for sales, recruitment, and market research professionals who want to automate LinkedIn workflows. Integration is streamlined, offering a set of tools accessible to popular AI-powered platforms.

    • 23
    • MCP
    • Linked-API/linkedapi-mcp
  • deploy-mcp

    Universal Deployment Tracker for AI Assistants

    deploy-mcp offers a universal tool for tracking and monitoring deployments across various platforms within AI conversation environments. It eliminates the need for dashboard navigation by integrating deployment status and management directly into AI assistants via a standardized protocol. The tool supports multiple platforms such as Vercel, Netlify, and Cloudflare Pages, allowing for real-time monitoring and multi-platform configuration using secure tokens. Its streamlined setup enables users to check deployment status, view logs, and monitor progress through conversational commands.

    • 3
    • MCP
    • alexpota/deploy-mcp
  • MetaTrader MCP Server

    Let AI assistants trade for you using natural language.

    MetaTrader MCP Server is a bridge that connects AI assistants such as Claude and ChatGPT to the MetaTrader 5 trading platform via the Model Context Protocol (MCP). It enables users to perform trading actions on MetaTrader 5 through natural language instructions. The system supports real-time data access, full account management, and secure local credential handling, offering both MCP and REST API interfaces.

    • 120
    • MCP
    • ariadng/metatrader-mcp-server
  • Transcribe MCP

    Automate audio transcription with seamless AI assistant integration.

    Transcribe MCP enables fast and high-quality audio transcriptions by integrating directly with AI assistants such as Claude, Windsurf, and Cursor. It supports both local and cloud-based workflows, offering features like word-level timestamps, speaker separation, and multi-language support. Installation is streamlined via pre-built MCP Bundles and secure integration URLs. The system also provides convenient management of balance, transcription content, and collaboration options.

    • 5
    • MCP
    • transcribe-app/mcp-transcribe
  • GitMCP

    Instantly turn any GitHub repository into an AI-ready documentation hub.

    GitMCP is a free, open-source, remote Model Context Protocol (MCP) server that gives AI assistants real-time access to the latest documentation and code from any GitHub repository. It transforms any GitHub project into an accessible documentation hub, enabling AI tools to deliver accurate results, reduce hallucinations, and improve code correctness. Supporting both specific and generic server modes, it allows seamless integration into developer workflows with zero setup. GitMCP emphasizes privacy, flexibility, and up-to-date information retrieval.

    • 6,916
    • MCP
    • idosal/git-mcp