LLM Development Tools

BenchLLM is a tool for evaluating LLM-powered applications. It allows users to build test suites, generate quality reports, and choose between automated, interactive, or custom evaluation strategies.
- Other

PromptsLabs is a community-driven platform providing copy-paste prompts to test the performance of new LLMs. Explore and contribute to a growing collection of prompts.
- Free

Laminar is an open-source platform that enables developers to trace, evaluate, label, and analyze Large Language Model (LLM) applications with minimal code integration.
- Freemium
- From 25$

LM Studio is a desktop application that allows users to run Large Language Models (LLMs) locally and offline, supporting various architectures including Llama, Mistral, Phi, Gemma, DeepSeek, and Qwen 2.5.
- Free
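
LM Studio can also expose loaded models through a local OpenAI-compatible server, so applications can call them with the standard OpenAI client. A minimal sketch, assuming the server is running on its default address and a model is already loaded (the model name below is a placeholder):

```python
# Querying a model served by LM Studio's local OpenAI-compatible server.
# Assumes the local server is enabled on its default address
# (http://localhost:1234/v1) and that a model is loaded in LM Studio.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's default local endpoint
    api_key="lm-studio",                  # any non-empty string works locally
)

response = client.chat.completions.create(
    model="qwen2.5-7b-instruct",  # placeholder: use whichever model you loaded
    messages=[{"role": "user", "content": "Summarize what LM Studio does."}],
)
print(response.choices[0].message.content)
```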

LMQL is a programming language designed for large language models, offering robust and modular prompting with types, templates, and constraints.
- Free

Flowise is an open-source low-code platform that enables developers to build customized LLM orchestration flows and AI agents through a drag-and-drop interface.
- Freemium
- From 35$

GPT-LLM Playground is a macOS application designed for advanced experimentation and testing with Large Language Models (LLMs). It offers features like multi-model support, versioning, and custom endpoints.
- Free

Inductor enables developers to rapidly prototype, evaluate, and improve LLM applications, ensuring high-quality app delivery.
- Freemium

LM Studio is a user-friendly desktop application that allows users to run various large language models (LLMs) locally and offline, including Llama 2, Phi-3, Falcon, Mistral, StarCoder, and Gemma models from Hugging Face.
- Free

PromptMage is a Python framework that streamlines the development of complex, multi-step applications powered by Large Language Models (LLMs), offering version control, testing capabilities, and automated API generation.
- Other

ModelBench enables teams to rapidly deploy AI solutions with no-code LLM evaluations. It allows users to compare over 180 models, design and benchmark prompts, and trace LLM runs, accelerating AI development.
- Free Trial
- From 49$

Gentrace is an LLM evaluation platform designed for AI teams to test and automate evaluations of generative AI products and agents. It facilitates collaborative development and ensures high-quality LLM applications.
- Usage Based

Libretto offers comprehensive LLM monitoring, automated prompt testing, and optimization tools to ensure the reliability and performance of your AI applications.
- Freemium
- From 180$

Rig is a Rust-based framework for building modular and scalable LLM applications. It offers a unified LLM interface, Rust-powered performance, and advanced AI workflow abstractions.
- Free

Weights & Biases (W&B) Weave is a comprehensive framework designed for tracking, experimenting with, evaluating, deploying, and enhancing LLM-based applications.
- Other
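
A minimal sketch of how an application function might be traced with Weave, assuming the `weave` Python package and a configured W&B account; the project name is a placeholder:

```python
# Tracing a function with W&B Weave: calls to a decorated function are
# logged with their inputs, outputs, and latency under the named project.
import weave

weave.init("my-llm-app")  # placeholder project name

@weave.op()  # calls to this function are recorded as traces in Weave
def summarize(text: str) -> str:
    # In a real app this would call an LLM; kept local for illustration.
    return text[:100]

summarize("Weave records the inputs and outputs of this call.")
```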

Langfuse provides an open-source platform for tracing, evaluating, and managing prompts to debug and improve LLM applications.
- Freemium
- From 59$
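
A minimal sketch of tracing an LLM call with Langfuse's Python SDK, assuming the Langfuse API keys are set in the environment; note that in older (v2) versions of the SDK the decorator is imported from `langfuse.decorators` rather than the top-level package:

```python
# Recording a function call as a Langfuse trace via the @observe decorator.
# Assumes LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are configured.
from langfuse import observe

@observe()  # this call appears as a trace in the Langfuse UI
def answer(question: str) -> str:
    # Placeholder for an actual LLM call; nested functions decorated with
    # @observe() would show up as child spans of this trace.
    return f"Stub answer to: {question}"

answer("How do I debug a slow prompt?")
```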

Aider is a command-line tool that enables pair programming with LLMs to edit code in your local git repository. It supports various LLMs and offers top-tier performance on software engineering benchmarks.
- Free

GeneratorLLMs is a tool that creates standardized `llms.txt` files by extracting core website content. This improves how Large Language Models (LLMs) understand websites and enhances AI visibility.
- Free

LM-Kit provides .NET developers with tools for AI agent customization, creation, and orchestration. It enables the integration of multimodal generative AI systems into C# and VB.NET applications.
- Paid
- From 1000$

Langtail is a comprehensive testing platform that enables teams to test and debug LLM-powered applications with a spreadsheet-like interface, offering security features and integration with major LLM providers.
- Freemium
- From 99$

Agenta is an LLM engineering platform offering tools for prompt engineering, versioning, evaluation, and observability in a single, collaborative environment.
- Freemium
- From 49$

promptfoo is an open-source LLM testing tool designed to help developers secure and evaluate their language model applications, offering features like vulnerability scanning and continuous monitoring.
- Freemium

A token counting tool that helps users manage token limits across multiple language models, including GPT-4, Claude 3, and Llama 3, with client-side processing that keeps text on the user's device.
- Free
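
As a generic illustration of the underlying idea (not this tool's implementation), token counts for OpenAI models can be computed entirely client-side with the `tiktoken` library:

```python
# Counting tokens locally with tiktoken so prompt text never leaves the
# machine. Claude and Llama use different tokenizers, so these counts
# apply only to OpenAI models.
import tiktoken

def count_tokens(text: str, model: str = "gpt-4") -> int:
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

print(count_tokens("How many tokens does this prompt use?"))
```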

Keywords AI is a comprehensive developer platform for LLM applications, offering monitoring, debugging, and deployment tools. It serves as a Datadog-like solution specifically designed for LLM applications.
- Freemium
- From 7$

LiteLLM provides a simplified and standardized way to interact with over 100 large language models (LLMs) using a consistent OpenAI-compatible input/output format.
- Free
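
A minimal sketch of LiteLLM's unified interface, assuming the relevant provider API keys (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY) are set in the environment; the same call shape works across providers, selected by the model string:

```python
# LiteLLM routes the same OpenAI-style request to different providers
# based on the model identifier, and returns OpenAI-shaped responses.
from litellm import completion

messages = [{"role": "user", "content": "Explain what LiteLLM does in one sentence."}]

openai_resp = completion(model="gpt-4o-mini", messages=messages)
claude_resp = completion(model="anthropic/claude-3-5-sonnet-20240620", messages=messages)

# Both responses follow the OpenAI response format regardless of provider.
print(openai_resp.choices[0].message.content)
print(claude_resp.choices[0].message.content)
```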

Missing Studio is an open-source AI platform designed for developers to build and deploy generative AI applications. It offers tools for managing LLMs, optimizing performance, and ensuring reliability.
- Free

Phoenix accelerates AI development with powerful insights, allowing seamless evaluation, experimentation, and optimization of AI applications in real time.
- Freemium
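
Assuming this entry refers to Arize Phoenix (the open-source `arize-phoenix` package), a minimal sketch of launching the local Phoenix UI for inspecting traces:

```python
# Starting a local Phoenix instance. Traces sent from an instrumented LLM
# application then appear in the UI for evaluation and debugging.
import phoenix as px

session = px.launch_app()  # starts the local Phoenix server and UI
print(session.url)         # open this URL in a browser to explore traces
```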