LLM Evaluation Platforms - AI Tools

  • PromptsLabs
    PromptsLabs A Library of Prompts for Testing LLMs

    PromptsLabs is a community-driven platform providing copy-paste prompts for testing the performance of new LLMs. Explore and contribute to a growing collection of prompts (a minimal test harness along these lines is sketched after this list).

    • Free
  • Laminar
    Laminar The AI engineering platform for LLM products

    Laminar is an open-source platform that enables developers to trace, evaluate, label, and analyze Large Language Model (LLM) applications with minimal code integration (the tracing sketch after this list shows the general pattern).

    • Freemium
    • From $25
  • Autoblocks
    Autoblocks Improve your LLM Product Accuracy with Expert-Driven Testing & Evaluation

    Autoblocks is a collaborative testing and evaluation platform for LLM-based products that automatically improves through user and expert feedback, offering comprehensive tools for monitoring, debugging, and quality assurance.

    • Freemium
    • From $1,750
  • Agenta
    Agenta End-to-End LLM Engineering Platform

    Agenta is an LLM engineering platform offering tools for prompt engineering, versioning, evaluation, and observability in a single, collaborative environment.

    • Freemium
    • From $49
  • LangWatch
    LangWatch Monitor, Evaluate & Optimize your LLM performance with 1-click

    LangWatch helps AI teams ship faster by building quality assurance into every step of development. It provides tools to measure, optimize, and collaborate on LLM performance.

    • Paid
    • From $59
  • Humanloop
    Humanloop The LLM evals platform for enterprises to ship and scale AI with confidence

    Humanloop is an enterprise-grade platform that provides tools for LLM evaluation, prompt management, and AI observability, enabling teams to develop, evaluate, and deploy trustworthy AI applications.

    • Freemium
  • Libretto
    Libretto LLM Monitoring, Testing, and Optimization

    Libretto offers comprehensive LLM monitoring, automated prompt testing, and optimization tools to ensure the reliability and performance of your AI applications.

    • Freemium
    • From $180
  • GPT-LLM Playground
    GPT-LLM Playground Your Comprehensive Testing Environment for Large Language Models

    GPT-LLM Playground is a macOS application designed for advanced experimentation and testing with Large Language Models (LLMs). It offers features like multi-model support, versioning, and custom endpoints.

    • Free
  • Langfuse
    Langfuse Open Source LLM Engineering Platform

    Langfuse provides an open-source platform for tracing, evaluating, and managing prompts to debug and improve LLM applications (see the scoring sketch after this list).

    • Freemium
    • From $59
  • EleutherAI
    EleutherAI Empowering Open-Source Artificial Intelligence Research

    EleutherAI is a research institute focused on advancing and democratizing open-source AI, particularly in language modeling, interpretability, and alignment. They train, release, and evaluate powerful open-source LLMs.

    • Free
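
The sketches below illustrate, in plain Python, the patterns these tools implement. All names and APIs in them are illustrative stand-ins, not any vendor's actual SDK.

First, the PromptsLabs-style workflow: run a small library of copy-paste test prompts against a model and check each reply against an expected answer. A minimal harness, assuming only a callable that maps a prompt string to a reply string:

    # Hypothetical harness: run a library of test prompts against a model
    # and check each response for an expected substring. call_model is a
    # stand-in for whatever client you actually use.
    from typing import Callable

    TEST_PROMPTS = [
        {"prompt": "What is 17 * 24? Answer with the number only.", "expect": "408"},
        {"prompt": "Spell 'strawberry' backwards.", "expect": "yrrebwarts"},
    ]

    def run_suite(call_model: Callable[[str], str]) -> None:
        passed = 0
        for case in TEST_PROMPTS:
            reply = call_model(case["prompt"])
            ok = case["expect"].lower() in reply.lower()
            passed += ok
            print(f"{'PASS' if ok else 'FAIL'}: {case['prompt'][:40]!r}")
        print(f"{passed}/{len(TEST_PROMPTS)} prompts passed")

    if __name__ == "__main__":
        # Echo stub so the script runs without any API key.
        run_suite(lambda p: "408" if "17" in p else "yrrebwarts")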
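Next, the "minimal code integration" tracing advertised by platforms such as Laminar and Langfuse typically takes the form of a decorator that records a function's inputs, outputs, and latency as a span. A minimal sketch, with an in-memory log standing in for the platform backend:

    # Hypothetical tracing decorator; real SDKs ship decorators like this,
    # but the names here are illustrative, not a vendor's actual API.
    import functools
    import time

    TRACE_LOG = []  # a real SDK would ship these spans to a backend

    def observe(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACE_LOG.append({
                "name": fn.__name__,
                "input": {"args": args, "kwargs": kwargs},
                "output": result,
                "latency_s": round(time.perf_counter() - start, 4),
            })
            return result
        return wrapper

    @observe
    def generate(prompt: str) -> str:
        return f"echo: {prompt}"  # stand-in for a model call

    generate("Summarize this ticket.")
    print(TRACE_LOG)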
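Finally, the "evaluate and label" features these platforms offer boil down to attaching named scores (an LLM-as-judge rating, a user thumbs-up) to traces and aggregating them per prompt version. A toy sketch of that bookkeeping, with all identifiers invented for illustration:

    # Hypothetical score bookkeeping: each trace gets numeric scores that
    # can be aggregated per prompt version to compare iterations.
    from collections import defaultdict
    from statistics import mean

    scores = defaultdict(list)  # prompt_version -> list of (name, value)

    def record_score(prompt_version: str, name: str, value: float) -> None:
        scores[prompt_version].append((name, value))

    record_score("summarize-v1", "relevance", 0.6)
    record_score("summarize-v2", "relevance", 0.9)
    record_score("summarize-v2", "relevance", 0.8)

    for version, vals in scores.items():
        avg = mean(v for _, v in vals)
        print(f"{version}: avg relevance {avg:.2f} over {len(vals)} traces")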