<h1 align="center">Supercharge Your LLM Application Evaluations 🚀</h1>
Documentation | Quick start | Join Discord | Blog | Newsletter | Careers
Objective metrics, intelligent test generation, and data-driven insights for LLM apps
Ragas is your ultimate toolkit for evaluating and optimizing Large Language Model (LLM) applications. Say goodbye to time-consuming, subjective assessments and hello to data-driven, efficient evaluation workflows. Don't have a test dataset ready? We also do production-aligned test set generation.
Install from PyPI:

```bash
pip install ragas
```
Alternatively, from source:

```bash
pip install git+https://github.com/vibrantlabsai/ragas
```
The fastest way to get started is to use the `ragas quickstart` command:

```bash
# List available templates
ragas quickstart

# Create a RAG evaluation project
ragas quickstart rag_eval

# Specify where you want to create it
ragas quickstart rag_eval -o ./my-project
```
Available templates:
- `rag_eval` - Evaluate RAG systems

Coming soon:

- `agent_evals` - Evaluate AI agents
- `benchmark_llm` - Benchmark and compare LLMs
- `prompt_evals` - Evaluate prompt variations
- `workflow_eval` - Evaluate complex workflows

Ragas comes with pre-built metrics for common evaluation tasks. For example, Aspect Critique evaluates any aspect of your output using `DiscreteMetric`:
```python
import asyncio

from openai import AsyncOpenAI

from ragas.llms import llm_factory
from ragas.metrics import DiscreteMetric

# Setup your LLM
client = AsyncOpenAI()
llm = llm_factory("gpt-4o", client=client)

# Create a custom aspect evaluator
metric = DiscreteMetric(
    name="summary_accuracy",
    allowed_values=["accurate", "inaccurate"],
    prompt="""Evaluate if the summary is accurate and captures key information.

Response: {response}

Answer with only 'accurate' or 'inaccurate'.""",
)

# Score your application's output
async def main():
    score = await metric.ascore(
        llm=llm,
        response="The summary of the text is...",
    )
    print(f"Score: {score.value}")  # 'accurate' or 'inaccurate'
    print(f"Reason: {score.reason}")

if __name__ == "__main__":
    asyncio.run(main())
```
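Constraining the judge to a fixed set of `allowed_values` is a deliberate design choice: every sample maps to one of the listed labels, so scores can be counted, compared, and tracked across runs instead of being parsed out of free-form LLM prose.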
Note: Make sure your `OPENAI_API_KEY` environment variable is set.
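For example, in your shell (the key below is a placeholder, not a real value):

```bash
export OPENAI_API_KEY="sk-..."
```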
Find the complete Quickstart Guide in the documentation.
Over the past two years, we have seen and helped improve many AI applications using evals. If you want help improving and scaling up your AI application with evals, reach out:

📅 Book a slot or drop us a line: [email protected].
If you want to get more involved with Ragas, check out our Discord server. It's a fun community where we geek out about LLMs, retrieval, production issues, and more.
```
+------------------------------------------------------------------+
| Developers: Those who build with `ragas`.                        |
| (You have `import ragas` somewhere in your project)              |
|                                                                  |
|   +----------------------------------------------------------+   |
|   | Contributors: Those who make `ragas` better.             |   |
|   | (You make PRs to this repo)                              |   |
|   +----------------------------------------------------------+   |
+------------------------------------------------------------------+
```
We welcome contributions from the community! Whether it's bug fixes, feature additions, or documentation improvements, your input is valuable.
At Ragas, we believe in transparency. We collect minimal, anonymized usage data to improve our product and guide our development efforts.
- ✅ No personal or company-identifying information
- ✅ Open-source data collection code
- ✅ Publicly available aggregated data
To opt out, set the `RAGAS_DO_NOT_TRACK` environment variable to `true`.
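For example, in your shell:

```bash
export RAGAS_DO_NOT_TRACK=true
```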
```bibtex
@misc{ragas2024,
  author = {VibrantLabs},
  title = {Ragas: Supercharge Your LLM Application Evaluations},
  year = {2024},
  howpublished = {\url{https://github.com/vibrantlabsai/ragas}},
}
```