<h1 align="center">
<a href="https://prompts.chat">
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Ragas is an evaluation toolkit for Large Language Model (LLM) applications. It provides objective metrics for evaluating LLM applications, test data generation capabilities, and integrations with popular LLM frameworks.
The repository contains:
- The Ragas library (in the `src/ragas/` directory)
- Experimental features, available under `ragas.experimental`

Choose the appropriate installation based on your needs:
```bash
# RECOMMENDED: Minimal dev setup (79 packages - fast)
make install-minimal

# FULL: Complete dev environment (383 packages - comprehensive)
make install

# OR manual installation:
# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

# Minimal dev setup (uses [project.optional-dependencies].dev-minimal)
uv pip install -e ".[dev-minimal]"

# Full dev setup (uses [dependency-groups].dev)
uv sync --group dev
```
- `uv pip install` with optional dependencies for selective installation
- `uv sync` with dependency groups for comprehensive environment management
- The names `dev-minimal` and `dev` clearly distinguish the two approaches

The project uses a UV workspace configuration for managing multiple packages:
```bash
# Install
uv sync

# Install examples separately
uv sync --package ragas-examples

# Build specific workspace package
uv build --package ragas-examples
```
Workspace Members:
- `ragas` (main package) - Located in `src/ragas/`
- `ragas-examples` (examples package) - Located in `examples/`

The workspace ensures consistent dependency versions across packages and enables editable installs of workspace members.
```bash
# Setup and installation
make install-minimal    # Minimal dev setup (79 packages - recommended)
make install            # Full dev environment (383 packages - complete)

# Code quality
make format             # Format and lint all code
make type               # Type check all code
make check              # Quick health check (format + type, no tests)

# Testing
make test               # Run all unit tests
make test-e2e           # Run end-to-end tests

# CI/Build
make run-ci             # Run complete CI pipeline
make clean              # Clean all generated files

# Documentation
make build-docs         # Build all documentation
make serve-docs         # Serve documentation locally

# Benchmarks
make benchmarks         # Run performance benchmarks
make benchmarks-docker  # Run benchmarks in Docker
```
```bash
# Run all tests (from root)
make test

# Run specific test (using pytest -k flag)
make test k="test_name"

# Run end-to-end tests
make test-e2e

# Direct pytest commands for more control
uv run pytest tests/unit -k "test_name"
uv run pytest tests/unit -v
```
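As an illustration of how the `-k` flag selects tests by name, here is a minimal sketch of a unit test module. The helper `clamp_score` and both test names are hypothetical and are not part of Ragas; they only show the naming that `-k` matches against.

```python
# Hypothetical helper under test; it is not part of Ragas.
def clamp_score(value: float) -> float:
    """Clamp a metric score into the [0.0, 1.0] range."""
    return max(0.0, min(1.0, value))


def test_clamp_score_keeps_valid_values():
    # A score already inside the range is returned unchanged.
    assert clamp_score(0.4) == 0.4


def test_clamp_score_limits_out_of_range_values():
    # Scores outside the range are pulled back to the nearest bound.
    assert clamp_score(1.7) == 1.0
    assert clamp_score(-0.2) == 0.0
```

If this were saved under `tests/unit/` (an assumed location), `uv run pytest tests/unit -k "clamp_score"` would select both tests, while `-k "out_of_range"` would select only the second, because `-k` matches substrings of test names.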
```bash
# Build all documentation (from root)
make build-docs

# Serve documentation locally
make serve-docs
```
```bash
# Run all benchmarks locally
make benchmarks

# Run benchmarks in Docker
make benchmarks-docker
```
The repository has the following structure:
```
/                         # Main ragas project
├── src/ragas/            # Source code including experimental features
│   └── experimental/     # Experimental features
├── tests/                # All tests (core + experimental)
│   └── experimental/     # Experimental tests
├── examples/             # Example code
├── pyproject.toml        # Build config
├── docs/                 # Documentation
├── scripts/              # Build/CI scripts
├── Makefile              # Build commands
└── README.md             # Repository overview
```
The Ragas core library provides metrics, test data generation and evaluation functionality for LLM applications:
- Metrics - Various metrics for evaluating LLM applications
- Test Data Generation - Automatic creation of test datasets for LLM applications
- Integrations - Integrations with popular LLM frameworks such as LangChain and LlamaIndex, and with observability tools
The experimental features are now integrated into the main ragas package:
- `ragas.experimental`

To use experimental features:
```python
from ragas import Dataset
from ragas import experiment
from ragas.backends import get_registry
```
To view debug logs for any module:
```python
import logging

# Configure logging for a specific module (example with analytics)
analytics_logger = logging.getLogger('ragas._analytics')
analytics_logger.setLevel(logging.DEBUG)

# Create a console handler and set its level
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.DEBUG)

# Create a formatter and add it to the handler
formatter = logging.Formatter('%(name)s - %(levelname)s - %(message)s')
console_handler.setFormatter(formatter)

# Add the handler to the logger
analytics_logger.addHandler(console_handler)
```
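The same setup can be wrapped in a small reusable helper so that any module's logger can be switched to debug output. This is a sketch, not something Ragas provides: `enable_debug_logging` and the `_debug_helper` marker attribute are hypothetical names. The marker prevents duplicate handlers (and therefore duplicated log lines) when the helper is called more than once for the same logger.

```python
import logging


def enable_debug_logging(logger_name: str) -> logging.Logger:
    """Attach one DEBUG-level console handler to the named logger."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)

    # Skip attaching a handler if this helper already added one.
    already_attached = any(
        getattr(handler, "_debug_helper", False) for handler in logger.handlers
    )
    if not already_attached:
        handler = logging.StreamHandler()
        handler.setLevel(logging.DEBUG)
        handler.setFormatter(
            logging.Formatter("%(name)s - %(levelname)s - %(message)s")
        )
        handler._debug_helper = True  # hypothetical marker used by this sketch
        logger.addHandler(handler)
    return logger


logger = enable_debug_logging("ragas._analytics")
enable_debug_logging("ragas._analytics")  # second call adds no extra handler
logger.debug("debug logging enabled")
```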
- `[project.optional-dependencies].dev-minimal` for fast development (79 packages)
- `[dependency-groups].dev` for comprehensive development (383 packages)
- `make install-minimal` for most development tasks, `make install` for full ML stack work