# ABC (AI Benchmark Cluster)

[CI pipelines](https://gitlab.com/ai9804501/abc/-/pipelines)
ABC (AI Benchmark Cluster) is an advanced LLM benchmarking platform that evaluates AI models against human educational standards. The system provides comprehensive testing across multiple subjects and educational levels, from elementary school to PhD, using Ollama for model execution.
- **Educational Level Benchmarking**: Compare LLM performance against human educational standards, from elementary school to PhD.
- **Subject Areas**: Test suites organized by subject (see `src/testing/` in the project structure below).
- **Automated Documentation**: Self-generating performance reports and analysis through GitLab CI/CD pipelines.
- **Pass/Fail Grading**: Objective evaluation criteria for each educational level (see the sketch after this list).
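As an illustration of the pass/fail idea, a minimal grading helper might look like the following. This is a hedged sketch only: the threshold values, level names, and function are hypothetical, not ABC's actual evaluation criteria.

```python
# Hypothetical sketch of pass/fail grading per educational level.
# Thresholds and level names are illustrative, not ABC's actual criteria.
PASS_THRESHOLDS = {
    "elementary": 0.90,
    "high_school": 0.75,
    "undergraduate": 0.65,
    "phd": 0.50,
}

def grade(level: str, correct: int, total: int) -> bool:
    """Return True if the model's score meets the level's pass threshold."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total >= PASS_THRESHOLDS[level]

# Example: 42/50 correct at high-school level passes (0.84 >= 0.75).
assert grade("high_school", 42, 50)
```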
```
abc/
├── docs/               # Documentation and benchmark results
│   ├── results/        # Auto-generated benchmark results
│   ├── analysis/       # Performance analysis reports
│   └── comparisons/    # Educational level comparisons
├── src/                # Source code
│   ├── analysis/       # Analysis and metrics
│   ├── benchmarking/   # Core benchmarking system
│   ├── costs/          # Resource usage tracking
│   ├── database/       # Results storage
│   ├── pipeline/       # CI/CD pipeline integration
│   ├── runner/         # Ollama model runners
│   └── testing/        # Test suites by subject
├── tests/              # Test framework
└── templates/          # Report templates
```
Prerequisites:

- pyenv and uv
- glab CLI tool
- kubectl and helm for Kubernetes deployments (legacy; see the deployment note below)

Run the environment check script to verify your setup:

```bash
./scripts/check_dev.sh
```
This script will validate the installation of all required tools and provide installation instructions for any missing components.
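For reference, the core of such a check can be expressed in a few lines of Python. This is a sketch of the idea only, not the contents of `scripts/check_dev.sh`; the tool list simply mirrors the prerequisites above.

```python
# Sketch of an environment check: verify required CLI tools are on PATH.
# Not the actual check_dev.sh script -- illustrative only.
import shutil

REQUIRED_TOOLS = ["pyenv", "uv", "glab", "kubectl", "helm", "docker"]

def check_tools() -> list[str]:
    """Return the names of any required tools missing from PATH."""
    return [tool for tool in REQUIRED_TOOLS if shutil.which(tool) is None]

if __name__ == "__main__":
    missing = check_tools()
    if missing:
        print(f"Missing tools: {', '.join(missing)} -- see installation notes.")
        raise SystemExit(1)
    print("All required tools found.")
```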
The recommended way to run ABC is with Docker Compose, which ensures a consistent environment and dependencies across all platforms.
Clone the repository and start the services:

```bash
git clone https://gitlab.com/ai9804501/abc.git
cd abc
docker-compose up -d
```
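The repository ships its own `docker-compose.yml`, which is authoritative. Purely as a rough sketch of the architecture, a setup like this typically pairs the application with an Ollama service; the service names, image, and port mapping below are assumptions for illustration.

```yaml
# Illustrative sketch only -- the real docker-compose.yml in the repo is authoritative.
services:
  ollama:
    image: ollama/ollama        # official Ollama image; default API port is 11434
    ports:
      - "11434:11434"
  app:
    build: .                    # assumed: build the ABC image from a project Dockerfile
    depends_on:
      - ollama
    environment:
      OLLAMA_HOST: http://ollama:11434   # point the runner at the ollama service
```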
Alternatively, to set up ABC manually, first clone the repository:

```bash
git clone https://gitlab.com/ai9804501/abc.git
cd abc
```

Install Ollama:

```bash
curl https://ollama.ai/install.sh | sh
```

Verify your Python version:

```bash
python3 --version  # Should output Python 3.12.x
```

Install uv:

```bash
pip install uv
```

Create a virtual environment and install the project in editable mode:

```bash
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"
```
With Docker Compose, run the benchmark suite inside the app container:

```bash
docker-compose exec app python -m src.pipeline.cli run-benchmarks
```
For a manual installation, start the Ollama server:

```bash
ollama serve
```

Pull the models you want to benchmark:

```bash
ollama pull llama2  # Add other models as needed
```

Then run the benchmark suite:

```bash
python -m src.pipeline.cli run-benchmarks
```
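Under the hood, the runner talks to a local Ollama server over its HTTP API. As a minimal sketch of a single benchmark query: the `/api/generate` endpoint and payload shape are Ollama's documented API, while the function name and prompt are illustrative, not ABC's actual runner code.

```python
# Minimal sketch: ask one benchmark question via Ollama's /api/generate endpoint.
# Requires a running `ollama serve` with the model already pulled.
import json
import urllib.request

def ask_model(prompt: str, model: str = "llama2",
              host: str = "http://localhost:11434") -> str:
    """Send a single non-streaming prompt to a local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(ask_model("What is 7 * 8? Answer with just the number."))
```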
Reports are automatically generated in the GitLab CI pipeline and can be found in:

- `docs/results/`
- `pages/benchmarks/`

To contribute, create a feature branch:

```bash
git checkout -b feature/your-feature-name
```
Run the test suite:

```bash
pytest
```
The project uses GitLab CI/CD to test the code, run the benchmarks, and publish the generated reports in stages.
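The repository's `.gitlab-ci.yml` defines the authoritative stages. Purely as an illustration of how such a pipeline might be structured, here is a sketch; the stage names and the report-publishing step are assumptions, not the project's actual configuration.

```yaml
# Illustrative sketch only -- see the repository's .gitlab-ci.yml for the real pipeline.
stages:
  - test
  - benchmark
  - report

test:
  stage: test
  script:
    - pytest

benchmark:
  stage: benchmark
  script:
    - python -m src.pipeline.cli run-benchmarks

report:
  stage: report
  script:
    - echo "Publish generated reports to docs/results/"  # placeholder step
  artifacts:
    paths:
      - docs/results/
```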
Note: Kubernetes deployment configuration has been removed from this project. Please use Docker Compose for deployment as described above.
Required GitLab CI/CD variables:
- `KUBE_CONFIG`: Base64-encoded kubeconfig file
- `CI_REGISTRY_USER`: GitLab registry username
- `CI_REGISTRY_PASSWORD`: GitLab registry password
- `GITLAB_TOKEN`: Token for wiki updates

Install pre-commit hooks to ensure code quality:
```bash
uv pip install pre-commit
pre-commit install
```
This will run linters and formatters before each commit.
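For context, a `.pre-commit-config.yaml` along these lines would wire in a linter and formatter. The repository's actual hook set may differ; the ruff hooks shown here are one common choice, not necessarily the project's.

```yaml
# Illustrative sketch -- the repository's .pre-commit-config.yaml is authoritative.
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9          # pin a release; check for the latest tag
    hooks:
      - id: ruff         # lint
      - id: ruff-format  # format
```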
MIT License. See the LICENSE file for details.