Version: 3.0 | Date: 2025-10-03 | Phase: III (Containerized Microservices)
Sieveo is an intelligent knowledge discovery and RAG (Retrieval-Augmented Generation) platform that enables technical professionals to search and analyze code repositories using natural language queries. Phase III transforms the Phase II CLI application into containerized, production-ready microservices with enhanced performance, scalability, and observability.
Before analyzing a request and creating TODOs, review the following instructions to determine if you should forward the request to a sub-agent.
- Use the scrum-master agent for all requirements, planning, task, and progress reporting requests.
- Use the python-dev agent for all development work: creating new features, modifying existing features, and resolving defects/issues with features.
- Use the tester agent for all testing and quality checks, including formatting, typing, unit, integration, end-to-end (e2e), and code coverage checks.
Language/Version: Python 3.13+ with modern type hints and async/await patterns
Primary Dependencies: FastAPI, Haystack framework, Pydantic v2, Structlog, OpenTelemetry
Vector Databases: Modular architecture supporting Qdrant (primary), Elasticsearch (enterprise), PostgreSQL+pgvector (cost-effective), Chroma (legacy)
Storage: Hybrid storage with vector DB for embeddings, PostgreSQL for metadata, Redis for caching
Job Queue: Celery with Redis broker for async indexing operations
Testing: pytest with 70% integration testing focus, Ragas for RAG evaluation
Deployment: Docker Compose (dev) and Kubernetes (prod) with Helm charts
Target Platform: Containerized microservices with auto-scaling (Linux/macOS development, cloud deployment)
Project Type: Three-service microservices architecture (Query, Index, Admin)
- Query Service (Port 8000): Read-optimized search operations with multi-tier caching
- Index Service (Port 8001): Write-optimized document ingestion with distributed job queue
- Admin Service (Port 8002): Control plane for user management, API keys, and system health
- Service Communication: Internal REST APIs with OpenAPI 3.1 specifications
- Service Discovery: Kubernetes DNS or Docker Compose hostnames
- Independent Scaling: Horizontal scaling for Query/Index, vertical for Admin
- Abstract Interface: VectorStoreInterface ABC with strict type hints
- Qdrant Backend: High-performance vector search with gRPC (5-10x faster, recommended for production)
- Elasticsearch Backend: Hybrid BM25+kNN search for enterprise deployments with existing infrastructure
- PostgreSQL+pgvector Backend: Unified database for vectors AND metadata, ACID transactions, lowest operational complexity
- Chroma Backend: Simple deployment for development and Phase II backward compatibility
- Factory Pattern: Configuration-driven backend selection without code changes
- Migration Tools: Data transfer between all backends with 100% integrity verification
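A minimal sketch of how the abstract interface and factory pattern described above could fit together. The concrete class names, registry mechanism, and the in-memory backend are illustrative assumptions, not the actual Phase III implementation:

```python
from abc import ABC, abstractmethod
from typing import Any

from pydantic import BaseModel


class VectorSearchRequest(BaseModel):
    """Backend-agnostic search parameters (field names are illustrative)."""
    query_embedding: list[float]
    top_k: int = 10
    filters: dict[str, Any] | None = None


class VectorSearchResult(BaseModel):
    """Normalized result shape shared by every backend."""
    document_id: str
    score: float
    metadata: dict[str, Any] = {}


class VectorStoreInterface(ABC):
    """Abstract contract each vector DB backend implements."""

    @abstractmethod
    async def search(self, request: VectorSearchRequest) -> list[VectorSearchResult]: ...

    @abstractmethod
    async def health_check(self) -> bool: ...


_BACKENDS: dict[str, type[VectorStoreInterface]] = {}


def register_backend(name: str):
    """Class decorator so backend adapters self-register under a config key."""
    def wrap(cls: type[VectorStoreInterface]) -> type[VectorStoreInterface]:
        _BACKENDS[name] = cls
        return cls
    return wrap


def create_vector_store(name: str, **settings: Any) -> VectorStoreInterface:
    """Factory: configuration picks the backend; call sites only see the interface."""
    return _BACKENDS[name](**settings)


@register_backend("memory")
class InMemoryVectorStore(VectorStoreInterface):
    """Toy backend used here only to keep the sketch runnable."""

    def __init__(self, **_: Any) -> None:
        self._docs: list[VectorSearchResult] = []

    async def search(self, request: VectorSearchRequest) -> list[VectorSearchResult]:
        return self._docs[: request.top_k]

    async def health_check(self) -> bool:
        return True
```

Real adapters for Qdrant, Elasticsearch, pgvector, and Chroma would register the same way, so switching backends is purely a configuration change.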
- L1 Cache: In-memory LRU per Query Service instance (40-50% hit rate, <1ms latency)
- L2 Cache: Redis distributed cluster (20-30% hit rate, <10ms latency)
- Target Performance: 70%+ combined cache hit ratio, 50x latency improvement for cached queries
- Smart Invalidation: Event-driven cache invalidation via Redis pub/sub
- Cache Strategy: Query embeddings (infinite TTL), search results (1 hour TTL)
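A rough illustration of the L1/L2 read path and event-driven invalidation described above. The class name, key handling, and the `cache-invalidation` channel are assumptions:

```python
import json
from collections import OrderedDict
from typing import Any

import redis.asyncio as redis


class TieredSearchCache:
    """L1: per-instance LRU dict; L2: shared Redis with the TTL policy above."""

    def __init__(self, redis_url: str, l1_capacity: int = 1024) -> None:
        self._l1: OrderedDict[str, Any] = OrderedDict()
        self._l1_capacity = l1_capacity
        self._redis = redis.Redis.from_url(redis_url)

    async def get(self, key: str) -> Any | None:
        # L1 hit: sub-millisecond, no network round trip.
        if key in self._l1:
            self._l1.move_to_end(key)
            return self._l1[key]
        # L2 hit: one Redis round trip; promote into L1 for the next caller.
        raw = await self._redis.get(key)
        if raw is not None:
            value = json.loads(raw)
            self._set_l1(key, value)
            return value
        return None

    async def set(self, key: str, value: Any, ttl_seconds: int | None = 3600) -> None:
        # Search results get a 1-hour TTL; pass ttl_seconds=None for query embeddings.
        self._set_l1(key, value)
        await self._redis.set(key, json.dumps(value), ex=ttl_seconds)

    async def invalidate(self, key: str) -> None:
        # Publish so every Query Service instance drops its own L1 copy.
        self._l1.pop(key, None)
        await self._redis.delete(key)
        await self._redis.publish("cache-invalidation", key)

    def _set_l1(self, key: str, value: Any) -> None:
        self._l1[key] = value
        self._l1.move_to_end(key)
        if len(self._l1) > self._l1_capacity:
            self._l1.popitem(last=False)
```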
- Async-First Design: Native async/await for I/O-bound operations
- Pydantic v2 Validation: Runtime validation with compile-time type hints
- Dependency Injection: Service layer access via FastAPI dependencies
- OpenAPI 3.1: Auto-generated API documentation with Swagger UI
- Middleware Stack: Authentication, logging, rate limiting, compression, CORS
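A hedged sketch of the async-first FastAPI pattern with Pydantic v2 validation and dependency injection. The model fields and the `SearchService` wiring are placeholders, not the real service layer:

```python
from typing import Annotated

from fastapi import Depends, FastAPI
from pydantic import BaseModel, Field

app = FastAPI(title="Query Service", version="3.0.0")


class SearchQuery(BaseModel):
    """Request body validated by Pydantic v2 at runtime (fields are illustrative)."""
    query: str = Field(min_length=1)
    top_k: int = Field(default=10, ge=1, le=100)
    rerank: bool = True


class SearchResponse(BaseModel):
    results: list[dict]
    latency_ms: float
    cache_hit: bool


class SearchService:
    """Placeholder for the real search layer (cache + vector store + reranker)."""

    async def search(self, request: SearchQuery) -> SearchResponse:
        return SearchResponse(results=[], latency_ms=0.0, cache_hit=False)


def get_search_service() -> SearchService:
    # FastAPI dependency; the real service would wire cache, store, and reranker here.
    return SearchService()


@app.post("/search", response_model=SearchResponse)
async def search(
    body: SearchQuery,
    service: Annotated[SearchService, Depends(get_search_service)],
) -> SearchResponse:
    # Async all the way down: the handler awaits I/O-bound work.
    return await service.search(body)
```

FastAPI generates the OpenAPI 3.1 schema and Swagger UI for this route automatically from the Pydantic models.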
- BGE Reranker: Local deployment via sentence-transformers (no API costs)
- Model Selection: bge-reranker-base (dev), bge-reranker-large (prod), bge-reranker-v2-m3 (multilingual)
- Performance: ~50ms local inference vs 200-500ms API calls (4-10x faster)
- Quality: 0.95+ NDCG@10 on MS MARCO benchmark
- Integration: Rerank top-100 candidates → top-10 final results
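A minimal local-reranking sketch using sentence-transformers' CrossEncoder; the helper function and its defaults are illustrative:

```python
from sentence_transformers import CrossEncoder

# bge-reranker-base for dev; swap the model name for -large or -v2-m3 in prod.
reranker = CrossEncoder("BAAI/bge-reranker-base")


def rerank(query: str, candidates: list[str], top_k: int = 10) -> list[tuple[str, float]]:
    """Score the top-100 retrieval candidates locally and keep the best top_k."""
    scores = reranker.predict([(query, passage) for passage in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]
```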
- Distributed Tracing: OpenTelemetry with automatic FastAPI instrumentation
- Metrics: Prometheus-compatible metrics with Grafana dashboards
- Structured Logging: JSON logs with correlation IDs and trace context (structlog)
- Health Checks: Liveness, readiness, and startup probes for Kubernetes
- Performance Monitoring: P50/P95/P99 latency tracking, cache hit ratios, throughput metrics
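One possible wiring of tracing and structured logging. A console span exporter is used here only to keep the sketch self-contained; production would export to Jaeger via OTLP:

```python
import structlog
from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Tracing: every FastAPI request gets a span automatically.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

app = FastAPI(title="Query Service")
FastAPIInstrumentor.instrument_app(app)

# Structured logging: JSON output with ISO timestamps; correlation/trace IDs would be
# bound per-request (e.g. in middleware) so every log line carries them.
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ]
)
log = structlog.get_logger()


@app.get("/healthz")
async def liveness() -> dict[str, str]:
    log.info("liveness_probe", status="ok")
    return {"status": "ok"}
```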
- Docker Compose: Local development with profiles for vector DB selection
- Kubernetes: Production deployment with Helm charts and auto-scaling
- Blue-Green Deployment: Zero-downtime updates with health check coordination
- Multi-Architecture: Support for amd64 and arm64 container images
- Resource Management: CPU/memory requests and limits for all services
- API Key Authentication: Bearer token authentication via Admin Service
- Role-Based Access Control (RBAC): Viewer, contributor, and admin roles
- Scoped Permissions: Fine-grained access control (search:read, index:write, admin:manage)
- Rate Limiting: Per-API-key throttling with configurable limits
- Audit Logging: Immutable audit trail with user attribution and correlation IDs
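An illustrative FastAPI dependency enforcing hashed API keys and scoped permissions. The in-memory key table stands in for the Admin Service's PostgreSQL-backed lookup:

```python
import hashlib

from fastapi import Depends, FastAPI, HTTPException, Security
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI(title="Admin Service")
bearer = HTTPBearer()

# In the real system, hashes, roles, and scopes live in PostgreSQL; plaintext keys
# are never stored. This dict stands in for that lookup.
API_KEYS: dict[str, dict] = {
    hashlib.sha256(b"dev-key-123").hexdigest(): {
        "role": "contributor",
        "scopes": {"search:read", "index:write"},
    }
}


def require_scope(scope: str):
    """Dependency factory: verify the SHA-256 of the bearer token and its scopes."""

    def check(credentials: HTTPAuthorizationCredentials = Security(bearer)) -> dict:
        digest = hashlib.sha256(credentials.credentials.encode()).hexdigest()
        key = API_KEYS.get(digest)
        if key is None:
            raise HTTPException(status_code=401, detail="Invalid API key")
        if scope not in key["scopes"]:
            raise HTTPException(status_code=403, detail=f"Missing scope: {scope}")
        return key

    return check


@app.get("/search", dependencies=[Depends(require_scope("search:read"))])
async def search() -> dict[str, str]:
    return {"status": "authorized"}
```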
- Search Latency (P50): <50ms (cached), <100ms (warm cache), <500ms (cold cache)
- Search Latency (P95): <100ms (target vs 2000ms in Phase II = 20x improvement)
- Cache Hit Ratio: 70%+ combined L1+L2 (40-50% L1, 20-30% L2)
- Indexing Throughput: 1000+ documents/minute (10x faster than Phase II)
- Concurrent Users: 1000+ simultaneous connections (100x scale vs Phase II)
- Service Availability: 99.9% target (max 43 minutes downtime per month)
- Auto-Scaling: Query service scales within 30 seconds of load increase
- Vector database abstraction enables backend selection without code changes
- Unified ingestion interface maintained through Haystack pipeline architecture
- GitHub repositories and local folders as pluggable data sources
- Hybrid search (semantic + keyword) maintains focus on retrieval accuracy
- BGE reranking enhances result quality with minimal latency impact
- Measurable quality metrics through search relevance scoring and NDCG
- All indexing uses Haystack component patterns
- Pipeline configurations externalized and version-controlled
- Maintains consistency with Phase I and Phase II implementations
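A small Haystack 2.x-style pipeline sketch of the component pattern above, assuming the in-memory document store for brevity; the model choice and query text are placeholders, and the real pipelines would plug Qdrant/Elasticsearch/pgvector stores in behind the same components:

```python
from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

# In-memory store keeps the sketch self-contained.
document_store = InMemoryDocumentStore()

pipeline = Pipeline()
pipeline.add_component(
    "embedder", SentenceTransformersTextEmbedder(model="BAAI/bge-small-en-v1.5")
)
pipeline.add_component(
    "retriever", InMemoryEmbeddingRetriever(document_store=document_store)
)
pipeline.connect("embedder.embedding", "retriever.query_embedding")

# "Externalized and version-controlled" configuration: the pipeline serializes to
# YAML, which can be checked into the repo and reloaded at service start-up.
yaml_definition = pipeline.dumps()

results = pipeline.run({"embedder": {"text": "how does incremental indexing work?"}})
```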
- Incremental indexing preserved from Phase II with distributed job queue
- Resumable and fault-tolerant processing with Celery task chains
- Minimal reprocessing through Git-based change detection
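A hedged Celery sketch of the resumable indexing chain described above; task bodies, broker URLs, and return shapes are stand-ins for the real implementation:

```python
from celery import Celery, chain

celery_app = Celery(
    "index_service",
    broker="redis://redis:6379/0",
    backend="redis://redis:6379/1",
)


@celery_app.task(bind=True, max_retries=3)
def detect_changes(self, repository_url: str, last_commit: str | None) -> list[str]:
    """Git-based change detection: return only paths touched since last_commit."""
    # A real implementation would diff commits (e.g. `git diff --name-only`).
    return ["src/app/service.py", "docs/guide.md"]


@celery_app.task(bind=True, max_retries=3)
def process_documents(self, changed_paths: list[str]) -> list[dict]:
    """Chunk and embed only the changed files (minimal reprocessing)."""
    return [{"path": p, "chunks": 12} for p in changed_paths]


@celery_app.task(bind=True, max_retries=3)
def upsert_vectors(self, processed: list[dict]) -> dict:
    """Write embeddings to the vector store and report final counts."""
    return {"documents": len(processed), "status": "COMPLETED"}


def enqueue_incremental_index(repository_url: str, last_commit: str | None):
    """Chain the stages so the job is resumable: each link retries independently."""
    workflow = chain(
        detect_changes.s(repository_url, last_commit),
        process_documents.s(),
        upsert_vectors.s(),
    )
    return workflow.apply_async()
```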
- Enhanced structured logging with correlation IDs across all services
- Distributed tracing with OpenTelemetry for end-to-end visibility
- Performance monitoring with Prometheus metrics and Grafana dashboards
- Knowledge base statistics exposed through Admin Service health endpoints
- Comprehensive Pydantic v2 models with runtime validation across all services
- Type hints mandatory across all Phase III functionality (mypy strict mode)
- OpenAPI 3.1 specifications for all REST APIs with contract testing
- Vector store interface exemplifies plugin architecture with factory pattern
- Multiple backends (Qdrant, Elasticsearch, Chroma) without code changes
- Embedding model interchangeability maintained from Phase II
- Service decomposition enables independent service evolution
- Multi-tier caching reduces vector DB load by 70%
- Memory-conscious processing with configurable batch sizes
- Efficient incremental processing reduces computational waste
- Local reranking models eliminate per-request API costs ($0 vs $2/M requests)
- SearchQuery: Search request with filters, mode, and reranking options
- SearchResult: Individual result with score, metadata, and context
- SearchResponse: Complete response with results, latency, and cache status
- CacheEntry: Cache entry with TTL, access count, and expiration
- IndexingJob: Job tracking with state machine (QUEUED → RUNNING → COMPLETED/FAILED)
- DocumentIngestion: Document to be processed and indexed
- ProcessingStatus: Real-time progress tracking for indexing operations
- JobQueue: Celery queue metadata and worker allocation
- APIKey: Authentication credential with role, scopes, and rate limits
- UserRole: Permission set (viewer, contributor, admin)
- AuditLog: Immutable audit record with user attribution
- SystemHealth: Service health with component status aggregation
- VectorSearchRequest: Backend-agnostic search parameters
- VectorSearchResult: Normalized result across all backends
- VectorDocument: Document with embedding for indexing
- CollectionInfo: Collection metadata and statistics
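To make the IndexingJob state machine listed above concrete, a possible Pydantic v2 shape with explicit transitions; the field names and transition table are assumptions:

```python
from datetime import datetime, timezone
from enum import StrEnum

from pydantic import BaseModel, Field


class JobState(StrEnum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Legal transitions of the IndexingJob state machine.
TRANSITIONS: dict[JobState, set[JobState]] = {
    JobState.QUEUED: {JobState.RUNNING, JobState.FAILED},
    JobState.RUNNING: {JobState.COMPLETED, JobState.FAILED},
    JobState.COMPLETED: set(),
    JobState.FAILED: set(),
}


class IndexingJob(BaseModel):
    """Illustrative shape of the job-tracking entity."""
    job_id: str
    repository_url: str
    state: JobState = JobState.QUEUED
    documents_processed: int = 0
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

    def transition(self, new_state: JobState) -> "IndexingJob":
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"Illegal transition {self.state} -> {new_state}")
        return self.model_copy(update={"state": new_state})
```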
- Feature branch: `003-review-the-proposal`
- Complete isolation from Phase II codebase during development
- Controlled merge process after comprehensive validation
- Integration Testing (70%): End-to-end service workflows and API contracts
- Contract Testing (20%): Service interface compliance and OpenAPI validation
- Unit Testing (10%): Critical algorithm validation and edge cases
- RAG Testing: Ragas framework for search quality validation
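An example of the integration-heavy test style this implies, assuming a hypothetical `query_service.main` module exposing the FastAPI app; the route, payload, headers, and response fields are illustrative:

```python
from fastapi.testclient import TestClient

# `query_service.main` is a hypothetical module path used only for illustration.
from query_service.main import app

client = TestClient(app)


def test_search_returns_results_and_cache_status() -> None:
    """Integration-style test: exercises the real route, middleware, and validation."""
    response = client.post(
        "/search",
        json={"query": "how is authentication implemented?", "top_k": 5},
        headers={"Authorization": "Bearer dev-key-123"},
    )
    assert response.status_code == 200
    body = response.json()
    assert "results" in body
    assert "cache_hit" in body


def test_search_rejects_empty_query() -> None:
    """Contract-style test: Pydantic validation rejects an invalid payload."""
    response = client.post("/search", json={"query": "", "top_k": 5})
    assert response.status_code == 422
```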
- Vector Store Abstraction: Foundation for all data access with pluggable backends
- Caching Layer: L1 + L2 multi-tier caching required by Query Service
- Query Service: Read-optimized search with FastAPI and async patterns
- Index Service: Write-optimized ingestion with Celery job queue
- Admin Service: Control plane for user/key management and health checks
- Observability Stack: OpenTelemetry tracing, Prometheus metrics, Grafana dashboards
- Deployment Automation: Docker Compose and Kubernetes with Helm charts
- Zero-Downtime Strategy: Blue-green deployment with health check coordination
- API Key Authentication: SHA-256 hashed keys, never store plaintext
- Token Scope Validation: Fine-grained permissions with role-based access
- Rate Limit Enforcement: Per-API-key throttling with Redis tracking
- Audit Trails: Comprehensive logging with correlation ID tracking and immutability
- Input Validation: Pydantic v2 validation for all request payloads
- TLS/HTTPS: Encrypted communication for all external APIs
- Vector Databases: Qdrant (gRPC), Elasticsearch (HTTP), Chroma (HTTP)
- Redis: Caching (L2), job queue (Celery broker), pub/sub (cache invalidation)
- PostgreSQL: Metadata persistence for jobs, users, API keys, audit logs
- Query Service: Search operations with caching and reranking
- Index Service: Document ingestion with job queue and workers
- Admin Service: User management, API keys, health aggregation
- Service Communication: Internal REST APIs with correlation ID propagation
- Prometheus: Metrics scraping from `/metrics` endpoints
- Grafana: Dashboard visualization with pre-built panels
- Jaeger: Distributed trace collection via OpenTelemetry
- Phase II documents remain fully searchable in Phase III vector stores
- Phase II CLI maintained with automatic detection of Phase III services
- Existing configuration migrated to new multi-service architecture
- API contracts extended (not breaking) from Phase II
- Automatic detection and migration of Phase II Chroma storage
- Repository registration preserved with enhanced metadata
- Configuration schema upgrade with validation
- Vector store migration tools for Chroma → Qdrant/Elasticsearch
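A backend-agnostic migration sketch built on the abstract interface idea, with a count-based check hedged as one possible integrity verification; the protocol names and method signatures are assumptions:

```python
from typing import AsyncIterator, Protocol

from pydantic import BaseModel


class VectorDocument(BaseModel):
    """Document plus embedding, matching the entity named in the data model."""
    document_id: str
    content: str
    embedding: list[float]
    metadata: dict[str, str] = {}


class ReadableStore(Protocol):
    """Anything we can stream documents out of (e.g. the Phase II Chroma store)."""
    def iter_documents(self, batch_size: int) -> AsyncIterator[list[VectorDocument]]: ...


class WritableStore(Protocol):
    """Anything we can write into (e.g. Qdrant or Elasticsearch)."""
    async def upsert(self, documents: list[VectorDocument]) -> int: ...
    async def count(self) -> int: ...


async def migrate(source: ReadableStore, target: WritableStore, batch_size: int = 256) -> None:
    """Copy documents in batches, then verify counts as a simple integrity check."""
    copied = 0
    async for batch in source.iter_documents(batch_size):
        copied += await target.upsert(batch)
    if await target.count() != copied:
        raise RuntimeError("Migration integrity check failed: document counts differ")
```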
- P95 search latency <100ms (20x faster than Phase II)
- P50 search latency <50ms (16x faster than Phase II)
- 1000+ concurrent users (100x scale vs Phase II)
- 1000 docs/min indexing (10x faster than Phase II)
- 70%+ cache hit ratio (new capability)
- 99.9% service availability (production SLA)
- Auto-scaling within 30 seconds
- Three-service microservices architecture with independent scaling
- Multi-protocol API support (REST now; gRPC deferred to Phase IV)
- Hybrid search with BGE reranking
- Multi-tier caching with event-driven invalidation
- Role-based access control with API keys
- Distributed tracing and structured logging
- Zero-downtime blue-green deployments
- Modular vector database architecture with 3 backends
- Type safety with Pydantic v2 and mypy strict mode
- OpenAPI 3.1 specifications for all APIs
- Observable operations with correlation tracking
- Resource efficiency with sustainable processing
- Comprehensive error handling with recovery hints
- Contract testing for service interfaces
Location: `/workspace/specs/003-review-the-proposal/data-model.md`
Comprehensive Pydantic v2 models for:
- Query Service (SearchQuery, SearchResult, SearchResponse, CacheEntry)
- Index Service (IndexingJob, DocumentIngestion, ProcessingStatus, JobQueue)
- Admin Service (APIKey, UserRole, AuditLog, SystemHealth)
- Vector Store (VectorSearchRequest, VectorSearchResult, VectorDocument, CollectionInfo)
Location: `/workspace/specs/003-review-the-proposal/contracts/rest-api.yaml`
OpenAPI 3.1 specification defining:
- Query Service endpoints (search, batch search)
- Index Service endpoints (repository indexing, job management)
- Admin Service endpoints (API keys, users, health, audit logs)
- Health check endpoints (liveness, readiness, startup)
- Authentication and rate limiting contracts
- Error response formats with recovery hints
Location: `/workspace/specs/003-review-the-proposal/contracts/vector-store-interface.py`
Abstract interface contract with:
- VectorStoreInterface ABC with async methods
- Search operations (search, hybrid_search, batch_search)
- Document management (upsert, delete, delete_by_filter)
- Collection management (create, delete, get_info, list)
- Health and statistics (health_check, get_statistics)
- Factory pattern for backend selection
- Comprehensive type hints and validation
Location: `/workspace/specs/003-review-the-proposal/contracts/service-contracts.md`
Service interaction patterns:
- Inter-service communication protocols (REST, correlation IDs)
- Data flow diagrams for Query, Index, and Admin services
- Cache invalidation contracts (event-driven pub/sub)
- Error handling and circuit breaker patterns
- Observability contracts (tracing, metrics, logging)
- Security contracts (authentication, authorization, rate limiting)
- SLA targets (availability, latency, throughput)
Location: `/workspace/specs/003-review-the-proposal/quickstart.md`
Step-by-step deployment guide:
- Prerequisites and system requirements
- Docker Compose quick start (<10 minutes)
- Test scenarios (health checks, indexing, search, batch operations)
- Vector database selection (Qdrant, Elasticsearch, Chroma)
- Monitoring and metrics (Grafana, Prometheus)
- Troubleshooting common issues
- Production deployment (Kubernetes)
This context provides Claude Code with comprehensive understanding of Sieveo Phase III containerized microservices architecture, capabilities, and implementation requirements for effective development assistance.