CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is an AI-powered document processing and chat system built as a monorepo. The system allows users to upload PDF documents, process them using AI models, and engage in intelligent conversations about the document content.

Core Components:

Backend: Bun + Elysia API server with LangChain, Pinecone, and multiple LLM providers
Frontend: SvelteKit 2.x application with Tailwind CSS v4
Vector Database: Pinecone for semantic search and document embeddings
AI Models: OpenAI, HuggingFace, Ollama, and AWS Bedrock support
Monitoring: LangSmith integration for AI model performance tracing

Key Features:

PDF document upload and processing with LangSmith tracing
Vector embeddings and semantic search via Pinecone
Multi-model AI chat (OpenAI, HuggingFace, Ollama, AWS Bedrock)
Real-time streaming responses with Server-Sent Events
Session-based document management with ULID-based tracking
LangSmith trace visualization and performance monitoring

Common Development Commands

Workspace Commands (from root)

# Installation and workspace management
bun install                    # Install all workspace dependencies using catalog
bun run workspace:info         # Show workspace package information
bun run workspace:update       # Update all dependencies to latest versions
bun run workspace:check        # Run typecheck and lint across workspace

# Development
bun run dev                    # Start both backend and frontend in parallel
bun run dev:backend            # Start only backend (port 3001)
bun run dev:frontend           # Start only frontend (port 5174)

# Building
bun run build                  # Build both backend and frontend sequentially
bun run build:watch            # Build both packages in watch mode (parallel)
bun run build:backend          # Build only backend
bun run build:frontend         # Build only frontend

# Testing
bun run test                   # Run root-level Bun tests
bun run test:packages          # Run tests in all packages
bun run test:watch             # Run tests in watch mode
bun run test:coverage          # Run tests with coverage reporting

# Code quality
bun run typecheck              # Run TypeScript checking across all packages
bun run lint                   # Lint all packages with Biome
bun run lint:fix               # Lint and auto-fix issues
bun run format                 # Format all code with Biome

# Maintenance
bun run clean                  # Clean all build artifacts and caches
bun run clean:packages         # Clean only package-level artifacts

Backend Development

cd packages/backend
# Dependencies managed by workspace catalog - install from root

# Development
bun run dev                    # Start development server with hot reload (port 3001)
bun run start                  # Start production server

# Building
bun run build                  # Build optimized bundle for Bun runtime
bun run build:watch            # Build in watch mode

# Testing and Quality
bun run test                   # Run Bun native tests
bun run test:watch             # Run tests in watch mode
bun run test:coverage          # Run tests with coverage
bun run typecheck              # TypeScript type checking
bun run lint                   # Lint with Biome
bun run lint:fix               # Lint and auto-fix
bun run format                 # Format code with Biome

# Maintenance
bun run clean                  # Clean build artifacts

Frontend Development

cd packages/frontend
# Dependencies managed by workspace catalog - install from root

# Development
bun run dev                    # Start Vite dev server (port 5174)
bun run preview                # Preview production build

# Building
bun run build                  # Build for production
bun run build:watch            # Build in watch mode

# SvelteKit specific
bun run check                  # SvelteKit sync and type checking
bun run check:watch           # Continuous SvelteKit checking

# Testing and Quality
bun run test                   # Run Vitest tests
bun run test:watch             # Run tests in watch mode
bun run test:coverage          # Run tests with coverage
bun run typecheck              # TypeScript type checking
bun run lint                   # Lint with Biome
bun run lint:fix               # Lint and auto-fix
bun run format                 # Format code with Biome

# Maintenance
bun run clean                  # Clean build artifacts and cache

Workspace Configuration

Bun Workspace with Catalog System

This project uses Bun workspaces with catalogs for optimal dependency management and development experience. The workspace follows bun-reviewer.md best practices for production-ready patterns.

Key Features

✅ Dependency Catalog: Centralized version management for shared dependencies
✅ Bun Optimization: Frozen lockfile, peer dependencies, and caching configuration
✅ TypeScript Workspace: Composite projects with cross-package type checking
✅ Advanced Build Config: Environment-aware optimization with `bun.config.ts`
✅ Quality Assurance: Biome for linting/formatting, comprehensive testing setup
✅ Performance Monitoring: Development and production monitoring scripts

Workspace Structure

ollama-prompting/
├── package.json              # Root workspace with catalog
├── tsconfig.json             # Root TypeScript configuration
├── bun.config.ts             # Advanced Bun configuration
├── biome.json                # Code quality configuration
├── scripts/                  # Workspace utilities
│   ├── validate-env.ts       # Environment validation
│   ├── dev-utils.ts          # Development tools
│   └── prod-monitoring.ts    # Production monitoring
└── packages/
    ├── backend/              # @ollama-prompting/backend
    └── frontend/             # @ollama-prompting/frontend

Dependency Catalog Benefits

// Root package.json - Centralized version management
"workspaces": {
  "packages": ["packages/*"],
  "catalog": {
    "typescript": "^5.9.2",
    "elysia": "^1.4.5",
    "svelte": "^5.38.10"
    // ... all shared dependencies
  }
}

// Package usage - Always in sync
"dependencies": {
  "elysia": "catalog:",
  "typescript": "catalog:"
}

Advanced Bun Configuration

The `bun.config.ts` file provides:

Performance Monitoring: Nanosecond precision timing and alerts
Environment-Aware Settings: Different configs for dev/prod
Test Configuration: Coverage thresholds and reporting
Security Settings: Trusted dependencies and validation
Preload Scripts: Automatic environment validation and utilities

// Key configuration highlights
export default {
  preload: ["./scripts/validate-env.ts"],
  test: {
    coverage: { enabled: true, threshold: 80 }
  },
  build: {
    target: "bun",
    minify: isProduction,
    sourcemap: isDevelopment ? "inline" : "external"
  }
}

TypeScript Workspace Setup

Root `tsconfig.json`: Composite project with path mapping
Package extends: Each package extends root configuration
Cross-package imports: Type-safe imports between packages
Incremental builds: Faster compilation with declaration maps

// Path mapping for workspace packages
"paths": {
  "@ollama-prompting/backend/*": ["packages/backend/src/*"],
  "@ollama-prompting/frontend/*": ["packages/frontend/src/*"]
}

Code Quality with Biome

Comprehensive linting and formatting configuration:

Performance rules: Optimized for Bun runtime patterns
Security rules: No dangerous patterns, validated globals
Svelte support: Frontend-specific overrides
Workspace globals: `Bun`, `globalThis`, development tools

Runtime Detection Patterns

All code includes proper runtime detection for Bun/Node compatibility:

// Runtime-aware utilities
import fs from 'node:fs/promises'; // Node fallback for file reads

const isBun = () => typeof Bun !== 'undefined';

// Performance measurement (nanoseconds in both branches)
const timer = isBun() ? Bun.nanoseconds() : performance.now() * 1_000_000;

// File operations
const data = isBun()
  ? await Bun.file('config.json').json()
  : JSON.parse(await fs.readFile('config.json', 'utf-8'));

Architecture Overview

System Design Philosophy

Based on the architecture documentation (@docs/architecture.md), this system follows a modern, scalable design:

Core Goals:

Document Intelligence: Extract and understand complex document structures
Semantic Search: Find relevant information using vector embeddings
Intelligent Chat: Provide contextual responses about document content
Scalable Architecture: Handle multiple documents and concurrent users
Developer Experience: Modern tooling with Bun, Svelte, and TypeScript

Backend Structure (`/packages/backend`)

Framework: Native Bun HTTP server with TypeScript (no framework dependencies)
Services Layer: Modular services in `/packages/backend/src/services/`
- `chatService.ts` - Core chat functionality with graceful Couchbase handling
- `ragService.ts` - Couchbase vector store RAG implementation
- `pineconeService.ts` - Standard Pinecone vector operations with advanced query processing
- `pineconeDocumentService.ts` - Session-based document processing with enhanced metadata
- `streamService.ts` - Real-time response streaming with Server-Sent Events
- `langsmithService.ts` - AI model performance tracing/monitoring with hierarchical threading
- `kafkaDbService.ts` - SQLite-based Kafka message storage and retrieval
- `modelFactory.ts` - Multi-model support factory
Models: Support for Ollama, OpenAI, HuggingFace, AWS Bedrock, and Groq
Database:
- Primary: Pinecone for vector storage (two separate implementations)
- Optional: Couchbase for vector operations and chat history
- Local: SQLite for Kafka message storage

Frontend Structure (`/packages/frontend`)

Framework: SvelteKit 2.x with Svelte 5.35+
Build Tool: Vite 7.x with proxy configuration for API routing
Routes: Located in `/packages/frontend/src/routes/`
- `/` - Home page with navigation
- `/ollama-chat` - Direct Ollama chat interface
- `/rag-chat` - RAG-enhanced document chat interface with LangSmith tracing
- `/simon` - Alternative chat interface
Components: Enhanced components in `/packages/frontend/src/components/`
- `LangSmithTraceViewer.svelte` - Real-time trace visualization
- `Navigation.svelte` - Application navigation
Styling: Tailwind CSS v4.1+ with comprehensive plugin ecosystem
- `@tailwindcss/typography` - Rich text styling
- `@tailwindcss/forms` - Form component styling
- `@tailwindcss/container-queries` - Modern responsive design
- `@tailwindcss/aspect-ratio` - Aspect ratio utilities

RAG Pipeline Architecture

The system implements three separate RAG pipelines for different use cases:

1. Standard Pinecone RAG (`/api/rag/chat/pinecone`)

Service: `pineconeService.ts`
Purpose: General-purpose document processing with advanced query classification
Features:
- Advanced keyword extraction and query enhancement
- Legal document analysis with specialized prompts
- Token-aware text chunking for AWS Bedrock (8,192 token limit)
- Embedding averaging for large documents
- Query intent classification (legal_analysis, general_inquiry, etc.)
Upload: `/api/rag/upload-pdf/pinecone`
Chat: `/api/rag/chat/pinecone`

2. Session-Based Document RAG (`/api/rag/chat/pinecone-document`)

Service: `pineconeDocumentService.ts`
Purpose: Session-isolated document processing with enhanced metadata
Features:
- ULID-based session management for document isolation
- Enhanced PDF metadata extraction (title, author, dates)
- Session parent trace creation for LangSmith threading
- Direct AWS Bedrock Claude integration (bypasses model factory)
- Session-specific vector storage with filename filtering
Upload: `/api/rag/upload-pdf/pinecone-document`
Chat: `/api/rag/chat/pinecone-document`

3. Couchbase Vector RAG (`/api/rag/chat/couchbase`)

Service: `ragService.ts`
Purpose: Alternative vector storage using Couchbase Vector Search
Features:
- Couchbase cluster integration with graceful fallback
- LangChain CouchbaseVectorStore integration
- Model factory integration for flexible LLM selection
- Comprehensive error handling for connection issues
Upload: `/api/rag/upload-pdf/couchbase`
Chat: `/api/rag/chat/couchbase`

RAG Data Flow Architecture

Document Processing Flow:

PDF Upload → PDF Loading (LangChain) → Text Splitting →
Token Estimation → Text Chunking → AWS Bedrock Embeddings →
Vector Storage (Pinecone/Couchbase) → Session/Metadata Storage

Chat Query Flow:

User Query → Query Enhancement → Embedding Generation →
Vector Similarity Search → Context Retrieval →
Prompt Construction → LLM Generation → Streaming Response

Advanced Features:

Smart Chunking: Automatic text splitting with token awareness
Query Classification: Intent-based prompt selection
Session Isolation: Document-specific contexts using ULID sessions
Metadata Filtering: Precise document retrieval with filename/session filters
Streaming Responses: Real-time response delivery via Server-Sent Events
LangSmith Integration: Comprehensive tracing and performance monitoring
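
The session isolation and metadata filtering listed above could be expressed as a filter object attached to each vector query. The following is a minimal sketch, not the service's actual code: `SessionFilter` and `buildSessionFilter` are hypothetical names, and the metadata keys `sessionId` and `filename` are assumed from the request bodies documented in this file. The `$eq` operator follows Pinecone's metadata filter syntax.

```typescript
// Hypothetical shape of a metadata filter that restricts a vector query
// to one uploaded document within one session.
interface SessionFilter {
  sessionId: { $eq: string };
  filename: { $eq: string };
}

// Build the filter, rejecting blank identifiers early so a query never
// silently runs without isolation.
function buildSessionFilter(sessionId: string, filename: string): SessionFilter {
  if (sessionId.trim() === "" || filename.trim() === "") {
    throw new Error("sessionId and filename are both required");
  }
  return {
    sessionId: { $eq: sessionId },
    filename: { $eq: filename },
  };
}
```

Because both fields must match exactly, a mismatch in either value between upload and chat would be expected to return zero chunks.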

API Endpoints

Core Chat Endpoints

GET /api/ollama/models
- List available Ollama models
GET /api/ollama/chat/history
- Retrieve chat history
POST /api/ollama/chat
- Direct Ollama chat (streaming)
GET /api/rag/model-info
- Get current model configuration

RAG Document Processing Endpoints

POST /api/rag/upload-pdf/pinecone
- Upload PDF to standard Pinecone RAG

POST /api/rag/upload-pdf/pinecone-document
- Upload PDF to session-based RAG

POST /api/rag/upload-pdf/couchbase
- Upload PDF to Couchbase vector store

RAG Chat Endpoints

POST /api/rag/chat/pinecone
- Chat with standard Pinecone documents
- Required: `{ message, sessionId, filename }`
- Optional: `{ queryType }` for intent classification
POST /api/rag/chat/pinecone-document
- Chat with session-based documents
- Required: `{ message, sessionId, filename }`
- Uses: Direct AWS Bedrock Claude integration
POST /api/rag/chat/couchbase
- Chat with Couchbase vector documents
- Required: `{ message, sessionId, filename }`
- Features: Model factory integration
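
All three chat endpoints require the same three fields, so a shared request check is one way to keep them consistent. The sketch below is an assumption about how such a check might look; `RagChatRequest` and `isRagChatRequest` are hypothetical names and are not taken from the backend source.

```typescript
// Request body shape assumed from the endpoint list above.
interface RagChatRequest {
  message: string;
  sessionId: string;
  filename: string;
  queryType?: string; // only documented for /api/rag/chat/pinecone
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim() !== "";
}

// Narrow an unknown JSON body to RagChatRequest, rejecting missing or
// blank required fields and a non-string queryType.
function isRagChatRequest(body: unknown): body is RagChatRequest {
  if (typeof body !== "object" || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return (
    isNonEmptyString(candidate["message"]) &&
    isNonEmptyString(candidate["sessionId"]) &&
    isNonEmptyString(candidate["filename"]) &&
    (candidate["queryType"] === undefined || typeof candidate["queryType"] === "string")
  );
}
```

A handler could return a 400 response when the guard fails, before any embedding or vector search work starts.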

Kong Integration Endpoints

POST /api/kong/chat
- Kong Konnect AI chat with streaming
GET /api/kong/kafka/:topic
- Kong Kafka topic integration

Kafka Storage Endpoints

GET /api/kafka
- Retrieve stored Kafka messages
GET /api/kafka/topics
- List available Kafka topics

Key Integration Points

API Communication: Frontend proxy routes `/api/*` to backend on port 3001
Streaming: Server-Sent Events for real-time response streaming
Environment Variables: Backend uses `.env` for all configuration
Vector Search: Documents embedded and stored in Pinecone for semantic retrieval
Session Management: ULID-based session tracking per document
LangSmith Tracing: All endpoints support trace headers for performance monitoring
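
For the Server-Sent Events streaming mentioned above, each chunk sent to the browser is a small text frame. The sketch below shows only that framing, following the event stream format from the HTML standard. `formatSseEvent` and `formatTokenEvent` are hypothetical helpers, and the `{ token }` payload shape is an assumption rather than the format `streamService.ts` actually emits.

```typescript
// Format one Server-Sent Events frame: every payload line gets its own
// "data:" field, and a blank line terminates the event.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event !== undefined) {
    lines.push(`event: ${event}`);
  }
  for (const line of data.split("\n")) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n";
}

// Serialize a hypothetical token chunk as a JSON payload in one frame.
function formatTokenEvent(token: string): string {
  return formatSseEvent(JSON.stringify({ token }));
}
```

A streaming endpoint would write these frames to a response with the `Content-Type: text/event-stream` header.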

Model Configuration

The backend uses a factory pattern (`modelFactory.ts`) to support multiple LLM providers:

Ollama: Local models for offline use
OpenAI: GPT models for chat, embeddings via API
HuggingFace: Alternative models with local hosting support
AWS Bedrock: Claude models for chat, Titan embeddings (recommended)
Groq: High-performance inference
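
As an illustration of the factory pattern, the sketch below resolves a provider name to a model descriptor. It is not the contents of `modelFactory.ts`: `createChatModel`, `ChatModel`, and `DEFAULT_MODELS` are hypothetical, and the default model names are copied from the sample `.env` values in this file. Groq has no sample value there, so the sketch requires an explicit model name for it.

```typescript
// Provider names mirror the list above; the descriptor shape is an assumption.
type Provider = "Ollama" | "OpenAI" | "HuggingFace" | "Bedrock" | "Groq";

interface ChatModel {
  provider: Provider;
  modelName: string;
}

// Hypothetical defaults taken from the sample .env values in this document.
const DEFAULT_MODELS: Partial<Record<Provider, string>> = {
  Ollama: "llama2",
  OpenAI: "gpt-3.5-turbo",
  HuggingFace: "facebook/opt-1.3b",
  Bedrock: "anthropic.claude-3-5-sonnet-20240620-v1:0",
};

// Resolve a case-insensitive provider name to a model descriptor.
function createChatModel(provider: string, modelName?: string): ChatModel {
  const known: Provider[] = ["Ollama", "OpenAI", "HuggingFace", "Bedrock", "Groq"];
  const match = known.find((p) => p.toLowerCase() === provider.toLowerCase());
  if (match === undefined) {
    throw new Error(`Unsupported provider: ${provider}`);
  }
  const resolved = modelName ?? DEFAULT_MODELS[match];
  if (resolved === undefined) {
    throw new Error(`No model name given for ${match}`);
  }
  return { provider: match, modelName: resolved };
}
```

Keeping provider selection in one function means callers depend only on the descriptor, not on any provider-specific client.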

Important Patterns

Service Pattern: All major functionality is encapsulated in service classes
Streaming: Uses Server-Sent Events for real-time responses
Error Handling: Graceful degradation when Couchbase is unavailable
Type Safety: Full TypeScript support in both frontend and backend

Environment Configuration

Required Environment Variables (Backend .env)

Based on @docs/architecture.md and @docs/adr/0003-use-pinecone-vector-database.md:

# Pinecone Configuration (Primary Vector Database)
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX_NAME=your_index_name
PINECONE_NAMESPACE=your_namespace

# AI Model Configuration
OPENAI_API_KEY=your_openai_api_key
HUGGINGFACE_API_KEY=your_huggingface_api_key
HUGGINGFACE_MODEL=facebook/opt-1.3b
HUGGINGFACE_EMBEDDING_MODEL=nlpaueb/legal-bert-base-uncased

# LangSmith Configuration (for AI tracing)
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_langchain_api_key
LANGCHAIN_PROJECT=your_project_name

# AWS Bedrock Configuration
AWS_REGION=eu-central-1
AWS_ACCESS_KEY_ID=your_access_key_or_DUMMY
AWS_SECRET_ACCESS_KEY=your_secret_key_or_DUMMY
AWS_BEARER_TOKEN_BEDROCK=your_bearer_token

# Bedrock Models
BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v1
BEDROCK_CHAT_MODEL=anthropic.claude-3-5-sonnet-20240620-v1:0
BEDROCK_MAX_TOKENS=4096
BEDROCK_TEMPERATURE=0.7
BEDROCK_TOP_P=0.9

# Model Selection
CHAT_MODEL=OpenAI|HuggingFace|Ollama|Bedrock
OPENAI_MODEL=gpt-3.5-turbo
EMBEDDING_MODEL=Bedrock

# Ollama Configuration (if using local models)
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama2

# Optional: Couchbase (for chat history)
COUCHBASE_URL=couchbase://localhost
COUCHBASE_USERNAME=your_username
COUCHBASE_PASSWORD=your_password
COUCHBASE_CHAT_BUCKET=chat
COUCHBASE_CHAT_SCOPE=_default
COUCHBASE_CHAT_COLLECTION=_default

# Application Configuration
PORT=3001
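
A startup check over these variables might look like the sketch below. It is a plain-TypeScript illustration, not the contents of `scripts/validate-env.ts`: the helper names are hypothetical, and the choice of which variables count as required is an assumption made for the example.

```typescript
// Variable names come from the sample .env above; treating exactly these
// three as required is an assumption for this sketch.
const REQUIRED_VARS = ["PINECONE_API_KEY", "PINECONE_INDEX_NAME", "AWS_REGION"] as const;

type Env = Record<string, string | undefined>;

// Return the names of required variables that are unset or blank.
function findMissingVars(env: Env, required: readonly string[] = REQUIRED_VARS): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Parse PORT, falling back to the documented default of 3001.
function resolvePort(env: Env): number {
  const raw = env["PORT"] ?? "3001";
  const port = Number.parseInt(raw, 10);
  if (Number.isNaN(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${raw}`);
  }
  return port;
}
```

In a Bun or Node process the `env` argument would typically be `process.env`.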

Using This Workspace as a Template

Template-Ready Configuration

This workspace is designed as a production-ready template that can be easily adapted for new Bun projects. It follows the comprehensive patterns outlined in bun-reviewer.md and serves as a technology-agnostic starting point.

✅ Template Features

Universal Bun workspace patterns that work for any project type
Technology-agnostic catalog system for flexible dependency management
Comprehensive development tooling (linting, formatting, testing, monitoring)
Production-ready configurations for deployment and monitoring
Cross-platform compatibility with Node.js fallbacks

🔄 Quick Template Usage

To use this workspace structure for a new project:

Copy Configuration Files:

# Essential template files
cp package.json bun.config.ts tsconfig.json biome.json new-project/
cp -r scripts/ new-project/scripts/

Update Package Names and Catalog:

// Update root package.json
{
  "name": "your-project-monorepo",
  "workspaces": {
    "catalog": {
      // Add your project-specific dependencies
      "your-framework": "^1.0.0"
    }
  }
}

Adapt Package Structure:

# Create your packages following the pattern
mkdir -p packages/your-backend packages/your-frontend
# Copy and adapt package.json templates from existing packages

Customize Bun Configuration:

// Update bun.config.ts for your specific needs
export default {
  workspace: {
    packages: {
      "your-backend": { target: "bun" },
      "your-frontend": { target: "browser" }
    }
  }
}

📋 Template Checklist

When adapting this template:

CRITICAL: Run workspace validation before any changes: `bun run scripts/workspace-validator.ts`
Detect framework types in packages to avoid configuration conflicts
Update all `@ollama-prompting/*` references to your project name
Customize the workspace catalog with your dependencies
Apply framework-specific TypeScript configurations (see Framework Guide below)
Adapt the scripts in `scripts/` for your environment variables
Update Biome configuration for your code style preferences
Validate TypeScript paths (no multiple wildcards allowed)
Customize test configuration for your testing strategy
Run final validation: `bun run workspace:check`
Update CLAUDE.md with your project-specific information

⚠️ Critical Framework-Specific Configurations

BEFORE modifying any TypeScript configurations, run:

bun run scripts/workspace-validator.ts --detect-frameworks

SvelteKit Projects:

// ❌ NEVER DO THIS - Causes configuration conflicts
{
  "extends": ["../../tsconfig.json", "./.svelte-kit/tsconfig.json"],
  "compilerOptions": {
    "baseUrl": ".",
    "paths": { "@/*": ["./src/*"] } // CONFLICT!
  }
}

// ✅ CORRECT - SvelteKit-first approach
{
  "extends": "./.svelte-kit/tsconfig.json",
  "compilerOptions": {
    "composite": true,
    "declaration": true
    // NO baseUrl/paths - use svelte.config.js instead
  }
}

// ✅ Path aliases in svelte.config.js
export default {
  kit: {
    alias: {
      '@': './src',
      '@/components': './src/components'
    }
  }
}

React/Next.js Projects:

// ✅ SAFE - Can extend root workspace config
{
  "extends": "../../tsconfig.json",
  "compilerOptions": {
    "jsx": "react-jsx",
    "baseUrl": ".",
    "paths": {
      "@/*": ["./src/*"] // Single wildcard OK
    }
  }
}

Node.js/Express/Elysia Projects:

// ✅ SAFE - Full workspace integration
{
  "extends": "../../tsconfig.json",
  "compilerOptions": {
    "types": ["bun-types", "node"],
    "baseUrl": ".",
    "paths": {
      "@/*": ["./src/*"],
      "@/services/*": ["./src/services/*"]
    }
  }
}

🏗️ Architecture Patterns

The template provides these reusable patterns:

1. Dependency Management:

// Centralized version control through catalog
"catalog": {
  "shared-dep": "^1.0.0"
}
// Package references always stay in sync
"dependencies": {
  "shared-dep": "catalog:"
}

2. Cross-Package TypeScript:

// Root tsconfig with workspace references
"references": [
  { "path": "./packages/package-a" },
  { "path": "./packages/package-b" }
]

3. Runtime Detection:

// Universal pattern for Bun/Node compatibility
const isBun = () => typeof Bun !== 'undefined';

4. Performance Monitoring:

// Built-in performance measurement
const measure = isBun() ? Bun.nanoseconds() : performance.now() * 1_000_000;

5. Environment Validation:

// Automatic environment validation on startup
// Configurable per project in scripts/validate-env.ts

🔧 Customization Points

For Different Project Types:

Full-Stack Apps: Keep backend/frontend structure
Library Monorepos: Replace with lib/cli/docs structure
Microservices: Use service-a/service-b/shared structure
Tool Development: Use core/cli/plugins structure

Technology Adaptations:

React Frontend: Replace Svelte dependencies in catalog
Express Backend: Replace Elysia dependencies in catalog
Database Projects: Add database tooling to catalog
CLI Tools: Focus on Bun build optimizations

🚀 Agent Instructions Compatibility

This template is designed to work seamlessly with the `@bun-reviewer` agent patterns:

# When requesting workspace setup:
"Set up a Bun workspace with catalogs following the ollama-prompting template"

# Agent will understand:
- Catalog-based dependency management
- Production-ready bun.config.ts
- Comprehensive TypeScript setup
- Quality assurance tooling
- Performance monitoring patterns

The workspace structure serves as a reference implementation that agents can follow for consistent, production-ready Bun workspace setups.

Troubleshooting

Workspace-Specific Issues

⚠️ CRITICAL: Common Implementation Errors (LEARN FROM THESE!)

Error 1: TypeScript Path Pattern Wildcards

# ❌ ERROR MESSAGE:
# Invalid pattern "packages/*/src/*", must have at most one "*" character

# 🔍 ROOT CAUSE:
# Multiple wildcards in TypeScript path patterns are invalid

# ✅ PREVENTION:
bun run scripts/workspace-validator.ts  # Run BEFORE implementation

# ✅ FIX:
# Replace generic patterns with explicit ones:
"@backend/*": ["packages/backend/src/*"]  # ✅ Single wildcard
"@frontend/*": ["packages/frontend/src/*"]  # ✅ Single wildcard

Error 2: SvelteKit Configuration Conflicts

# ❌ ERROR MESSAGE:
# "baseUrl and/or paths in your tsconfig.json interferes with SvelteKit"

# 🔍 ROOT CAUSE:
# SvelteKit auto-generates tsconfig.json and conflicts with custom baseUrl/paths

# ✅ PREVENTION:
# Detect SvelteKit before applying workspace patterns:
bun run scripts/workspace-validator.ts --detect-frameworks

# ✅ FIX:
# 1. Remove baseUrl/paths from tsconfig.json
# 2. Add aliases to svelte.config.js instead:
export default {
  kit: {
    alias: {
      '@': './src',
      '@/components': './src/components'
    }
  }
}

Error 3: Catalog Reference Loops

# ❌ ERROR: Dependencies not resolving from catalog

# 🔍 ROOT CAUSE:
# Circular references or missing catalog entries

# ✅ PREVENTION:
bun install --dry-run  # Test before committing
bun run workspace:info  # Check catalog status

# ✅ FIX:
# Verify catalog syntax in root package.json:
"workspaces": {
  "catalog": {
    "typescript": "^5.9.2",  // ✅ Valid version
    "zod": "catalog:"        // ❌ Circular reference
  }
}

🛡️ Error Prevention Protocol

MANDATORY Steps for ALL Workspace Implementations:

Pre-Implementation Validation:

# Run BEFORE making ANY changes
bun run scripts/workspace-validator.ts

Framework Detection:

# Detect frameworks to apply correct patterns
grep -r "@sveltejs/kit" packages/*/package.json   # SvelteKit
grep -r '"react"' packages/*/package.json         # React
grep -r '"next"' packages/*/package.json          # Next.js

Post-Implementation Testing:

# Test configuration works
bun run typecheck
bun run workspace:check
bun run dev:backend & sleep 5 && kill %1  # Quick startup test
bun run dev:frontend & sleep 5 && kill %1

Final Validation:

# Ensure no errors before completion
bun run scripts/workspace-validator.ts
echo "✅ Workspace validation passed"

Catalog Dependency Problems

Problem: Dependencies not resolving from catalog

Solution:

# Verify catalog configuration
bun run workspace:info

# Reinstall with catalog (Bun 1.2+ writes bun.lock; older versions write bun.lockb)
rm -rf node_modules packages/*/node_modules bun.lock bun.lockb
bun install

# Check for catalog syntax errors
bun run workspace:check

TypeScript Workspace Errors

Problem: Cross-package imports not working

Solution:

# Rebuild TypeScript references
bun run typecheck --build

# Check path mapping in tsconfig.json
# Verify package references are correct

Bun Configuration Issues

Problem: `bun.config.ts` not being recognized

Symptoms:

Preload scripts not running
Test configuration ignored
Build settings not applied

Solution:

# Verify config syntax
bun --config ./bun.config.ts --help

# Check for TypeScript compilation errors
bun run typecheck

# Ensure scripts/ directory exists with proper permissions
ls -la scripts/

Biome Linting Conflicts

Problem: Linting errors in Svelte or test files

Solution:

# Check Biome configuration overrides
cat biome.json | grep -A 10 "overrides"

# Run with specific configuration
bun run lint --config-path ./biome.json

# Fix auto-fixable issues
bun run lint:fix

Performance Script Errors

Problem: Development or production monitoring scripts failing

Solution:

# Check Zod validation in validate-env.ts
bun run scripts/validate-env.ts

# Verify environment variables
bun run workspace:check

# Test runtime detection
node -e "console.log(typeof Bun)"  # Should be 'undefined'
bun -e "console.log(typeof Bun)"   # Should be 'object'

Frontend-Backend Connection Issues

Problem: Frontend shows "Failed to fetch models" with "Available Models: 0"

Root Cause: Frontend cannot reach backend API endpoints

Solution:

Verify Vite Proxy Configuration in `frontend/vite.config.ts`:

server: {
  port: 5174,
  proxy: {
    '/api': {
      target: 'http://localhost:3001',
      changeOrigin: true,
      secure: false,
    },
  },
}

Use Relative URLs in frontend code:

// ✅ Correct
const MODELS_URL = "/api/ollama/models";

// ❌ Incorrect (causes CORS/proxy issues)
const MODELS_URL = "http://localhost:3001/api/ollama/models";

Restart Frontend Server after Vite config changes: `bun run dev`

Database Connection Issues

Pinecone Connection Issues

Pinecone is the PRIMARY database (per ADR-0003). If Pinecone fails:

RAG functionality will not work
Document upload and processing will fail
Check API key and index configuration

AWS Bedrock Token Limits

Amazon Titan Text Embeddings has an 8,192 token limit. The system automatically handles this by:

Token Estimation: Roughly 1 token ≈ 4 characters
Smart Chunking: Splits text by sentences, then words if needed
Embedding Averaging: Creates averaged embeddings for large texts
Rate Limiting: Adds delays between chunk requests

If you see "Too many input tokens" errors:

The system will automatically chunk and retry
Large documents may take longer to process
Monitor logs for chunking progress
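
The token estimation, chunking, and averaging steps described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `pineconeService.ts`: the function names are hypothetical, the sentence splitter is a simple regular expression, and rate limiting between requests is left out. A single word longer than the budget is not split further in this sketch.

```typescript
const MAX_TOKENS = 8192; // Titan embedding input limit cited above
const CHARS_PER_TOKEN = 4; // rough heuristic from this document

// Estimate tokens from character count (1 token is roughly 4 characters).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Split text into chunks under the token budget: by sentence first, then
// by word for any sentence that is still too long on its own.
function chunkText(text: string, maxTokens: number = MAX_TOKENS): string[] {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const units: string[] = [];
  for (const sentence of sentences) {
    if (sentence.length <= maxChars) {
      units.push(sentence);
    } else {
      units.push(...(sentence.match(/\S+\s*/g) ?? []));
    }
  }
  const chunks: string[] = [];
  let current = "";
  for (const unit of units) {
    // Start a new chunk when adding this unit would exceed the budget.
    if (current !== "" && current.length + unit.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += unit;
  }
  if (current.trim() !== "") {
    chunks.push(current.trim());
  }
  return chunks;
}

// Element-wise mean of equally sized embedding vectors.
function averageEmbeddings(vectors: number[][]): number[] {
  if (vectors.length === 0) throw new Error("No embeddings to average");
  const dim = vectors[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const vector of vectors) {
    if (vector.length !== dim) throw new Error("Embedding dimensions differ");
    for (let i = 0; i < dim; i++) sum[i] += vector[i];
  }
  return sum.map((x) => x / vectors.length);
}
```

The character-based estimate is deliberately conservative in spirit but not exact, so a real tokenizer may still count more tokens than this heuristic predicts for some inputs.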

Couchbase Connection Issues (Optional)

Symptoms: Backend logs show `UnambiguousTimeoutError: unambiguous timeout`

Impact: With graceful error handling implemented, this should NOT crash the application.

Expected Behavior:

✅ Ollama models still load successfully
✅ Chat functionality continues to work
✅ Frontend shows empty chat history instead of errors
⚠️ Chat history is not persisted (in-memory only)

Error Handling Implementation:

`ChatService.saveChatHistory()`: Catches errors and continues without throwing
`ChatService.getChatHistory()`: Returns empty array instead of throwing errors
Main chat endpoint: Wraps history saves in non-blocking try-catch blocks
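
The graceful-degradation behavior described above can be sketched as below. This is not the actual `ChatService`: the `HistoryStore` interface is a hypothetical stand-in for the Couchbase client, and returning a boolean from the save method is an assumption added so callers can tell whether persistence succeeded.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical storage interface standing in for the Couchbase client.
interface HistoryStore {
  save(sessionId: string, message: ChatMessage): Promise<void>;
  load(sessionId: string): Promise<ChatMessage[]>;
}

class ChatHistory {
  private readonly store: HistoryStore;

  constructor(store: HistoryStore) {
    this.store = store;
  }

  // Log and swallow storage errors so the chat request itself still succeeds.
  async saveChatHistory(sessionId: string, message: ChatMessage): Promise<boolean> {
    try {
      await this.store.save(sessionId, message);
      return true;
    } catch (error) {
      console.warn("Chat history not persisted:", error);
      return false;
    }
  }

  // Return an empty history instead of propagating storage errors.
  async getChatHistory(sessionId: string): Promise<ChatMessage[]> {
    try {
      return await this.store.load(sessionId);
    } catch (error) {
      console.warn("Chat history unavailable:", error);
      return [];
    }
  }
}
```

With this shape, a Couchbase timeout shows up as a warning in the logs and an empty history in the frontend rather than a failed request.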

Common Development Issues

Tailwind CSS v4 Issues

The project uses Tailwind CSS v4.1+ with modern configuration:

Current Configuration (`postcss.config.js`):

export default {
  plugins: {
    '@tailwindcss/postcss': {},
    autoprefixer: {}
  }
};

CSS Import (`app.css`):

@import 'tailwindcss';

If you encounter PostCSS errors:

Verify `@tailwindcss/postcss` is installed:

bun add -D @tailwindcss/postcss

Ensure all Tailwind plugins are properly installed
Restart the development server after configuration changes

Bun vs Node.js

This project is optimized for Bun runtime (per ADR-0002):

Backend uses `bun run src/index.ts`
Frontend can use `bun run dev` or `npm run dev`
Package management prefers `bun install`

RAG Pipeline Troubleshooting

Common RAG Issues

Problem: "I couldn't find any relevant information in the documents"

Root Cause: Using wrong RAG service endpoint or document not uploaded to correct service

Solution:

Verify Correct Service Pairing:

Upload with: `/api/rag/upload-pdf/pinecone-document`
Chat with: `/api/rag/chat/pinecone-document`

Check Session and Filename Matching:

# Ensure sessionId and filename match exactly between upload and chat
curl -X POST http://localhost:3001/api/rag/chat/pinecone-document \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Your question here",
    "sessionId": "exact-session-id-from-upload",
    "filename": "exact-filename-from-upload.pdf"
  }'

Problem: "Retrieved 0 chunks with scores: []"

Solutions:

Verify document was successfully uploaded (check upload response)
Ensure sessionId and filename match exactly
Check Pinecone API key and index configuration
Verify the document contains relevant content for your query

Service-Specific Guidance

Standard Pinecone RAG (`/api/rag/chat/pinecone`)

Best for: General document Q&A with advanced query processing
Features: Query classification, legal analysis, keyword enhancement
Use when: You need intelligent query processing and document analysis

Session-Based RAG (`/api/rag/chat/pinecone-document`)

Best for: Isolated document sessions with enhanced metadata
Features: Session isolation, enhanced PDF metadata, LangSmith threading
Use when: You need document isolation and detailed tracing

Couchbase RAG (`/api/rag/chat/couchbase`)

Best for: Alternative vector storage with model flexibility
Features: Couchbase integration, model factory support
Use when: You need Couchbase integration or specific model requirements

Health Check Commands

# Test backend directly
curl http://localhost:3001/api/ollama/models

# Test frontend proxy (should return same result)
curl http://localhost:5174/api/ollama/models

# Test RAG document upload
curl -X POST http://localhost:3001/api/rag/upload-pdf/pinecone-document \
  -H "Content-Type: application/json" \
  -d '{"file":"base64-encoded-pdf", "filename":"test.pdf"}'

# Test RAG chat
curl -X POST http://localhost:3001/api/rag/chat/pinecone-document \
  -H "Content-Type: application/json" \
  -d '{"message":"test question", "sessionId":"session-id", "filename":"test.pdf"}'

# Check servers are running
lsof -i :3001  # Backend (Native Bun HTTP)
lsof -i :5174  # Frontend (Vite dev server)
lsof -i :11434 # Ollama (if running locally)

# Monitor LangSmith traces
# Visit: https://smith.langchain.com/
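
The first two checks above (backend directly, then through the frontend proxy) can also be automated. The sketch below compares both responses; `getJson` is a hypothetical injected helper so the function can be exercised without running servers, and the ports are the ones used in this guide.

```typescript
// Injected JSON fetcher; in a real script this could be
// (url) => fetch(url).then((res) => res.json()).
type GetJson = (url: string) => Promise<unknown>;

// Fetch the same path from the backend (3001) and through the Vite proxy (5174)
// and report whether both return the same payload. The comparison is a plain
// JSON string comparison, which is enough when both responses originate from
// the same backend handler.
async function checkProxyParity(
  getJson: GetJson,
  path: string = "/api/ollama/models",
): Promise<boolean> {
  const [direct, proxied] = await Promise.all([
    getJson(`http://localhost:3001${path}`),
    getJson(`http://localhost:5174${path}`),
  ]);
  return JSON.stringify(direct) === JSON.stringify(proxied);
}
```

A `false` result points at the proxy configuration rather than the backend, since the direct request succeeded.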

Testing Requirements

CRITICAL: Preventing Breaking Changes

Based on past issues with traceable wrapper implementations, ALL changes to RAG services must follow these mandatory testing requirements:

Before Making ANY Changes to Traceable Wrappers

1. Syntax Validation (MANDATORY)

# ALWAYS run these checks BEFORE implementing changes
cd packages/backend
bun run typecheck
bun run lint

2. Backup Working State

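# Create a branch backup before making changes
git checkout -b backup-before-traceable-fixes
git add . && git commit -m "Backup working state before traceable changes"
# Then switch to a separate working branch so the backup branch stays untouched
# (the branch name here is only an example)
git checkout -b traceable-changes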

During Implementation

3. Traceable Wrapper Pattern Requirements

CRITICAL: All `traceable` function calls must be properly closed with `});`
CRITICAL: All processing logic must be indented INSIDE the traceable wrapper
CRITICAL: Return statements must be INSIDE the traceable wrapper, BEFORE the closing `});`

Example of CORRECT traceable wrapper pattern:

// ✅ CORRECT Pattern
const documentProcessingTracer = langsmithService.createDocumentProcessingTracer();
try {
    const processingResult = await documentProcessingTracer(async () => {
        console.log("Starting processing...");

        // ALL processing logic goes here with proper indentation
        const result = await processDocument();

        return result; // Return INSIDE wrapper
    }); // 🔥 CRITICAL: Close the traceable wrapper

// Error handling OUTSIDE wrapper
} catch (error) {
    console.error("Error:", error);
    throw error;
}
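
To see why the return statement has to sit inside the callback, here is a self-contained sketch of the same shape. `createExampleTracer` is a hypothetical stand-in that only records start and end events; it is not the LangSmith API and not the implementation behind `langsmithService.createDocumentProcessingTracer()`.

```typescript
type Tracer = <T>(fn: () => Promise<T>) => Promise<T>;

// Hypothetical stand-in for a tracer factory: it records when the wrapped
// callback starts and ends, and passes the callback's return value through.
function createExampleTracer(name: string, events: string[]): Tracer {
  return async <T>(fn: () => Promise<T>): Promise<T> => {
    events.push(`start:${name}`);
    try {
      return await fn(); // the callback's return value becomes the wrapper's result
    } finally {
      events.push(`end:${name}`);
    }
  };
}

async function processWithTracing(events: string[]): Promise<string> {
  const tracer = createExampleTracer("document-processing", events);
  try {
    const result = await tracer(async () => {
      const chunks = ["chunk-1", "chunk-2"]; // placeholder for real processing
      return `${chunks.length} chunks processed`; // return INSIDE the wrapper
    }); // wrapper closed here
    return result;
  } catch (error) {
    // Error handling stays OUTSIDE the wrapper
    throw new Error(`Document processing failed: ${String(error)}`);
  }
}
```

If the inner `return` were moved after the closing `});`, the wrapper would resolve to `undefined` and the traced span would no longer cover the value being returned.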

4. Incremental Testing

# Test syntax after EACH service modification
bun run typecheck

# If errors found, STOP and fix before continuing

After Implementation

5. Comprehensive Testing (MANDATORY)

# Test all RAG services can start without syntax errors
cd packages/backend
bun run dev &
sleep 5
# If there are no startup errors, leave the backend running for the proxy test

# Test frontend can connect to backend (the backend must still be running)
cd ../frontend
bun run dev &
sleep 5
curl http://localhost:5174/api/ollama/models

# Stop both dev servers
pkill -f "bun run dev"

6. RAG Pipeline Integration Testing

# Test document upload (use provided test PDF)
curl -X POST http://localhost:3001/api/rag/upload-pdf/pinecone-document \
  -F "file=@Formula One World Drivers' Champions (1950-2024).pdf" \
  -H "Accept: application/json"

# Test RAG chat with uploaded document
# (the apostrophe in the filename is written as '\'' because the body is single-quoted)
curl -X POST http://localhost:3001/api/rag/chat/pinecone-document \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Who won the championship in 1950?",
    "sessionId": "SESSION_ID_FROM_UPLOAD",
    "filename": "Formula One World Drivers'\'' Champions (1950-2024).pdf"
  }'

7. LangSmith Trace Verification

Verify traces appear in LangSmith dashboard
Check that hierarchical grouping is working
Confirm parent-child trace relationships
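
The parent-child check can be sketched as a small function over exported run data. The `TraceRun` shape below is a simplified assumption for illustration; real LangSmith run objects carry more fields and may name them differently.

```typescript
// Simplified, hypothetical view of a trace run.
interface TraceRun {
  id: string;
  name: string;
  parentId?: string; // undefined for the root (session parent) run
}

// Return the runs whose parent is not present in the same batch. An empty
// result means every child run is attached to a known parent.
function findOrphanRuns(runs: TraceRun[]): TraceRun[] {
  const knownIds = new Set(runs.map((run) => run.id));
  return runs.filter(
    (run) => run.parentId !== undefined && !knownIds.has(run.parentId),
  );
}

// Example with made-up runs: the chat run points at a parent that is missing.
const exampleRuns: TraceRun[] = [
  { id: "session", name: "session-parent" },
  { id: "upload", name: "document-processing", parentId: "session" },
  { id: "chat", name: "rag-chat", parentId: "missing-parent" },
];
const orphans = findOrphanRuns(exampleRuns);
```

Any run reported here would show up in the dashboard as a separate top-level trace instead of being grouped under its session.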

Files That Require Special Attention

RAG Services with Traceable Wrappers:

`packages/backend/src/services/ragService.ts` - Couchbase RAG
`packages/backend/src/services/pineconeService.ts` - Standard Pinecone RAG
`packages/backend/src/services/pineconeDocumentService.ts` - Session-based Pinecone RAG

Common Error Patterns to Avoid:

❌ Unclosed traceable wrapper: `await documentProcessingTracer(async () => {` without matching `});`
❌ Logic outside wrapper: Processing code not properly indented inside the wrapper
❌ Return outside wrapper: `return` statement after the `});` instead of before
❌ Malformed syntax: Missing parentheses, brackets, or semicolons

Emergency Recovery

If Implementation Breaks the Pipeline:

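# Immediately revert to working state
# (stash or discard uncommitted changes first, for example with git stash,
# so the working tree actually matches the backup)
git checkout backup-before-traceable-fixes
git branch -D failed-traceable-implementation  # Replace with the name of the branch holding the broken changes, if you created one

# Restart from working baseline
# Re-read this testing guide before attempting fixes again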

Testing Philosophy

"Test Early, Test Often, Never Break Production"

Incremental Changes: Modify one service at a time
Immediate Validation: Test syntax after each change
Integration Testing: Verify end-to-end pipeline after all changes
Rollback Ready: Always have a working backup to revert to

This testing protocol exists because traceable wrapper syntax errors can completely break the RAG pipeline, making the application unusable. Following these requirements prevents such critical failures.

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Core Components:

Backend: Bun + Elysia API server with LangChain, Pinecone, and multiple LLM providers
Frontend: SvelteKit 2.x application with Tailwind CSS v4
Vector Database: Pinecone for semantic search and document embeddings
AI Models: OpenAI, HuggingFace, Ollama, and AWS Bedrock support
Monitoring: LangSmith integration for AI model performance tracing

Key Features:

PDF document upload and processing with LangSmith tracing
Vector embeddings and semantic search via Pinecone
Multi-model AI chat (OpenAI, HuggingFace, Ollama, AWS Bedrock)
Real-time streaming responses with Server-Sent Events
Session-based document management with ULID-based tracking
LangSmith trace visualization and performance monitoring

Common Development Commands

Workspace Commands (from root)

# Installation and workspace management
bun install                    # Install all workspace dependencies using catalog
bun run workspace:info         # Show workspace package information
bun run workspace:update       # Update all dependencies to latest versions
bun run workspace:check        # Run typecheck and lint across workspace

# Development
bun run dev                    # Start both backend and frontend in parallel
bun run dev:backend            # Start only backend (port 3001)
bun run dev:frontend           # Start only frontend (port 5174)

# Building
bun run build                  # Build both backend and frontend sequentially
bun run build:watch            # Build both packages in watch mode (parallel)
bun run build:backend          # Build only backend
bun run build:frontend         # Build only frontend

# Testing
bun run test                   # Run root-level Bun tests
bun run test:packages          # Run tests in all packages
bun run test:watch             # Run tests in watch mode
bun run test:coverage          # Run tests with coverage reporting

# Code quality
bun run typecheck              # Run TypeScript checking across all packages
bun run lint                   # Lint all packages with Biome
bun run lint:fix               # Lint and auto-fix issues
bun run format                 # Format all code with Biome

# Maintenance
bun run clean                  # Clean all build artifacts and caches
bun run clean:packages         # Clean only package-level artifacts

Backend Development

cd packages/backend
# Dependencies managed by workspace catalog - install from root

# Development
bun run dev                    # Start development server with hot reload (port 3001)
bun run start                  # Start production server

# Building
bun run build                  # Build optimized bundle for Bun runtime
bun run build:watch            # Build in watch mode

# Testing and Quality
bun run test                   # Run Bun native tests
bun run test:watch             # Run tests in watch mode
bun run test:coverage          # Run tests with coverage
bun run typecheck              # TypeScript type checking
bun run lint                   # Lint with Biome
bun run lint:fix               # Lint and auto-fix
bun run format                 # Format code with Biome

# Maintenance
bun run clean                  # Clean build artifacts

Frontend Development

cd packages/frontend
# Dependencies managed by workspace catalog - install from root

# Development
bun run dev                    # Start Vite dev server (port 5174)
bun run preview                # Preview production build

# Building
bun run build                  # Build for production
bun run build:watch            # Build in watch mode

# SvelteKit specific
bun run check                  # SvelteKit sync and type checking
bun run check:watch           # Continuous SvelteKit checking

# Testing and Quality
bun run test                   # Run Vitest tests
bun run test:watch             # Run tests in watch mode
bun run test:coverage          # Run tests with coverage
bun run typecheck              # TypeScript type checking
bun run lint                   # Lint with Biome
bun run lint:fix               # Lint and auto-fix
bun run format                 # Format code with Biome

# Maintenance
bun run clean                  # Clean build artifacts and cache

Workspace Configuration

Bun Workspace with Catalog System

This project uses Bun workspaces with catalogs for optimal dependency management and development experience. The workspace follows bun-reviewer.md best practices for production-ready patterns.

Key Features

✅ Dependency Catalog: Centralized version management for shared dependencies
✅ Bun Optimization: Frozen lockfile, peer dependencies, and caching configuration
✅ TypeScript Workspace: Composite projects with cross-package type checking
✅ Advanced Build Config: Environment-aware optimization with
```
bun.config.ts
```
✅ Quality Assurance: Biome for linting/formatting, comprehensive testing setup
✅ Performance Monitoring: Development and production monitoring scripts

Workspace Structure

ollama-prompting/
├── package.json              # Root workspace with catalog
├── tsconfig.json             # Root TypeScript configuration
├── bun.config.ts             # Advanced Bun configuration
├── biome.json                # Code quality configuration
├── scripts/                  # Workspace utilities
│   ├── validate-env.ts       # Environment validation
│   ├── dev-utils.ts          # Development tools
│   └── prod-monitoring.ts    # Production monitoring
└── packages/
    ├── backend/              # @ollama-prompting/backend
    └── frontend/             # @ollama-prompting/frontend

Dependency Catalog Benefits

// Root package.json - Centralized version management
"workspaces": {
  "packages": ["packages/*"],
  "catalog": {
    "typescript": "^5.9.2",
    "elysia": "^1.4.5",
    "svelte": "^5.38.10"
    // ... all shared dependencies
  }
}

// Package usage - Always in sync
"dependencies": {
  "elysia": "catalog:",
  "typescript": "catalog:"
}

Advanced Bun Configuration

The

bun.config.ts

provides:

Performance Monitoring: Nanosecond precision timing and alerts
Environment-Aware Settings: Different configs for dev/prod
Test Configuration: Coverage thresholds and reporting
Security Settings: Trusted dependencies and validation
Preload Scripts: Automatic environment validation and utilities

// Key configuration highlights
export default {
  preload: ["./scripts/validate-env.ts"],
  test: {
    coverage: { enabled: true, threshold: 80 }
  },
  build: {
    target: "bun",
    minify: isProduction,
    sourcemap: isDevelopment ? "inline" : "external"
  }
}

TypeScript Workspace Setup

Root
tsconfig.json
: Composite project with path mapping
Package extends: Each package extends root configuration
Cross-package imports: Type-safe imports between packages
Incremental builds: Faster compilation with declaration maps

// Path mapping for workspace packages
"paths": {
  "@ollama-prompting/backend/*": ["packages/backend/src/*"],
  "@ollama-prompting/frontend/*": ["packages/frontend/src/*"]
}

Code Quality with Biome

Comprehensive linting and formatting configuration:

Performance rules: Optimized for Bun runtime patterns
Security rules: No dangerous patterns, validated globals
Svelte support: Frontend-specific overrides
Workspace globals:
```
Bun
```
,
```
globalThis
```
, development tools

Runtime Detection Patterns

All code includes proper runtime detection for Bun/Node compatibility:

// Runtime-aware utilities
const isBun = () => typeof Bun !== 'undefined';

// Performance measurement
const timer = isBun() ? Bun.nanoseconds() : performance.now() * 1_000_000;

// File operations
const data = isBun()
  ? await Bun.file('config.json').json()
  : JSON.parse(await fs.readFile('config.json', 'utf-8'));

Architecture Overview

System Design Philosophy

Based on the architecture documentation (@docs/architecture.md), this system follows a modern, scalable design:

Core Goals:

Document Intelligence: Extract and understand complex document structures
Semantic Search: Find relevant information using vector embeddings
Intelligent Chat: Provide contextual responses about document content
Scalable Architecture: Handle multiple documents and concurrent users
Developer Experience: Modern tooling with Bun, Svelte, and TypeScript

Backend Structure (

/packages/backend

)

Framework: Native Bun HTTP server with TypeScript (no framework dependencies)
Services Layer: Modular services in
```
/packages/backend/src/services/
```
- ```
chatService.ts
```
  - Core chat functionality with graceful Couchbase handling
- ```
ragService.ts
```
  - Couchbase vector store RAG implementation
- ```
pineconeService.ts
```
  - Standard Pinecone vector operations with advanced query processing
- ```
pineconeDocumentService.ts
```
  - Session-based document processing with enhanced metadata
- ```
streamService.ts
```
  - Real-time response streaming with Server-Sent Events
- ```
langsmithService.ts
```
  - AI model performance tracing/monitoring with hierarchical threading
- ```
kafkaDbService.ts
```
  - SQLite-based Kafka message storage and retrieval
- ```
modelFactory.ts
```
  - Multi-model support factory
Models: Support for Ollama, OpenAI, HuggingFace, AWS Bedrock, and Groq
Database:
- Primary: Pinecone for vector storage (two separate implementations)
- Optional: Couchbase for vector operations and chat history
- Local: SQLite for Kafka message storage

Frontend Structure (

/packages/frontend

)

Framework: SvelteKit 2.x with Svelte 5.35+
Build Tool: Vite 7.x with proxy configuration for API routing
Routes: Located in
```
/packages/frontend/src/routes/
```
- ```
/
```
  - Home page with navigation
- ```
/ollama-chat
```
  - Direct Ollama chat interface
- ```
/rag-chat
```
  - RAG-enhanced document chat interface with LangSmith tracing
- ```
/simon
```
  - Alternative chat interface
Components: Enhanced components in
```
/packages/frontend/src/components/
```
- ```
LangSmithTraceViewer.svelte
```
  - Real-time trace visualization
- ```
Navigation.svelte
```
  - Application navigation
Styling: Tailwind CSS v4.1+ with comprehensive plugin ecosystem
- ```
@tailwindcss/typography
```
  - Rich text styling
- ```
@tailwindcss/forms
```
  - Form component styling
- ```
@tailwindcss/container-queries
```
  - Modern responsive design
- ```
@tailwindcss/aspect-ratio
```
  - Aspect ratio utilities

RAG Pipeline Architecture

The system implements three separate RAG pipelines for different use cases:

1. Standard Pinecone RAG (

/api/rag/chat/pinecone

)

Service:
```
pineconeService.ts
```
Purpose: General-purpose document processing with advanced query classification
Features:
- Advanced keyword extraction and query enhancement
- Legal document analysis with specialized prompts
- Token-aware text chunking for AWS Bedrock (8,192 token limit)
- Embedding averaging for large documents
- Query intent classification (legal_analysis, general_inquiry, etc.)
Upload:
```
/api/rag/upload-pdf/pinecone
```
Chat:
```
/api/rag/chat/pinecone
```

2. Session-Based Document RAG (

/api/rag/chat/pinecone-document

)

Service:
```
pineconeDocumentService.ts
```
Purpose: Session-isolated document processing with enhanced metadata
Features:
- ULID-based session management for document isolation
- Enhanced PDF metadata extraction (title, author, dates)
- Session parent trace creation for LangSmith threading
- Direct AWS Bedrock Claude integration (bypasses model factory)
- Session-specific vector storage with filename filtering
Upload:
```
/api/rag/upload-pdf/pinecone-document
```
Chat:
```
/api/rag/chat/pinecone-document
```

3. Couchbase Vector RAG (

/api/rag/chat/couchbase

)

Service:
```
ragService.ts
```
Purpose: Alternative vector storage using Couchbase Vector Search
Features:
- Couchbase cluster integration with graceful fallback
- LangChain CouchbaseVectorStore integration
- Model factory integration for flexible LLM selection
- Comprehensive error handling for connection issues
Upload:
```
/api/rag/upload-pdf/couchbase
```
Chat:
```
/api/rag/chat/couchbase
```

RAG Data Flow Architecture

Document Processing Flow:

PDF Upload → PDF Loading (LangChain) → Text Splitting →
Token Estimation → Text Chunking → AWS Bedrock Embeddings →
Vector Storage (Pinecone/Couchbase) → Session/Metadata Storage

Chat Query Flow:

User Query → Query Enhancement → Embedding Generation →
Vector Similarity Search → Context Retrieval →
Prompt Construction → LLM Generation → Streaming Response

Advanced Features:

Smart Chunking: Automatic text splitting with token awareness
Query Classification: Intent-based prompt selection
Session Isolation: Document-specific contexts using ULID sessions
Metadata Filtering: Precise document retrieval with filename/session filters
Streaming Responses: Real-time response delivery via Server-Sent Events
LangSmith Integration: Comprehensive tracing and performance monitoring

API Endpoints

Core Chat Endpoints

GET /api/ollama/models
- List available Ollama models
GET /api/ollama/chat/history
- Retrieve chat history
POST /api/ollama/chat
- Direct Ollama chat (streaming)
GET /api/rag/model-info
- Get current model configuration

RAG Document Processing Endpoints

POST /api/rag/upload-pdf/pinecone
- Upload PDF to standard Pinecone RAG

POST /api/rag/upload-pdf/pinecone-document
- Upload PDF to session-based RAG

POST /api/rag/upload-pdf/couchbase
- Upload PDF to Couchbase vector store

RAG Chat Endpoints

POST /api/rag/chat/pinecone
- Chat with standard Pinecone documents
- Required:
```
{ message, sessionId, filename }
```
- Optional:
```
{ queryType }
```
  for intent classification
POST /api/rag/chat/pinecone-document
- Chat with session-based documents
- Required:
```
{ message, sessionId, filename }
```
- Uses: Direct AWS Bedrock Claude integration
POST /api/rag/chat/couchbase
- Chat with Couchbase vector documents
- Required:
```
{ message, sessionId, filename }
```
- Features: Model factory integration

Kong Integration Endpoints

POST /api/kong/chat
- Kong Konnect AI chat with streaming
GET /api/kong/kafka/:topic
- Kong Kafka topic integration

Kafka Storage Endpoints

GET /api/kafka
- Retrieve stored Kafka messages
GET /api/kafka/topics
- List available Kafka topics

Key Integration Points

API Communication: Frontend proxy routes
```
/api/*
```
to backend on port 3001
Streaming: Server-Sent Events for real-time response streaming
Environment Variables: Backend uses
```
.env
```
for all configuration
Vector Search: Documents embedded and stored in Pinecone for semantic retrieval
Session Management: ULID-based session tracking per document
LangSmith Tracing: All endpoints support trace headers for performance monitoring

Model Configuration

The backend uses a factory pattern (

modelFactory.ts

) to support multiple LLM providers:

Ollama: Local models for offline use
OpenAI: GPT models for chat, embeddings via API
HuggingFace: Alternative models with local hosting support
AWS Bedrock: Claude models for chat, Titan embeddings (recommended)
Groq: High-performance inference

Important Patterns

Service Pattern: All major functionality is encapsulated in service classes
Streaming: Uses Server-Sent Events for real-time responses
Error Handling: Graceful degradation when Couchbase is unavailable
Type Safety: Full TypeScript support in both frontend and backend

Environment Configuration

Required Environment Variables (Backend .env)

Based on @docs/architecture.md and @docs/adr/0003-use-pinecone-vector-database.md:

# Pinecone Configuration (Primary Vector Database)
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX_NAME=your_index_name
PINECONE_NAMESPACE=your_namespace

# AI Model Configuration
OPENAI_API_KEY=your_openai_api_key
HUGGINGFACE_API_KEY=your_huggingface_api_key
HUGGINGFACE_MODEL=facebook/opt-1.3b
HUGGINGFACE_EMBEDDING_MODEL=nlpaueb/legal-bert-base-uncased

# LangSmith Configuration (for AI tracing)
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_langchain_api_key
LANGCHAIN_PROJECT=your_project_name

# AWS Bedrock Configuration
AWS_REGION=eu-central-1
AWS_ACCESS_KEY_ID=your_access_key_or_DUMMY
AWS_SECRET_ACCESS_KEY=your_secret_key_or_DUMMY
AWS_BEARER_TOKEN_BEDROCK=your_bearer_token

# Bedrock Models
BEDROCK_EMBEDDING_MODEL=amazon.titan-embed-text-v1
BEDROCK_CHAT_MODEL=anthropic.claude-3-5-sonnet-20240620-v1:0
BEDROCK_MAX_TOKENS=4096
BEDROCK_TEMPERATURE=0.7
BEDROCK_TOP_P=0.9

# Model Selection
CHAT_MODEL=OpenAI|HuggingFace|Ollama|Bedrock
OPENAI_MODEL=gpt-3.5-turbo
EMBEDDING_MODEL=Bedrock

# Ollama Configuration (if using local models)
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama2

# Optional: Couchbase (for chat history)
COUCHBASE_URL=couchbase://localhost
COUCHBASE_USERNAME=your_username
COUCHBASE_PASSWORD=your_password
COUCHBASE_CHAT_BUCKET=chat
COUCHBASE_CHAT_SCOPE=_default
COUCHBASE_CHAT_COLLECTION=_default

# Application Configuration
PORT=3001

Using This Workspace as a Template

Template-Ready Configuration

✅ Template Features

Universal Bun workspace patterns that work for any project type
Technology-agnostic catalog system for flexible dependency management
Comprehensive development tooling (linting, formatting, testing, monitoring)
Production-ready configurations for deployment and monitoring
Cross-platform compatibility with Node.js fallbacks

🔄 Quick Template Usage

To use this workspace structure for a new project:

Copy Configuration Files:

# Essential template files
cp package.json bun.config.ts tsconfig.json biome.json new-project/
cp -r scripts/ new-project/scripts/

Update Package Names and Catalog:

// Update root package.json
{
  "name": "your-project-monorepo",
  "workspaces": {
    "catalog": {
      // Add your project-specific dependencies
      "your-framework": "^1.0.0"
    }
  }
}

Adapt Package Structure:

# Create your packages following the pattern
mkdir -p packages/your-backend packages/your-frontend
# Copy and adapt package.json templates from existing packages

Customize Bun Configuration:

// Update bun.config.ts for your specific needs
export default {
  workspace: {
    packages: {
      "your-backend": { target: "bun" },
      "your-frontend": { target: "browser" }
    }
  }
}

📋 Template Checklist

When adapting this template:

CRITICAL: Run workspace validation before any changes:
```
bun run scripts/workspace-validator.ts
```
Detect framework types in packages to avoid configuration conflicts
Update all
```
@ollama-prompting/*
```
references to your project name
Customize the workspace catalog with your dependencies
Apply framework-specific TypeScript configurations (see Framework Guide below)
Adapt the scripts in
```
scripts/
```
for your environment variables
Update Biome configuration for your code style preferences
Validate TypeScript paths (no multiple wildcards allowed)
Customize test configuration for your testing strategy
Run final validation:
```
bun run workspace:check
```
Update CLAUDE.md with your project-specific information

⚠️ Critical Framework-Specific Configurations

BEFORE modifying any TypeScript configurations, run:

bun run scripts/workspace-validator.ts --detect-frameworks

SvelteKit Projects:

// ❌ NEVER DO THIS - Causes configuration conflicts
{
  "extends": ["../../tsconfig.json", "./.svelte-kit/tsconfig.json"],
  "compilerOptions": {
    "baseUrl": ".",
    "paths": { "@/*": ["./src/*"] } // CONFLICT!
  }
}

// ✅ CORRECT - SvelteKit-first approach
{
  "extends": "./.svelte-kit/tsconfig.json",
  "compilerOptions": {
    "composite": true,
    "declaration": true
    // NO baseUrl/paths - use svelte.config.js instead
  }
}

// ✅ Path aliases in svelte.config.js
export default {
  kit: {
    alias: {
      '@': './src',
      '@/components': './src/components'
    }
  }
}

React/Next.js Projects:

// ✅ SAFE - Can extend root workspace config
{
  "extends": "../../tsconfig.json",
  "compilerOptions": {
    "jsx": "react-jsx",
    "baseUrl": ".",
    "paths": {
      "@/*": ["./src/*"] // Single wildcard OK
    }
  }
}

Node.js/Express/Elysia Projects:

// ✅ SAFE - Full workspace integration
{
  "extends": "../../tsconfig.json",
  "compilerOptions": {
    "types": ["bun-types", "node"],
    "baseUrl": ".",
    "paths": {
      "@/*": ["./src/*"],
      "@/services/*": ["./src/services/*"]
    }
  }
}

🏗️ Architecture Patterns

The template provides these reusable patterns:

1. Dependency Management:

// Centralized version control through catalog
"catalog": {
  "shared-dep": "^1.0.0"
}
// Package references always stay in sync
"dependencies": {
  "shared-dep": "catalog:"
}

2. Cross-Package TypeScript:

// Root tsconfig with workspace references
"references": [
  { "path": "./packages/package-a" },
  { "path": "./packages/package-b" }
]

3. Runtime Detection:

// Universal pattern for Bun/Node compatibility
const isBun = () => typeof Bun !== 'undefined';

4. Performance Monitoring:

// Built-in performance measurement
const measure = isBun() ? Bun.nanoseconds() : performance.now() * 1_000_000;

5. Environment Validation:

// Automatic environment validation on startup
// Configurable per project in scripts/validate-env.ts

🔧 Customization Points

For Different Project Types:

Full-Stack Apps: Keep backend/frontend structure
Library Monorepos: Replace with lib/cli/docs structure
Microservices: Use service-a/service-b/shared structure
Tool Development: Use core/cli/plugins structure

Technology Adaptations:

React Frontend: Replace Svelte dependencies in catalog
Express Backend: Replace Elysia dependencies in catalog
Database Projects: Add database tooling to catalog
CLI Tools: Focus on Bun build optimizations

🚀 Agent Instructions Compatibility

This template is designed to work seamlessly with the

@bun-reviewer

agent patterns:

# When requesting workspace setup:
"Set up a Bun workspace with catalogs following the ollama-prompting template"

# Agent will understand:
- Catalog-based dependency management
- Production-ready bun.config.ts
- Comprehensive TypeScript setup
- Quality assurance tooling
- Performance monitoring patterns

The workspace structure serves as a reference implementation that agents can follow for consistent, production-ready Bun workspace setups.

Troubleshooting

Workspace-Specific Issues

⚠️ CRITICAL: Common Implementation Errors (LEARN FROM THESE!)

Error 1: TypeScript Path Pattern Wildcards

# ❌ ERROR MESSAGE:
# Invalid pattern "packages/*/src/*", must have at most one "*" character

# 🔍 ROOT CAUSE:
# Multiple wildcards in TypeScript path patterns are invalid

# ✅ PREVENTION:
bun run scripts/workspace-validator.ts  # Run BEFORE implementation

# ✅ FIX:
# Replace generic patterns with explicit ones:
"@backend/*": ["packages/backend/src/*"]  # ✅ Single wildcard
"@frontend/*": ["packages/frontend/src/*"]  # ✅ Single wildcard

Error 2: SvelteKit Configuration Conflicts

# ❌ ERROR MESSAGE:
# "baseUrl and/or paths in your tsconfig.json interferes with SvelteKit"

# 🔍 ROOT CAUSE:
# SvelteKit auto-generates tsconfig.json and conflicts with custom baseUrl/paths

# ✅ PREVENTION:
# Detect SvelteKit before applying workspace patterns:
bun run scripts/workspace-validator.ts --detect-frameworks

# ✅ FIX:
# 1. Remove baseUrl/paths from tsconfig.json
# 2. Add aliases to svelte.config.js instead:
export default {
  kit: {
    alias: {
      '@': './src',
      '@/components': './src/components'
    }
  }
}

Error 3: Catalog Reference Loops

# ❌ ERROR: Dependencies not resolving from catalog

# 🔍 ROOT CAUSE:
# Circular references or missing catalog entries

# ✅ PREVENTION:
bun install --dry-run  # Test before committing
bun run workspace:info  # Check catalog status

# ✅ FIX:
# Verify catalog syntax in root package.json:
"workspaces": {
  "catalog": {
    "typescript": "^5.9.2",  // ✅ Valid version
    "zod": "catalog:"        // ❌ Circular reference
  }
}

🛡️ Error Prevention Protocol

MANDATORY Steps for ALL Workspace Implementations:

Pre-Implementation Validation:

# Run BEFORE making ANY changes
bun run scripts/workspace-validator.ts

Framework Detection:

# Detect frameworks to apply correct patterns
grep -r "@sveltejs/kit" packages/*/package.json   # SvelteKit
grep -r '"react"' packages/*/package.json         # React
grep -r '"next"' packages/*/package.json          # Next.js

Post-Implementation Testing:

# Test configuration works
bun run typecheck
bun run workspace:check
bun run dev:backend & sleep 5 && kill %1  # Quick startup test
bun run dev:frontend & sleep 5 && kill %1

Final Validation:

# Ensure no errors before completion
bun run scripts/workspace-validator.ts
echo "✅ Workspace validation passed"

Catalog Dependency Problems

Problem: Dependencies not resolving from catalog

Solution:

# Verify catalog configuration
bun run workspace:info

# Reinstall with catalog
rm -rf node_modules packages/*/node_modules bun.lockb
bun install

# Check for catalog syntax errors
bun run workspace:check

TypeScript Workspace Errors

Problem: Cross-package imports not working

Solution:

# Rebuild TypeScript references
bun run typecheck --build

# Check path mapping in tsconfig.json
# Verify package references are correct

Bun Configuration Issues

Problem:

bun.config.ts

not being recognized

Symptoms:

Preload scripts not running
Test configuration ignored
Build settings not applied

Solution:

# Verify config syntax
bun --config ./bun.config.ts --help

# Check for TypeScript compilation errors
bun run typecheck

# Ensure scripts/ directory exists with proper permissions
ls -la scripts/

Biome Linting Conflicts

Problem: Linting errors in Svelte or test files

Solution:

# Check Biome configuration overrides
cat biome.json | grep -A 10 "overrides"

# Run with specific configuration
bun run lint --config-path ./biome.json

# Fix auto-fixable issues
bun run lint:fix

Performance Script Errors

Problem: Development or production monitoring scripts failing

Solution:

# Check Zod validation in validate-env.ts
bun run scripts/validate-env.ts

# Verify environment variables
bun run workspace:check

# Test runtime detection
node -e "console.log(typeof Bun)"  # Should be 'undefined'
bun -e "console.log(typeof Bun)"   # Should be 'object'

Frontend-Backend Connection Issues

Problem: Frontend shows "Failed to fetch models" with "Available Models: 0"

Root Cause: Frontend cannot reach backend API endpoints

Solution:

Verify Vite Proxy Configuration in

frontend/vite.config.ts

server: {
  port: 5174,
  proxy: {
    '/api': {
      target: 'http://localhost:3001',
      changeOrigin: true,
      secure: false,
    },
  },
}

Use Relative URLs in frontend code:

// ✅ Correct
const MODELS_URL = "/api/ollama/models";

// ❌ Incorrect (causes CORS/proxy issues)
const MODELS_URL = "http://localhost:3001/api/ollama/models";

Restart Frontend Server after Vite config changes:
```
bun run dev
```

Database Connection Issues

Pinecone Connection Issues

Pinecone is the PRIMARY database (per ADR-0003). If Pinecone fails:

RAG functionality will not work
Document upload and processing will fail
Check API key and index configuration

AWS Bedrock Token Limits

Amazon Titan Text Embeddings has an 8,192 token limit. The system automatically handles this by:

Token Estimation: Roughly 1 token ≈ 4 characters
Smart Chunking: Splits text by sentences, then words if needed
Embedding Averaging: Creates averaged embeddings for large texts
Rate Limiting: Adds delays between chunk requests

If you see "Too many input tokens" errors:

The system will automatically chunk and retry
Large documents may take longer to process
Monitor logs for chunking progress

Couchbase Connection Issues (Optional)

Symptoms: Backend logs show

UnambiguousTimeoutError: unambiguous timeout

Impact: With graceful error handling implemented, this should NOT crash the application.

Expected Behavior:

✅ Ollama models still load successfully
✅ Chat functionality continues to work
✅ Frontend shows empty chat history instead of errors
⚠️ Chat history is not persisted (in-memory only)

Error Handling Implementation:

`ChatService.saveChatHistory()`: Catches errors and continues without throwing
`ChatService.getChatHistory()`: Returns an empty array instead of throwing errors
Main chat endpoint: Wraps history saves in non-blocking try-catch blocks
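
The behavior described above might look roughly like this. It is a sketch only: the `HistoryStore` interface, the `ChatMessage` shape, and the boolean return value of `saveChatHistory` are assumptions made for the example, and the class is a stand-in rather than the repository's actual `ChatService`.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical storage interface standing in for the Couchbase-backed store
interface HistoryStore {
  save(sessionId: string, messages: ChatMessage[]): Promise<void>;
  load(sessionId: string): Promise<ChatMessage[]>;
}

class ChatService {
  private readonly store: HistoryStore;

  constructor(store: HistoryStore) {
    this.store = store;
  }

  // Log and continue: a failed save must not break the chat response
  async saveChatHistory(sessionId: string, messages: ChatMessage[]): Promise<boolean> {
    try {
      await this.store.save(sessionId, messages);
      return true;
    } catch (error) {
      console.warn("Chat history not saved:", error);
      return false;
    }
  }

  // Return an empty array instead of throwing when the store is unreachable
  async getChatHistory(sessionId: string): Promise<ChatMessage[]> {
    try {
      return await this.store.load(sessionId);
    } catch (error) {
      console.warn("Chat history unavailable:", error);
      return [];
    }
  }
}
```

With a store that times out, `getChatHistory` resolves to `[]` and `saveChatHistory` resolves to `false`, so the chat endpoint can keep responding without persistence.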

Common Development Issues

Tailwind CSS v4 Issues

The project uses Tailwind CSS v4.1+ with modern configuration:

Current Configuration (`postcss.config.js`):

export default {
  plugins: {
    '@tailwindcss/postcss': {},
    autoprefixer: {}
  }
};

CSS Import (`app.css`):

@import 'tailwindcss';

If you encounter PostCSS errors:

Verify `@tailwindcss/postcss` is installed:

bun add -D @tailwindcss/postcss

Ensure all Tailwind plugins are properly installed
Restart the development server after configuration changes

Bun vs Node.js

This project is optimized for Bun runtime (per ADR-0002):

Backend uses `bun run src/index.ts`
Frontend can use `bun run dev` or `npm run dev`
Package management prefers `bun install`

RAG Pipeline Troubleshooting

Common RAG Issues

Problem: "I couldn't find any relevant information in the documents"

Root Cause: The chat request targets a different RAG service endpoint than the one the document was uploaded to

Solution:

Verify Correct Service Pairing:

Upload with: `/api/rag/upload-pdf/pinecone-document`
Chat with: `/api/rag/chat/pinecone-document`

Check Session and Filename Matching:

# Ensure sessionId and filename match exactly between upload and chat
curl -X POST http://localhost:3001/api/rag/chat/pinecone-document \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Your question here",
    "sessionId": "exact-session-id-from-upload",
    "filename": "exact-filename-from-upload.pdf"
  }'

Problem: "Retrieved 0 chunks with scores: []"

Solutions:

Verify document was successfully uploaded (check upload response)
Ensure sessionId and filename match exactly
Check Pinecone API key and index configuration
Verify the document contains relevant content for your query
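
Because `sessionId` and `filename` must match exactly, one way to avoid drift is to build the chat payload from the stored upload details rather than retyping them. The sketch below assumes the upload step gives you a `sessionId` and `filename` to keep; the `UploadRef` shape and both helper names are hypothetical, and only the three request fields come from the curl example above.

```typescript
// Assumed shape of the details kept from the upload step
interface UploadRef {
  sessionId: string;
  filename: string;
}

// Request fields as used in the curl example: message, sessionId, filename
interface ChatRequest extends UploadRef {
  message: string;
}

// Build the chat payload from the stored upload details instead of retyping them
function buildChatRequest(upload: UploadRef, message: string): ChatRequest {
  return { message, sessionId: upload.sessionId, filename: upload.filename };
}

// Report fields that differ from the upload, including case or whitespace differences
function findMismatches(upload: UploadRef, request: ChatRequest): string[] {
  const problems: string[] = [];
  if (request.sessionId !== upload.sessionId) problems.push("sessionId");
  if (request.filename !== upload.filename) problems.push("filename");
  return problems;
}
```

Calling `findMismatches` before sending a hand-written request is a cheap way to catch a filename that differs only in case or trailing whitespace.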

Service-Specific Guidance

Standard Pinecone RAG (`/api/rag/chat/pinecone`)

Best for: General document Q&A with advanced query processing
Features: Query classification, legal analysis, keyword enhancement
Use when: You need intelligent query processing and document analysis

Session-Based RAG (`/api/rag/chat/pinecone-document`)

Best for: Isolated document sessions with enhanced metadata
Features: Session isolation, enhanced PDF metadata, LangSmith threading
Use when: You need document isolation and detailed tracing

Couchbase RAG (`/api/rag/chat/couchbase`)

Best for: Alternative vector storage with model flexibility
Features: Couchbase integration, model factory support
Use when: You need Couchbase integration or specific model requirements

Health Check Commands

# Test backend directly
curl http://localhost:3001/api/ollama/models

# Test frontend proxy (should return same result)
curl http://localhost:5174/api/ollama/models

# Test RAG document upload
curl -X POST http://localhost:3001/api/rag/upload-pdf/pinecone-document \
  -H "Content-Type: application/json" \
  -d '{"file":"base64-encoded-pdf", "filename":"test.pdf"}'

# Test RAG chat
curl -X POST http://localhost:3001/api/rag/chat/pinecone-document \
  -H "Content-Type: application/json" \
  -d '{"message":"test question", "sessionId":"session-id", "filename":"test.pdf"}'

# Check servers are running
lsof -i :3001  # Backend (Native Bun HTTP)
lsof -i :5174  # Frontend (Vite dev server)
lsof -i :11434 # Ollama (if running locally)

# Monitor LangSmith traces
# Visit: https://smith.langchain.com/

Before Implementation

1. Run Typecheck and Lint

# ALWAYS run these checks BEFORE implementing changes
cd packages/backend
bun run typecheck
bun run lint

2. Backup Working State

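# Create a branch backup before making changes
git checkout -b backup-before-traceable-fixes
git add . && git commit -m "Backup working state before traceable changes"

# Make the changes on a separate branch so the backup stays untouched
# (Emergency Recovery deletes this branch if the implementation fails)
git checkout -b failed-traceable-implementation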

During Implementation

3. Traceable Wrapper Pattern Requirements

CRITICAL: All `traceable` function calls must be properly closed with `});`
CRITICAL: All processing logic must be indented INSIDE the traceable wrapper
CRITICAL: Return statements must be INSIDE the traceable wrapper, BEFORE the closing `});`

Example of CORRECT traceable wrapper pattern:

// ✅ CORRECT Pattern
try {
    const documentProcessingTracer = langsmithService.createDocumentProcessingTracer();
    const processingResult = await documentProcessingTracer(async () => {
        console.log("Starting processing...");

        // ALL processing logic goes here with proper indentation
        const result = await processDocument();

        return result; // Return INSIDE wrapper
    }); // 🔥 CRITICAL: Close the traceable wrapper

// Error handling OUTSIDE wrapper
} catch (error) {
    console.error("Error:", error);
    throw error;
}
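
To see why the return value has to come from inside the callback, here is a self-contained sketch of a tracer wrapper. `createTracer` and `recordedSpans` are hypothetical stand-ins written for this example; they do not call LangSmith and are not the implementation behind `langsmithService.createDocumentProcessingTracer()`.

```typescript
// Hypothetical stand-in for a tracer factory; it only records which spans ran
const recordedSpans: string[] = [];

function createTracer(name: string) {
  return async function trace<T>(work: () => Promise<T>): Promise<T> {
    recordedSpans.push(`${name}:start`);
    try {
      // Whatever `work` returns becomes the wrapper's result
      return await work();
    } finally {
      // Runs on success and on error, so the span is always closed
      recordedSpans.push(`${name}:end`);
    }
  };
}

async function processUpload(): Promise<number> {
  const tracer = createTracer("document-processing");
  try {
    const pageCount = await tracer(async () => {
      const pages = ["page 1", "page 2"]; // placeholder for real processing
      return pages.length; // return INSIDE the wrapper
    }); // wrapper closed here
    return pageCount;
  } catch (error) {
    // Error handling OUTSIDE the wrapper
    console.error("Error:", error);
    throw error;
  }
}
```

Only the code inside the callback runs between the start and end of the span, so logic or a `return` placed after the closing `});` is neither traced nor part of the wrapper's result.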

4. Incremental Testing

# Test syntax after EACH service modification
bun run typecheck

# If errors found, STOP and fix before continuing

After Implementation

5. Comprehensive Testing (MANDATORY)

# Test all RAG services can start without syntax errors
cd packages/backend
bun run dev &
sleep 5
# Check the output for startup errors; keep the backend running for the proxy test

# Test frontend can connect to backend (the proxy needs the backend up)
cd ../frontend
bun run dev &
sleep 5
curl http://localhost:5174/api/ollama/models

# Stop both dev servers
pkill -f "bun run dev"

6. RAG Pipeline Integration Testing

# Test document upload (use provided test PDF)
curl -X POST http://localhost:3001/api/rag/upload-pdf/pinecone-document \
  -F "file=@Formula One World Drivers' Champions (1950-2024).pdf" \
  -H "Accept: application/json"

# Test RAG chat with uploaded document
# The apostrophe in the filename is written as '\'' so it does not end the single-quoted JSON
curl -X POST http://localhost:3001/api/rag/chat/pinecone-document \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Who won the championship in 1950?",
    "sessionId": "SESSION_ID_FROM_UPLOAD",
    "filename": "Formula One World Drivers'\'' Champions (1950-2024).pdf"
  }'

7. LangSmith Trace Verification

Verify traces appear in LangSmith dashboard
Check that hierarchical grouping is working
Confirm parent-child trace relationships

Files That Require Special Attention

RAG Services with Traceable Wrappers:

`packages/backend/src/services/ragService.ts` - Couchbase RAG
`packages/backend/src/services/pineconeService.ts` - Standard Pinecone RAG
`packages/backend/src/services/pineconeDocumentService.ts` - Session-based Pinecone RAG

Common Error Patterns to Avoid:

❌ Unclosed traceable wrapper: `await documentProcessingTracer(async () => {` without matching `});`
❌ Logic outside wrapper: Processing code not properly indented inside the wrapper
❌ Return outside wrapper: `return` statement after the `});` instead of before
❌ Malformed syntax: Missing parentheses, brackets, or semicolons

Emergency Recovery

If Implementation Breaks the Pipeline:

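# Immediately revert to working state
# (stash or discard uncommitted changes first, or the checkout may refuse or carry them over)
git checkout backup-before-traceable-fixes
# Delete the branch that holds the failed changes, if one was created
git branch -D failed-traceable-implementation

# Restart from working baseline
# Re-read this testing guide before attempting fixes again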

Testing Philosophy

"Test Early, Test Often, Never Break Production"

Incremental Changes: Modify one service at a time
Immediate Validation: Test syntax after each change
Integration Testing: Verify end-to-end pipeline after all changes
Rollback Ready: Always have a working backup to revert to
