This document describes the AI agents and automation workflows in the Dharma Radio project.
Dharma Radio uses automated agents for data synchronization and processing. These agents run on Cloudflare Workers with scheduled cron jobs and can be triggered manually via API endpoints.
Location: `app/sync/sync-teachers.ts`
Purpose: Fetches and synchronizes teacher profiles from dharmaseed.org
How it works:
- Fetches paginated list of teachers (100 per page)
- Parses HTML to extract teacher metadata
- Upserts teachers into D1 database (conflict handling on dharmaSeedId)
- Updates existing records, inserts new ones
Key features:
- Retry logic with exponential backoff (3 attempts, max 10s delay)
- Structured logging for observability
- Batch processing with 1s delay between pages
- Slug generation from teacher name + ID
Manual trigger:
`GET /api/sync?command=teachers`
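The "slug generation from teacher name + ID" step could look like the following sketch. The `slugify` name and the exact normalization rules are assumptions for illustration, not necessarily what `sync-teachers.ts` does:

```typescript
// Hypothetical sketch of slug generation from teacher name + dharmaSeedId.
// Appending the ID guarantees uniqueness even when two teachers share a name.
export function slugify(name: string, dharmaSeedId: number): string {
  const base = name
    .toLowerCase()
    .normalize("NFD")                 // split accented chars into base + diacritic
    .replace(/[\u0300-\u036f]/g, "")  // strip the diacritic marks
    .replace(/[^a-z0-9]+/g, "-")      // collapse non-alphanumeric runs to hyphens
    .replace(/^-+|-+$/g, "");         // trim leading/trailing hyphens
  return `${base}-${dharmaSeedId}`;
}
```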
Location: `app/sync/sync-to-db.ts`
Purpose: Fetches and synchronizes dharma talks with all relationships
How it works:
- Fetches all teachers from database
- For each teacher, fetches their talks from dharmaseed.org RSS feeds
- Enriches talk data by fetching retreat and center information
- Batch inserts talks into database (50 at a time)
- Updates relationships (teacher, center, retreat)
Key features:
- Parallel processing with configurable concurrency
- HTML scraping for additional metadata (retreat, center)
- Retry logic on all network requests
- Batch processing for efficient database operations
- Skip-processing flag (`skipProcessing=true`) for faster syncs
Manual trigger:
`GET /api/sync?command=talks`
`GET /api/sync?command=talks&skipProcessing=true`
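The "parallel processing with configurable concurrency" above might be backed by a small worker-pool helper along these lines (`mapWithConcurrency` is a hypothetical name, not the project's actual API):

```typescript
// Sketch: run an async mapper over items, with at most `concurrency`
// in-flight calls at once. Results preserve input order.
export async function mapWithConcurrency<T, R>(
  items: T[],
  concurrency: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unclaimed index until drained.
  // Claiming is safe because JS runs synchronously between awaits.
  const worker = async () => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  };
  await Promise.all(
    Array.from({ length: Math.min(concurrency, items.length) }, worker),
  );
  return results;
}
```

With per-teacher talk fetching as the mapper, the concurrency limit keeps the sync from opening a connection per teacher against dharmaseed.org.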
Location: `app/routes/api.sync.ts`
Purpose: Orchestrates complete data synchronization
How it works:
- Syncs teachers first
- Then syncs talks (which depend on teachers)
- Returns detailed results for each operation
- Continues even if one operation fails (reports all results)
Manual trigger:
`GET /api/sync` or `GET /api/sync?command=all`
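The orchestration behavior described above (run steps in order, report every outcome, keep going on failure) can be sketched as follows; `runAll` and its result shape are illustrative, not the exact code in `api.sync.ts`:

```typescript
// Sketch: run named sync steps sequentially, recording each outcome.
// A failed step is reported but does not abort the remaining steps.
type StepResult = { success: boolean; error?: string };

export async function runAll(
  steps: Record<string, () => Promise<void>>,
): Promise<{ success: boolean; results: Record<string, StepResult> }> {
  const results: Record<string, StepResult> = {};
  let success = true;
  for (const [name, step] of Object.entries(steps)) {
    try {
      await step();
      results[name] = { success: true };
    } catch (err) {
      success = false;
      results[name] = { success: false, error: String(err) };
    }
  }
  return { success, results };
}
```

Ordering matters here: teachers are synced before talks because talk records reference teacher rows.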
Cron configuration (`wrangler.toml`, under `[triggers]`): `crons = ["0 */6 * * *"]` (every 6 hours).
The full sync agent runs automatically every 6 hours to keep data fresh.
HTML parsing:
- Parses HTML using jsdom (Node.js) or linkedom (Cloudflare Workers)
- Extracts structured data from dharmaseed.org pages

Retry (`retry.ts`):
- Implements exponential backoff with jitter
- Configurable max attempts and delay
- Used by all network operations

Logging:
- Structured logging with context
- Supports info, debug, and error levels
- Includes timing and metadata

Batch processing (`batch.ts`):
- Processes large datasets in chunks
- Configurable batch size
- Memory-efficient for large syncs
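The retry behavior described above (3 attempts, capped delay, exponential backoff with jitter) could be sketched like this; `withRetry` and the default values are assumptions, not the exact contents of `retry.ts`:

```typescript
// Sketch: retry an async operation with exponential backoff and jitter.
export interface RetryOptions {
  maxAttempts?: number;
  baseDelayMs?: number;
  maxDelayMs?: number;
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

export async function withRetry<T>(
  fn: () => Promise<T>,
  { maxAttempts = 3, baseDelayMs = 500, maxDelayMs = 10_000 }: RetryOptions = {},
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts - 1) break; // out of attempts
      // Exponential backoff: base * 2^attempt, capped at maxDelayMs, with
      // random jitter so concurrent retries don't stampede the upstream server.
      const backoff = Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
      await sleep(backoff / 2 + Math.random() * (backoff / 2));
    }
  }
  throw lastError;
}
```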
All agents include:
- Try-catch blocks around critical operations
- Structured error logging with context
- Graceful degradation (continues on non-fatal errors)
- Retry logic for transient failures
- HTTP error status codes in API responses
Logs: All agents use structured logging with:
- Operation name (e.g., "sync-teachers")
- Timing information (duration)
- Record counts
- Error details with stack traces
API response format (`success` is `false` if any step failed):

```json
{
  "success": true,
  "results": {
    "teachers": { "success": true },
    "talks": { "success": true }
  },
  "message": "Sync completed"
}
```
- Ensure the local D1 database is initialized: `pnpm d1:init:local`
- Start the dev server: `pnpm dev`
- Trigger a sync via curl (quote URLs containing `&`, otherwise the shell treats it as a background operator):
  - `curl "http://127.0.0.1:8788/api/sync?command=teachers"`
  - `curl "http://127.0.0.1:8788/api/sync?command=talks&skipProcessing=true"`
- Create a new file in `app/sync/`
- Implement the main sync function with a `D1Database` parameter
- Add a case to the `api.sync.ts` switch statement
- Include retry logic and structured logging
- Use batch processing for large datasets
- Test locally before deploying
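A new agent following this checklist might start from a skeleton like the one below. The `syncCenters` name, the `centers` table, the placeholder data, and the stubbed `D1Like` interface are all illustrative assumptions; real code would use the `D1Database` type from `@cloudflare/workers-types`:

```typescript
// Hypothetical skeleton for a new sync agent. The D1-like interface is
// stubbed so the sketch stands alone.
interface D1Like {
  prepare(sql: string): { bind(...values: unknown[]): { run(): Promise<unknown> } };
}

export async function syncCenters(db: D1Like): Promise<{ success: boolean; count: number }> {
  // 1. Fetch source data (wrapped in retry logic in a real agent).
  const centers = [{ id: 1, name: "Example Center" }]; // placeholder for fetched data
  // 2. Upsert each record; a real agent would batch these writes.
  for (const c of centers) {
    await db
      .prepare(
        "INSERT INTO centers (id, name) VALUES (?, ?) " +
          "ON CONFLICT(id) DO UPDATE SET name = excluded.name",
      )
      .bind(c.id, c.name)
      .run();
  }
  return { success: true, count: centers.length };
}
```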
- Use `skipProcessing=true` for a faster initial sync
- Adjust batch sizes in `batch.ts` (default: 50)
- Configure retry delays in `retry.ts`
- Add indexes to the schema for new query patterns
- Monitor D1 query performance in Wrangler logs
Based on `.cursorrules`, planned future agents:
- Audio File Sync Agent: Sync audio files to Cloudflare R2 for CDN delivery
- Transcription Agent: Use AI to transcribe dharma talks
- Analysis Agent: Analyze talk content and generate metadata
- RSS Feed Generator: Create custom RSS feeds for users
- Playlist Curator: Generate personalized playlists based on listening history