Nano Banana Pro
Agent skill for nano-banana-pro
You are building production-grade systems medicine infrastructure.
Sign in to like and favorite skills
You are building production-grade systems medicine infrastructure.
Aeon Cascade is a multi-factor, all-in-one health assistant that uses systems medicine to discover synergistic interventions across multiple conditions, powered by INDRA bio-ontology and structural causal models.
Patient Profile:
Clinical Challenge: Sarah has TWO interconnected conditions—not independent diseases but a unified metabolic-inflammatory syndrome with shared molecular mechanisms:
PM2.5 → Oxidative Stress (ROS) → { ├─→ NF-κB → IL-6 → CRP (Inflammation) └─→ JNK → IRS-1 inhibition → Insulin Resistance (Prediabetes) }
Traditional Approach (siloed):
Systems Medicine Approach (our system):
Query: "If Sarah moves from LA to Seattle (PM2.5: 10 µg/m³), how will both her inflammation AND metabolic markers respond?"
System Output:
Clinical Impact: One environmental intervention reverses two chronic conditions by targeting shared upstream mechanisms.
User → Telegram → aeon_cascade_frontend (bot.py) ↓ [Health Query Detection] ↓ ┌──────────┴──────────┐ ↓ ↓ Health Query General Query ↓ ↓ INDRA Agent (direct) OpenAI GPT-4 (Bio-ontology) (Conversational) ↓ ↓ AWS Bedrock (Claude) Chat Response INDRA Bio-Ontology ↓ Formatted Result ↓ ↓ Telegram Reply Telegram Reply
✅ Integrated Health Intelligence: INDRA agent runs inside bot.py via direct Python imports ✅ Automatic Detection: Health keywords trigger INDRA bio-ontology analysis ✅ Fallback Support: Falls back to OpenAI if INDRA unavailable or non-health queries ✅ Single Container: One Docker container runs both Telegram bot + INDRA agent ✅ Evidence-Based: Causal pathways backed by scientific papers from INDRA knowledge graph
Location:
/aeon_cascade_frontend/
Status: ✅ Production-ready with INDRA integration
Capabilities:
Technology Stack:
Location:
/indra_agent/
Status: ✅ Integrated into aeon_cascade_frontend via direct Python imports
Capabilities:
Technology Stack:
Deployment Mode: Python modules imported directly into aeon_cascade_frontend/bot.py (NO HTTP API)
Location:
/indra_agent/services/local_ontology/
Status: ✅ Operational (Writer KG trial ended, local system deployed)
Architecture:
Ontologies Integrated:
Technology Stack:
Integration Status:
KG_INTEGRATION_PLAN.md for complete integration roadmapDeployment:
# Start Memgraph (via Docker Compose) cd /Users/noot/Documents/digitalme docker-compose -f docker-compose.local-ontology.yml up -d # Verify database health python3 -c " import asyncio from indra_agent.services.local_ontology import MemgraphClient async def health_check(): client = MemgraphClient(uri='bolt://localhost:7687') await client.connect() stats = await client.get_stats() print(f'Total entities: {stats[\"total_entities\"]:,}') print(f'Total relationships: {stats[\"total_relationships\"]:,}') print(f'Namespaces: {stats[\"namespaces\"]}') await client.close() asyncio.run(health_check()) "
Known Limitations:
The integration uses direct Python imports for performance and simplicity:
# aeon_cascade_frontend/bot/bot.py from indra_agent.core.client import INDRAAgentClient from indra_agent.core.models import ( CausalDiscoveryRequest, UserContext, Query, RequestOptions ) # Initialize client at startup (singleton) indra_client = INDRAAgentClient() # Query processing (no HTTP calls) async def query_indra_health_system(user_id: int, message_text: str): request = CausalDiscoveryRequest( request_id=str(uuid.uuid4()), user_context=UserContext( user_id=str(user_id), genetics=db.get_user_attribute(user_id, 'health_genetics') or {}, current_biomarkers=db.get_user_attribute(user_id, 'health_biomarkers') or {}, location_history=db.get_user_attribute(user_id, 'health_location_history') or [] ), query=Query(text=message_text), options=RequestOptions() ) # Direct function call - no HTTP overhead response = await indra_client.process_request(request) return format_indra_response(response)
The bot automatically detects health-related queries:
def is_health_query(message_text: str) -> bool: """Detect health-related queries for INDRA routing.""" health_keywords = [ 'biomarker', 'crp', 'il-6', 'inflammation', 'oxidative stress', 'pollution', 'pm2.5', 'air quality', 'exposure', 'gene', 'genetic', 'variant', 'gstm1', 'health', 'risk', 'causal', 'pathway', 'mechanism', 'environmental', 'affect', 'impact', 'influence', 'molecular', 'protein', 'cytokine' ] return any(keyword in message_text.lower() for keyword in health_keywords)
Trigger Examples:
async def message_handle_fn(): # Check if health query if _message and is_health_query(_message) and INDRA_AVAILABLE: # Route to INDRA agent indra_result = await query_indra_health_system(user_id, _message) if indra_result['success']: # Display INDRA response await update.message.reply_text(indra_result['response'], parse_mode=ParseMode.HTML) return # Fall through to OpenAI for non-health or failed queries chatgpt_instance = openai_utils.ChatGPT(model=current_model) # ... existing OpenAI logic
Edit
aeon_cascade_frontend/config/config.env:
# Telegram & OpenAI TELEGRAM_TOKEN=your-telegram-bot-token OPENAI_API_KEY=your-openai-api-key # MongoDB MONGODB_PORT=27017 # AWS Bedrock (for INDRA health intelligence) AWS_ACCESS_KEY_ID=your-aws-access-key-id AWS_SECRET_ACCESS_KEY=your-aws-secret-access-key AWS_REGION=us-east-1 # Optional INDRA_BASE_URL=https://db.indra.bio IQAIR_API_KEY=your-iqair-api-key-optional
Edit
aeon_cascade_frontend/config/config.yml:
telegram_token: ${TELEGRAM_TOKEN} openai_api_key: ${OPENAI_API_KEY} allowed_telegram_usernames: [] # Empty = allow all users
cd aeon_cascade_frontend/ # Build and run all services docker-compose --env-file config/config.env up --build
What happens:
/opt/indra_agent for editable installServices Started:
chatgpt_telegram_bot: Main bot with INDRA integrationmongo: MongoDB databasemongo_express: Database admin UI (http://localhost:8081)# Install both projects pip install -e . cd aeon_cascade_frontend/ pip install -r requirements.txt # Run bot directly python3 bot/bot.py
The Docker setup uses parent directory context to access both projects:
# aeon_cascade_frontend/docker-compose.yml services: chatgpt_telegram_bot: build: context: ".." # Parent directory (digitalme/) dockerfile: aeon_cascade_frontend/Dockerfile
# aeon_cascade_frontend/Dockerfile FROM cgr.dev/chainguard-private/python:3.11-dev # Install aeon_cascade_frontend dependencies COPY aeon_cascade_frontend/requirements.txt /tmp/requirements.txt RUN pip3 install -r /tmp/requirements.txt # Copy and install indra_agent (permanent location for editable install) COPY indra_agent /opt/indra_agent COPY pyproject.toml /opt/pyproject.toml RUN cd /opt && pip3 install -e . # Copy aeon_cascade_frontend code COPY aeon_cascade_frontend /code WORKDIR /code CMD ["bash"]
Container Filesystem:
/opt/indra_agent/ # Permanent copy for editable install ├── agents/ ├── core/ └── services/ /code/ # Working directory ├── bot/bot.py # Imports: from indra_agent.core.client import ... ├── config/ └── requirements.txt
User: "How does PM2.5 pollution affect CRP biomarkers?"
Bot Response:
🧬 Health Intelligence Report 📊 Key Insights: 1. PM2.5 exposure increases inflammatory biomarkers through oxidative stress pathways 2. Causal chain: PM2.5 → NF-κB activation → IL-6 elevation → CRP increase 3. Based on 312 peer-reviewed scientific papers 🔬 Causal Analysis: • 5 biological entities identified • 4 causal relationships found • Based on 312 scientific papers • Analysis time: 2847ms 🔗 Top Causal Pathways: PM2.5 ⬆️ NF-κB Evidence: 47 papers, Effect: 0.82, Lag: 6h NF-κB ⬆️ IL-6 Evidence: 89 papers, Effect: 0.87, Lag: 12h IL-6 ⬆️ CRP Evidence: 312 papers, Effect: 0.98, Lag: 6h 💡 This analysis uses INDRA bio-ontology for evidence-based causal pathways.
User: "What's the weather like in San Francisco?"
Bot Response: [Standard ChatGPT response using OpenAI]
aeon_cascade_frontend/config/config.env (shared)Users can store personal health context in MongoDB for personalized analysis:
# Store user genetics db.set_user_attribute(user_id, 'health_genetics', { 'GSTM1': 'null', 'CYP1A1': 'T/T' }) # Store current biomarkers db.set_user_attribute(user_id, 'health_biomarkers', { 'CRP': 5.2, # mg/L 'IL-6': 3.8 # pg/mL }) # Store location history (for environmental exposure analysis) db.set_user_attribute(user_id, 'health_location_history', [ { 'city': 'San Francisco', 'start_date': '2024-01-01', 'end_date': '2024-06-01', 'avg_pm25': 12.5 } ])
This context is automatically included in INDRA queries for personalized health insights.
Positioning: "Mechanism Explorer for Informed Health Decisions"
Status: Production deployment cleared (Ship Blocker #5 RESOLVED ✅)
A tool that shows validated biological mechanisms connecting exposures, genetics, and biomarkers:
Problem: "My doctor says reduce PM2.5, but I don't feel different, so I skip air filter usage"
Our Value: "PM2.5 → NF-κB (6h lag) → IL-6 (12h lag) → CRP (6h lag). Measure CRP at 24h to see effect."
Impact: Understanding mechanism → better compliance → better outcomes
Problem: "Should we target NF-κB or JAK-STAT pathway for inflammation research?"
Our Value: "NF-κB → IL-6 (89 papers, belief 0.87). JAK-STAT → IL-6 (127 papers, belief 0.92). Consider JAK-STAT."
Impact: Evidence-based target selection → faster discovery
Problem: "I feel worse when I eat gluten, but my doctor says it's psychosomatic"
Our Value: "Gliadin → Zonulin (tight junction disruption) → Intestinal Permeability → IL-6 (inflammation). Mechanism exists."
Impact: Validation (not crazy) → informed self-monitoring → better communication with providers
Strong Capabilities (Validated via Ship Blockers 1-5):
Clear Limitations (Documented in HONEST_ARCHITECTURE.md):
Likely Exempt under Clinical Decision Support (CDS) exemption (21 USC § 360j(o)(1)(E)):
Exemption Criteria (we meet ALL):
Why We Qualify:
See SHIP_BLOCKER_5_RESOLVED.md for complete regulatory analysis.
Transparency > Paternalism
Informed Decisions > Blind Adherence
Right Side of History
Every Query Result Includes:
⚠️ IMPORTANT DISCLAIMER This shows VALIDATED BIOLOGY (peer-reviewed literature via INDRA bio-ontology). What this means: ✅ This mechanism EXISTS in humans (evidence: X papers, belief: Y) ✅ This temporal lag is TYPICAL for this pathway (estimate: Z hours) ✅ This effect size is POPULATION AVERAGE (not personalized to you) What this does NOT mean: ❌ This WILL happen to YOU (genetics, microbiome, environment vary) ❌ This is medical advice (consult healthcare provider) ❌ This guarantees outcomes (monitor YOUR biomarkers to validate) How to use this information: 1. Understand mechanism (WHY intervention affects target → adherence) 2. Measure YOUR response (test biomarkers at suggested timepoints) 3. Collaborate with providers (share mechanisms, discuss monitoring plan) Population biology ≠ Personalized prediction. Monitor YOUR response.
This system has been systematically validated through 5 Ship Blockers:
Engineering Distinction: Not just "looks reasonable" — empirically validated against expert curation, with transparent limitations.
See Documentation:
SHIP_BLOCKER_5_RESOLVED.md: Complete positioning decisionSHIP_BLOCKERS_PROGRESS.md: Overall validation progressHONEST_ARCHITECTURE.md: Brutally honest capabilities vs limitationsWeek 1 (Documentation): ⏳ PENDING
Week 2 (UI Updates): ✅ COMPLETED
Future (Optional Validation):
Decision Point: Only pursue clinical validation if early adoption shows clear impact on adherence/outcomes.
File:
aeon_cascade_frontend/bot/bot.py
Lines 37-51: Import INDRA modules
from indra_agent.core.client import INDRAAgentClient from indra_agent.core.models import (...)
Lines 60-68: Initialize INDRA client singleton
indra_client = INDRAAgentClient()
Lines 111-133: Health query detection function
def is_health_query(message_text: str) -> bool:
Lines 136-201: INDRA query processing function
async def query_indra_health_system(user_id: int, message_text: str):
Lines 204-272: Result formatting for Telegram
def format_indra_response(response) -> str:
Lines 827-880: Message handler integration
if _message and is_health_query(_message) and INDRA_AVAILABLE: indra_result = await query_indra_health_system(user_id, _message)
digitalme/ ├── pyproject.toml # Root project config for indra_agent ├── indra_agent/ # Health intelligence backend │ ├── agents/ # LangGraph agents │ │ ├── supervisor.py # Orchestration │ │ ├── indra_query_agent.py # INDRA queries │ │ ├── web_researcher.py # Environmental data │ │ ├── state.py # State management │ │ └── graph.py # Workflow definition │ ├── core/ │ │ ├── client.py # Main client interface │ │ └── models.py # Pydantic models │ ├── services/ │ │ ├── grounding_service.py # Entity grounding │ │ ├── indra_service.py # INDRA API wrapper (legacy) │ │ ├── indra_production_client.py # Production INDRA client (NEW) │ │ ├── indra_network_builder.py # Complete network builder (NEW) │ │ └── graph_builder.py # Graph construction │ ├── examples/ │ │ └── download_full_network.py # Network download example (NEW) │ └── config/ │ ├── agent_config.py # Agent prompts │ └── cached_responses.py # Pre-cached paths └── aeon_cascade_frontend/ # Telegram bot ├── bot/ │ ├── bot.py # Main bot (imports indra_agent) │ ├── config.py # Configuration loader │ ├── database.py # MongoDB abstraction │ └── openai_utils.py # OpenAI utilities ├── config/ │ ├── config.yml # Bot settings │ ├── config.env # Environment variables │ ├── chat_modes.yml # Bot personalities │ └── models.yml # OpenAI models ├── Dockerfile # Docker build └── docker-compose.yml # Docker orchestration
Cause: Editable install failed or Docker build context incorrect
Check:
docker exec chatgpt_telegram_bot ls /opt/indra_agent # Should see: agents/, core/, services/
Fix: Rebuild with correct build context:
cd aeon_cascade_frontend/ docker-compose down docker-compose --env-file config/config.env up --build
Cause: Missing or invalid AWS credentials
Fix: Add correct credentials to
aeon_cascade_frontend/config/config.env:
AWS_ACCESS_KEY_ID=your-real-key AWS_SECRET_ACCESS_KEY=your-real-secret AWS_REGION=us-east-1
Verify AWS Bedrock access and Claude Sonnet 4.5 availability in your region.
Cause: Query doesn't contain health keywords
Check: Message includes:
biomarker, crp, pollution, genetic, health, etc.
Fix: Add more keywords to
is_health_query() in bot.py:111
Cause: INDRA query failed or timed out
Check logs:
docker logs chatgpt_telegram_bot | grep "INDRA"
Common issues:
Check INDRA initialization:
docker logs chatgpt_telegram_bot | grep "INDRA"
Expected output:
INDRA agent modules imported successfully INDRA agent client initialized
Health query detection:
Health query detected from user 12345: How does PM2.5... Calling INDRA agent for user 12345
# Start bot cd aeon_cascade_frontend/ docker-compose --env-file config/config.env up # Send test message to bot via Telegram # "How does pollution affect inflammation?" # Check logs docker logs chatgpt_telegram_bot -f
cd .. pip install -e . # Run FastAPI server python -m indra_agent.main # Open browser open http://localhost:8000/docs # Test causal discovery endpoint curl -X POST http://localhost:8000/api/v1/causal_discovery \ -H "Content-Type: application/json" \ -d @tests/fixtures/sample_request.json
The system uses a supervisor pattern where a central orchestrator routes work to specialist agents:
User Request → FastAPI → LangGraph Workflow ├─ Supervisor (orchestration) ├─ INDRA Query Agent (bio-ontology) └─ Web Researcher (environmental data)
Workflow execution (
indra_agent/agents/graph.py):
State management (
indra_agent/agents/state.py):
OverallState TypedDict passed between all agentsExhaustive Synonym Search (
indra_agent/services/grounding_service.py + indranet_service.py):
CRITICAL ARCHITECTURAL SHIFT (2025-11-01): This is NOT a "grounding" problem - it's a path discovery problem.
The Problem:
The Solution: Exhaustive synonym search
# OLD (WRONG): Query with single name processor = idr.get_statements(subject="PM2.5", object="CRP") # 0 results # NEW (CORRECT): Query with ALL synonyms source_synonyms = await grounding.get_all_synonyms("PM2.5") # → ["PM2.5", "Particulate Matter", "particulates", "MESH:D052638", ...] target_synonyms = await grounding.get_all_synonyms("CRP") # → ["CRP", "C-Reactive Protein", "HGNC:2367", "UP:P02741", ...] # Query all combinations in parallel (7 × 6 = 42 queries) for src in source_synonyms: for tgt in target_synonyms: statements.extend(await query_indra(src, tgt)) # Molecular intermediates EMERGE: # PM2.5 → oxidative_stress → NF-κB → IL-6 → CRP # (These intermediates were NOT queried explicitly - they emerged from merged results!)
Why This Works:
Performance:
Path Discovery (
indra_agent/services/indranet_service.py):
Documentation: See
EXHAUSTIVE_SYNONYM_SEARCH.md for complete architectural details.
Graph Construction (
indra_agent/services/graph_builder.py):
min(belief * 0.8 + evidence_boost, 0.95) where boost depends on paper countCritical constraints (see
agentic-system-spec.md):
effect_size MUST be ∈ [0, 1] (used for Monte Carlo weights)temporal_lag_hours MUST be ≥ 0 (causality violation otherwise)Genetic modifiers:
config/cached_responses.py::get_genetic_modifier()Required:
AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION: Bedrock accessus.anthropic.claude-sonnet-4-5-20250129-v1:0Optional:
IQAIR_API_KEY: Real-time pollution dataINDRA_BASE_URL: Default https://db.indra.bioAPP_PORT: Default 8000All agents use temperature=0.0 for deterministic output.
Based on biological mechanism type (see
TEMPORAL_LAG_MAP in graph_builder.py):
UPDATED (per architecture review):
# Use raw INDRA belief scores (no artificial scaling) effect_size = belief # [0, 1] from INDRA evidence_weight = min(log(1 + evidence_count) / 10, 0.15) # Diminishing returns effect_with_evidence = min(effect_size + evidence_weight, 0.98)
This avoids saturation issues and preserves INDRA's calibrated belief scores.
For system reliability during development, key paths are cached:
Fallback to cache if INDRA API unavailable.
Heuristic-based (
_infer_node_type in graph_builder.py):
NEW CAPABILITY: Factor graph modeling for synergistic effects across multiple pathways.
Why Factor Graphs?
Simple DAGs treat pathways independently, missing super-additive effects. Sarah Chen's clinical case proves this:
Factor Graph Structure:
from indra_agent.services.synergy_factor_graph import SynergyFactorGraph # Create factor graph with synergy priors from literature synergy_priors = { "inflammation+metabolic": 1.34 # Meta-analysis derived } fg = SynergyFactorGraph(causal_graph, synergy_priors=synergy_priors) # Infer joint response (belief propagation) predictions = fg.infer_joint_response( intervention={"PM2.5": 10.0}, target_biomarkers=["CRP", "HbA1c"] ) # Compute synergy score synergy = fg.compute_synergy_score( baseline_effects={"inflammation": -0.16, "metabolic": -0.19}, joint_effect=-0.47 ) # Returns 1.34
Multi-Scale Ergodic Modeling:
Biological systems exhibit different variance at different scales:
from indra_agent.services.multiscale_inference import ( BiologicalScale, MultiScaleFactorGraph ) # Assign biological scales node_scales = { "PM2.5": BiologicalScale.MOLECULAR, "ROS": BiologicalScale.MOLECULAR, "NF-κB": BiologicalScale.CELLULAR, "CRP": BiologicalScale.ORGAN } # Create multi-scale factor graph msfg = MultiScaleFactorGraph(causal_graph, node_scales) # Infer with variance reduction predictions = msfg.infer_multiscale_response( intervention={"PM2.5": 10.0}, intervention_scale=BiologicalScale.MOLECULAR, target_biomarkers=["CRP"] ) # predictions = { # "CRP": { # "mean": 4.36, # "variance": 0.000001, # 10⁶× reduction vs molecular scale # "ci_lower": 4.16, # "ci_upper": 4.56 # } # }
Variance Reduction Across Scales:
Example Output:
APPROACH 1: Simple DAG (Independent Pathways) CRP: 5.2 → 4.68 mg/L (10% reduction) HbA1c: 5.9% → 5.43% (8% reduction) Synergy: NONE (additive) APPROACH 2: Factor Graph (Joint Distribution) CRP: 5.2 → 4.36 mg/L (16% reduction) ← Enters LOW-RISK range! HbA1c: 5.9% → 4.77% (19% reduction) ← Exits PREDIABETES! Synergy: 1.34 (34% super-additive!) Clinical Impact: Single intervention reverses TWO chronic conditions. This synergy is INVISIBLE to simple DAG models.
Implementation Files:
indra_agent/services/synergy_factor_graph.py: Factor graph implementationindra_agent/services/multiscale_inference.py: Multi-scale ergodic modelingindra_agent/examples/sarah_chen_factor_graph.py: Complete clinical exampleWhen to Use:
Theoretical Foundation:
Install in editable mode:
pip install -e .
.envus.anthropic.claude-sonnet-4-5-20250129-v1:0System automatically falls back to cached responses. Check logs for "using cache" warnings.
Set
APP_PORT=8001 in .env or use uvicorn indra_agent.main:app --port 8001
Service layer (
indra_agent/services/): Stateless services for INDRA API, grounding, graph building, web data. These are called by agents but contain no LLM logic.
Agent layer (
indra_agent/agents/): LangGraph agents with AWS Bedrock LLMs. Each agent has system prompt in config/agent_config.py.
Core layer (
indra_agent/core/): Pydantic models matching API specification, client wrappers, state management.
API layer (
indra_agent/api/): FastAPI routes that invoke LangGraph workflow.
Always return status="success" even if no paths found (empty graph). Only return status="error" for:
NO_CAUSAL_PATH: Query nonsensical (e.g., "coffee affects eye color")TIMEOUT: Processing took >5 secondsINVALID_REQUEST: Missing required fieldsExplanations must be 3-5 items, each <200 characters. Priority order:
Unit tests: Test individual services (grounding, graph builder, etc.) Integration tests: Test full workflow with cached INDRA responses Contract tests: Validate response against API specification (effect_size range, temporal_lag ≥ 0, etc.)
Use pytest fixtures in
tests/fixtures/ for sample requests/responses.
IMPORTANT: This section documents known constraints and limitations of the current architecture. See
ARCHITECTURE_FIX_PLAN.md for detailed fixes addressing these issues.
Update 2025-10-25: We are NOT limited to 3-hop paths via API.
New Capability (
indra_agent/services/indra_network_builder.py):
Evidence (tested on Sarah Chen pathways):
Downloaded 40 INDRA statements (CRP, IL6, TNF, INS) Built graph: 29 nodes, 35 edges Average belief: 0.862, Average evidence: 4.6 papers/edge Found convergent nodes: IL6 (23 inputs), CRP (3 inputs) Detected feedback loop: CRP ↔ TNF ↔ IL6 (inflammation cycle)
What This Enables:
What Still Requires Experimental Data:
Usage:
from indra_agent.services.indra_network_builder import build_indra_network # Download complete network graph, stats = await build_indra_network(["CRP", "IL6", "TNF", "INS", "NFKB1"]) # Find synergy candidates from topology builder = INDRANetworkBuilder() convergent = builder.find_convergent_pathways(graph, min_inputs=2) synergy_structure = builder.extract_synergy_structure(graph) # IL6 has 23 upstream effectors → potential synergy on downstream CRP
Bottom Line: Path length is NO LONGER a limitation. We can access full INDRA network topology.
Constraint: System assumes strict directed acyclic graphs (DAGs)
Impact:
Workaround:
IL-6(t) → NF-κB(t+1) → IL-6(t+2)Status: FIXED (per ARCHITECTURE_FIX_PLAN.md)
Old Formula (BROKEN):
effect = min(belief * 0.6 + 0.1 * log(1 + evidence), 0.95) # Saturated at 0.95
New Formula (FIXED):
effect = belief # Use raw INDRA belief scores evidence_weight = min(log(1 + evidence) / 10, 0.15) # Separate confidence effect_with_evidence = min(effect + evidence_weight, 0.98)
Why This Matters:
Policy: ALL intermediate nodes are retained (no Markov pruning)
Rationale:
Example:
# ✅ CORRECT: Keep all nodes PM2.5 → NF-κB → IL-6 → CRP # ❌ WRONG: Don't prune intermediate nodes PM2.5 → IL-6 → CRP # Lost NF-κB (drug target!)
Current Limits:
Bottlenecks:
Scaling Strategy (Phase 2):
Will NOT Scale To:
Status: Not yet implemented
Planned:
Why NOT Full Monte Carlo:
Alternative (ARCHITECTURE_FIX_PLAN.md):
# Scenario-based prediction (deterministic) scenarios = ['low', 'medium', 'high'] for scenario in scenarios: intervention_value = SCENARIO_MAP[scenario] propagate_effects(graph, intervention_value) compute_confidence_intervals(evidence_counts)
Status: FIXED (observability layer implemented)
Now Available:
Usage:
from indra_agent.core.observability import get_observability obs = get_observability() # Trace operations with obs.trace_operation("indra_query", source="PM2.5", target="CRP"): result = await indra_api.get_paths("PM2.5", "CRP") # Get metrics metrics = obs.get_metrics() logger.info(f"Cache hit rate: {metrics.indra_cache_hit_rate:.1%}")
Status: FIXED (Pydantic validators added)
Protected Against:
Validators:
Edge.effect_size: Must be ∈ [0, 1], warns if <0.05 or >0.98Edge.temporal_lag_hours: Must be ≥0, warns if >168h (1 week)Evidence.confidence: Must be ∈ [0, 1], warns if <0.1Impact: Zero crashes from malformed INDRA data
Current Costs (per query):
Cost Drivers:
Mitigation (Phase 2):
Projected Costs (100 users/day):
| Limitation | Impact | Severity | Fix Status |
|---|---|---|---|
| Path length ≤3 | Limits complex disease modeling | HIGH | ✅ Documented (inherent) |
| DAG-only (no cycles) | Cannot model feedback loops | MEDIUM | ⏳ Cycle detection added |
| Effect size saturation | Monte Carlo meaningless | CRITICAL | ✅ FIXED |
| Markov pruning | Destroys interpretability | CRITICAL | ✅ PREVENTED |
| Bedrock throttling | Limits concurrency | HIGH | ⏳ Rate limiting (Phase 2) |
| INDRA API latency | 2-3s per query | MEDIUM | ⏳ Prefix caching (Phase 2) |
| MongoDB blocking | Bottleneck under load | MEDIUM | ⏳ Async ops (Phase 2) |
| Zero observability | Blind operations | CRITICAL | ✅ FIXED |
| No input validation | Crash risk | HIGH | ✅ FIXED |
| Cost per query | Unsustainable at scale | MEDIUM | ⏳ Caching (Phase 2) |
Ready for Production ✅:
Phase 2 Required ⏳:
Phase 3 (Research) 🧪:
Good Use Cases ✅:
Poor Use Cases ❌:
Bottom Line: This is a production systems medicine platform for mechanistic hypothesis generation and clinical research, not a replacement for clinical judgment.
For detailed implementation fixes, see
ARCHITECTURE_FIX_PLAN.md.
For brutalist critique that motivated these fixes, see internal review notes.