Nano Banana Pro
Agent skill for nano-banana-pro
This document contains important rules, conventions, and structural information for AI agents (like Claude) working on this codebase. These guidelines help ensure consistency, safety, and efficiency when making changes.
## ONLY COMMIT FILES YOU ACTIVELY WORKED ON - This is critical

- When committing changes, ONLY stage and commit the specific files you created or modified
- NEVER use `git add -A` or `git add .` without carefully reviewing what will be committed
- NEVER commit files that were modified/added/deleted by other processes or users
- Always use `git status` to review changes before committing
- Use selective staging: `git add <specific-file>` for only the files you worked on

Example workflow:

```shell
# Review all changes first
git status

# Only stage the specific files you modified
git add docker-compose.yml
git add AGENTS.md

# Verify what will be committed
git status

# Then commit
git commit -m "Update docker-compose to use chart/files/grafana"
```
## NEVER PUSH TO REMOTE - The user handles all git push operations

- Use `git commit`, not `git push`
- NEVER run `git push` or `git push origin <branch>`

## ALWAYS USE --no-pager WITH GIT COMMANDS - Prevents hanging on interactive prompts

- Git commands can open interactive pagers (like `less` or `more`), which will hang waiting for user input
- Always use `git --no-pager <command>` to prevent this

Examples:

```shell
git --no-pager status
git --no-pager log
git --no-pager diff
git --no-pager show
```

Exception: Simple commands like `git add` and `git commit` don't need `--no-pager`.
## AMEND COMMITS WHEN FIXING ERRORS IN THE SAME CHAT - Keep history clean

- If you're fixing an error or making adjustments to work done in the same conversation, use `git commit --amend`
- Only create new commits if the previous work has been pushed to remote
- Check if commits have been pushed: `git --no-pager log origin/main..HEAD`

Example:

```shell
# Make fix to previous work
git add chart/files/grafana/dashboards/jupyterhub-demographics.json

# Amend the previous commit instead of creating a new one
git commit --amend --no-edit
```

Use `--amend -m "new message"` if you need to update the commit message.
## UPDATE Chart.yaml VERSION AFTER CHANGES - Keep versions in sync

- Any changes to files in the `chart/` directory require a version bump in `chart/Chart.yaml`
- Version bump guidelines (follow Semantic Versioning):
  - PATCH (1.3.0 → 1.3.1) for bug fixes and small corrections
  - MINOR (1.3.0 → 1.4.0) for new, backwards-compatible features
  - MAJOR (1.3.0 → 2.0.0) for breaking changes

Example changes:

```yaml
# chart/Chart.yaml
version: 1.3.0  # Bump this after changes
```
## ONLY BUMP VERSION IF NOT PUSHED - Check before versioning

- Check for unpushed commits: `git --no-pager log origin/main..HEAD`
- Check whether the version has already been tagged: `git --no-pager tag -l "v1.3.0"`
- If the current version bump has not been pushed yet, amend it instead of bumping again

## MAINTAIN values.schema.json - Keep schema up to date

- The `chart/values.schema.json` file must be kept in sync with `chart/values.yaml`
- For values that may be unset, use `"type": ["string", "null"]` or `"type": ["object", "null"]`
- Document schema changes in `Chart.yaml` in the `artifacthub.io/changes` annotation
- Location: `chart/values.schema.json`

## UPDATE ARTIFACTHUB.IO CHANGES - Document changes in Chart.yaml
- Helm charts can include an `artifacthub.io/changes` annotation in `Chart.yaml`
- Document ALL changes made in the version, even small ones

Format:

```yaml
annotations:
  artifacthub.io/changes: |
    - kind: added
      description: Added Users by College and Users by Application tables to demographics dashboard
    - kind: changed
      description: Updated all demographics tables to show GPU/CPU hours
    - kind: fixed
      description: Fixed incorrect column widths in dashboard tables
```

Valid kinds: `added`, `changed`, `deprecated`, `removed`, `fixed`, `security`
## MAINTAIN CHANGELOG.md - Document all releases

- Keep a `CHANGELOG.md` file in the `chart/` directory
- The version in the changelog MUST match the version in `Chart.yaml`
- Follow the Keep a Changelog format
- Include sections: Added, Changed, Deprecated, Removed, Fixed, Security
- Location: `chart/CHANGELOG.md`

Example:

```markdown
## [1.3.0] - 2025-11-03

### Added
- Users by College table in demographics dashboard
- Users by Application table in demographics dashboard

### Changed
- Updated all demographics tables to display GPU Hours, CPU Hours, and Total Hours
- Reorganized demographics dashboard into 2x2 grid layout
```
## NEVER DROP TABLES - Especially when upgrading or modifying schemas

- Use `ALTER TABLE` to modify existing tables
- Use `IF NOT EXISTS` when creating new tables or columns
- Example: `ALTER TABLE users ADD COLUMN IF NOT EXISTS department TEXT;` instead of dropping and recreating

## NEVER DELETE DATA - Unless explicitly requested by the user

- This includes objects and data defined in `init-db.sql` or `add-policies.sql`

## Always Use Transactions for Data Modifications
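The transactions rule above can be illustrated with a minimal sketch; the specific row and values are hypothetical, not taken from real data:

```sql
-- Wrap related data modifications in a transaction so a failure
-- partway through leaves the database unchanged (values are illustrative).
BEGIN;

UPDATE users
SET department = 'Siebel School Comp & Data Sci'
WHERE email = 'example@illinois.edu';  -- hypothetical row

-- Inspect the result before making it permanent; issue ROLLBACK;
-- instead of COMMIT; if anything looks wrong.
COMMIT;
```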
```
jupyterhub-metrics/
├── .env                # Database credentials and configuration (DO NOT COMMIT)
├── .env.example        # Template for environment variables
├── add-policies.sql    # TimescaleDB policies (compression, aggregates)
├── migrations/         # Database migration and fix scripts
│   ├── migrate_*.sql   # Database migration scripts (dated/versioned)
│   └── fix_*.sql       # One-off data fix scripts
├── AGENTS.md           # This file - guidelines for AI agents
└── README.md           # Project documentation
```
IMPORTANT: The main database schema (`init-db.sql`) and Grafana dashboards are maintained in `chart/files/` - see the Helm Chart section below.
```
├── export_user_details.py             # Fetch user details from MS Graph (device auth)
├── export_user_details_with_token.py  # Fetch user details from MS Graph (token auth)
├── export_user_usage_stats.py         # Export user usage statistics to CSV
└── test_*.py                          # Test scripts (don't modify production data)
```

```
collector/
├── collector.sh   # Main data collection script (runs every 5 minutes)
└── Dockerfile     # Container for running collector
```

```
chart/
├── Chart.yaml           # Helm chart metadata
├── values.yaml          # Default configuration values
├── cori-dev.yaml        # Environment-specific overrides (not tracked)
├── templates/           # Kubernetes manifests
│   ├── deployment.yaml  # TimescaleDB deployment
│   ├── service.yaml     # Database service
│   ├── cronjob.yaml     # Collector cronjob
│   └── configmap.yaml   # Configuration
└── files/               # **PRIMARY SOURCE** for all config files
    ├── init-db.sql      # Main database schema (edit this, not root)
    └── grafana/
        ├── provisioning/  # Grafana provisioning configs
        └── dashboards/    # Grafana dashboard JSON files
            └── jupyterhub-demographics.json
```
CRITICAL: The `chart/files/` directory is the single source of truth for:

- The main database schema (`init-db.sql`)
- Grafana dashboards and provisioning configs

Edit files in `chart/files/` directly. Do NOT create duplicates in the root directory.
```
history/
└── venv/   # Python virtual environment for scripts
```
### `users` - User Information
- Keyed by `email`
- Columns: `user_id`, `full_name`, `department`, `job_title`, `first_seen`, `last_seen`
- Populated by the `export_user_details*.py` scripts

### `container_observations` - Raw Time Series Data
- Unique on `(user_email, pod_name, timestamp)`
- Columns: `timestamp`, `user_email`, `node_name`, `container_image`, `container_base`, `container_version`, `age_seconds`, `pod_name`
- Populated by `collector/collector.sh`

### `user_sessions` - Materialized View
- Columns: `user_email`, `pod_name`, `node_name`, `session_id`, `session_start`, `session_end`, `runtime_hours`, `container_base`, `container_version`
- Refresh with `REFRESH MATERIALIZED VIEW CONCURRENTLY user_sessions;`

### `user_session_stats` - Regular View
- Columns: `user_email`, `total_hours`, `gpu_hours`, `cpu_hours`, `total_sessions`, `applications_used`, `first_session`, `last_session`

### Continuous aggregates
- `hourly_node_stats` - Hourly statistics per Kubernetes node
- `hourly_image_stats` - Hourly statistics per container image

## Always Use the Existing Virtual Environment
- Use the existing virtual environment in `history/venv/`
- Activate it with `source history/venv/bin/activate`

## Environment Configuration
- Load credentials from the `.env` file using the `load_env_file()` pattern (see existing scripts)
- Use the `DB_*` environment variables for the database connection: `DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_USER`, `DB_PASSWORD`

## Script Naming Conventions
- `export_*.py` - Scripts that export data to CSV
- `test_*.py` - Scripts that test functionality without modifying production data
- `migrations/migrate_*.sql` - Database migration scripts (include date if possible)
- `migrations/fix_*.sql` - One-off data correction scripts

## CSV Export Conventions
- Example output files: `user_usage_stats.csv`, and timestamped exports like `user_details_20251024_123456.csv`

## Idempotent Operations
- Always use `IF EXISTS` / `IF NOT EXISTS` where applicable
- Scripts should be safe to run multiple times
Example:

```sql
ALTER TABLE users ADD COLUMN IF NOT EXISTS department TEXT;
```
## Comments and Documentation

### Migration Scripts
- Put one-off changes in `migrations/` (don't modify `chart/files/init-db.sql` for one-off changes)
- Example: `migrations/migrate_add_user_fields.sql`

### Comment Spacing - CRITICAL
- YAML inline comments need two spaces before the `#` character
- Correct: `key: "value"  # comment` (2 spaces)
- Incorrect: `key: "value" # comment` (1 space - will fail yamllint)
- This applies throughout `chart/values.yaml`

## Special Cases to Remember

- Job title/department source strings are formatted as `"JOB_TITLE, Actual Department"`
- Example: `"GRAD TEACHING ASST, Siebel School Comp & Data Sci"` is parsed into `job_title="GRAD TEACHING ASST"`, `department="Siebel School Comp & Data Sci"`

### GPU vs CPU Detection

- Use `node_name NOT ILIKE '%cpu%'` for GPU hours

**Incremental Update (default - only new users):**
```shell
export ACCESS_TOKEN="your_token_here"
python export_user_details_with_token.py
```

**Full Refresh (all users):**

```shell
export ACCESS_TOKEN="your_token_here"
python export_user_details_with_token.py --refresh
```

**Export usage statistics:**

```shell
python export_user_usage_stats.py
# Outputs: user_usage_stats.csv
# Fields: fullname, email, department, jobtitle, gpu_hours, cpu_hours,
#         total_hours, total_sessions, last_seen, favorite_container
```
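For context, the GPU vs CPU split in these exports can be sketched in SQL. This is an illustrative query, not the exact one the scripts run; it assumes the `user_sessions` columns described earlier and the node-name convention above:

```sql
-- Illustrative only: derive per-user GPU/CPU/total hours from
-- user_sessions, treating nodes without "cpu" in the name as GPU nodes.
SELECT
    user_email,
    COALESCE(SUM(runtime_hours) FILTER (WHERE node_name NOT ILIKE '%cpu%'), 0) AS gpu_hours,
    COALESCE(SUM(runtime_hours) FILTER (WHERE node_name ILIKE '%cpu%'), 0)     AS cpu_hours,
    SUM(runtime_hours) AS total_hours
FROM user_sessions
GROUP BY user_email
ORDER BY total_hours DESC;
```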
1. Create a new migration file in the migrations folder: `migrations/migrate_description_YYYYMMDD.sql`
2. Use idempotent operations (`IF EXISTS`, `IF NOT EXISTS`)
3. Test the migration:

   ```shell
   psql -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" -f migrations/migrate_description.sql
   ```

4. Update `chart/files/init-db.sql` if the changes should apply to new deployments
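Putting the conventions together, a new migration file might look like this sketch; the file name and column are hypothetical examples, not real project artifacts:

```sql
-- migrations/migrate_add_example_field_20251103.sql (hypothetical name)
-- Idempotent, transactional migration following the conventions above:
-- safe to re-run, never drops tables or data.
BEGIN;

ALTER TABLE users ADD COLUMN IF NOT EXISTS example_field TEXT;

COMMIT;
```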
All configuration files are maintained in `chart/files/`:

- Edit `chart/files/init-db.sql` directly
- Edit `chart/files/grafana/dashboards/*.json` directly
- Edit `chart/files/grafana/provisioning/*` directly

Both `docker-compose.yml` and Kubernetes deployments reference these files. Do NOT create duplicate copies in the root directory.
**Every 5 minutes:** `collector/collector.sh` runs and:
- Writes observations to the `container_observations` table
- Refreshes the `user_sessions` materialized view
- Updates the `users` table with the latest `last_seen` timestamp

**Hourly:** TimescaleDB continuous aggregate policies update:
- `hourly_node_stats` - node usage by hour
- `hourly_image_stats` - container image usage by hour

**On-demand:** User detail synchronization:
- Run `export_user_details_with_token.py` to fetch the latest user info from Microsoft Graph
- Updates `full_name`, `department`, `job_title` in the `users` table

## Troubleshooting

- Activate the virtual environment: `source history/venv/bin/activate`
- Check that the `.env` file exists and has correct credentials
- Test the database connection: `psql -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" -c "SELECT version();"`
- Verify the TimescaleDB extension: `SELECT * FROM pg_extension WHERE extname = 'timescaledb';`
- Inspect background jobs: `SELECT * FROM timescaledb_information.jobs;`
- Check materialized views: `SELECT schemaname, matviewname, last_refresh FROM pg_matviews;`
- Confirm the `user_sessions` materialized view is being refreshed
- If user details are stale, run `export_user_details_with_token.py --refresh`

Before making significant changes:

- Review the current schema (`init-db.sql`)
- Use `IF EXISTS` / `IF NOT EXISTS` for idempotent operations
- Update `init-db.sql` and `chart/files/init-db.sql` if needed

Dear Future AI Agent:
If you discover new important patterns, conventions, or safety rules while working on this codebase, please update this document in the relevant section. This helps maintain institutional knowledge across conversations.
When adding new rules:
Remember: This codebase tracks valuable long-term research data. Preservation and accuracy are more important than convenience.
Last Updated: 2025-11-03
Version: 1.1
Maintained By: AI Agents working with the JupyterHub Metrics project team