Nano Banana Pro
Agent skill for nano-banana-pro
- `ragent_core/`: Core Python package (config, prompts, rewards, data_sources, `bm25_client.py`). Env vars are loaded from `.env` via `ragent_core/ragent_core/config`.
- `environments/`: Verifiers evaluation/training environments (e.g., `bm25/` with `bm25.py`, its own `pyproject.toml` and `.venv`). Wrapper script: `environments/run_eval.sh`.
- `data/`: Local artifacts (e.g., BM25 indexes under `data/<dataset>/bm25s_corpus_index/`).
- `train.py`: Tyro-driven entrypoint that wires Verifiers’ GRPOTrainer.

- Install dependencies: `cd ragent_core && uv sync` and `cd environments/bm25 && uv sync`.
- Evaluate via the wrapper script: `bash environments/run_eval.sh gpt-4.1-mini bm25`.
- Evaluate directly: `cd environments && uv run vf-eval bm25 -m gpt-4.1-mini --save-dataset`.
- Train: `uv run python train.py --env-id bm25 --hf-dataset nampdn-ai/devdocs.io --model-name Qwen/Qwen2.5-0.5B --max-steps 100`.

- Naming: `snake_case`; classes `CapWords`.
- Logging: see `ragent_core/ragent_core/config/logging.py`; avoid `print`.
- Formatting: `uv run black ragent_core environments/bm25` and `uv run isort ragent_core environments/bm25`.

- Quick evaluation run: `uv run vf-eval bm25 -n 10 -r 1 -t 512 --save-dataset`.
- Inspect `results/<model>_<env>/` and, if needed, open `index.html` to visualize trajectories.

- Commit prefixes: `core:`, `env/bm25:`, `docs:`. Example: `env/bm25: integrate judge reward backoff`.
- Refer to `results/<model>_<env>/` when behavior changes.

- Use `.env` files (see `environments/bm25/.env-template`).
- Environment variables: `OPENROUTER_API_KEY`, `HF_TOKEN`, `WANDB_PROJECT`, `WANDB_API_KEY`, `JUDGE_MODEL`, `OPENROUTER_URL` (base URL). Set them in the relevant `.env`.
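
Before running an evaluation or training job, it can help to confirm that the environment variables listed above are actually set. The following is a minimal standalone sketch, not the project's own loader: the real API of `ragent_core/ragent_core/config` is not shown here, so the helper names (`parse_env_text`, `missing_vars`, `check_env_file`) and the simple `KEY=VALUE` parsing are assumptions made for illustration. It follows the repo convention of logging rather than printing.

```python
import logging
import os
from pathlib import Path

# Variable names come from the list above; the helpers below are hypothetical.
REQUIRED_VARS = (
    "OPENROUTER_API_KEY",
    "HF_TOKEN",
    "WANDB_PROJECT",
    "WANDB_API_KEY",
    "JUDGE_MODEL",
    "OPENROUTER_URL",
)

logger = logging.getLogger("env_check")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env: dict[str, str], required=REQUIRED_VARS) -> list[str]:
    """Return the required names that are absent or empty in env."""
    return [name for name in required if not env.get(name)]


def check_env_file(path: str = ".env") -> list[str]:
    """Overlay values from a .env file (if present) on os.environ and report gaps."""
    env = dict(os.environ)
    env_path = Path(path)
    if env_path.is_file():
        # Values in the file take precedence over the process environment here.
        env.update(parse_env_text(env_path.read_text(encoding="utf-8")))
    missing = missing_vars(env)
    if missing:
        logger.warning("Missing environment variables: %s", ", ".join(missing))
    return missing
```

Calling `check_env_file("environments/bm25/.env")` would return the names that still need a value, so an empty list means all six variables are present. The path in that call is illustrative; point it at whichever `.env` is relevant.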