This document describes the system prompt configuration for fair evaluation across different models on the Qiskit HumanEval benchmark and synthetic datasets.
The IBM Qiskit team's Qwen2.5-Coder-14B-Qiskit model uses a detailed system prompt that covers current Qiskit APIs and coding best practices.
For fair comparison across models in benchmarks, we provide configurable system prompts that match or are adapted from IBM's approach.
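For context, the CLI examples later in this document point at an OpenAI-compatible endpoint (`--model-url http://localhost:8000/v1`). As a rough sketch of where the configured system prompt ends up in such a request (the endpoint, API key, and model name below are placeholders; the evaluation harness may build the request differently internally):

```python
from openai import OpenAI

# Placeholder endpoint and credentials for an OpenAI-compatible server (e.g. vLLM).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen2.5-Coder-14B-Qiskit",  # placeholder model name
    messages=[
        # The configurable system prompt goes here (or is omitted entirely).
        {"role": "system", "content": "You are the Qiskit code assistant..."},
        {"role": "user", "content": "Write a function that builds a Bell state circuit."},
    ],
    temperature=0.0,
)
print(response.choices[0].message.content)
```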
## `qiskit_humaneval` (Default)

**Purpose:** Full Qiskit code assistant prompt based on IBM's Qwen2.5-Coder-14B-Qiskit

**Use when:**

**Content:**
## `qiskit_humaneval_minimal`

**Purpose:** Minimal Qiskit prompt for base models

**Use when:**

**Content:**
## `generic`

**Purpose:** Generic code assistant prompt with no Qiskit-specific content

**Use when:**

**Content:**
## `custom`

**Purpose:** User-provided custom prompt

**Use when:**
## None (no system prompt)

**Purpose:** Evaluate without any system message

**Use when:**
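Taken together, these five options control whether a system message is sent and what it contains. The sketch below is a rough illustration, not the package's actual internals, of how the two config fields could resolve to a prompt string; `resolve_system_prompt` is a hypothetical helper, while `get_system_prompt` is the lookup used in the Python API section further down.

```python
from evaluate.config.system_prompts import get_system_prompt

def resolve_system_prompt(system_prompt_type, custom_system_prompt=None):
    """Hypothetical helper: map the config fields to the system prompt text."""
    if system_prompt_type is None:
        return None                               # no system message at all
    if system_prompt_type == "custom":
        return custom_system_prompt               # user-provided text from the config
    # "qiskit_humaneval", "qiskit_humaneval_minimal", or "generic"
    return get_system_prompt(system_prompt_type)
```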
## Configuration (`eval_config.yaml`)

```yaml
metrics:
  # System prompt configuration
  system_prompt_type: "qiskit_humaneval"  # or "qiskit_humaneval_minimal", "generic", "custom", null
  custom_system_prompt: null              # Set this if system_prompt_type is "custom"
```
Examples:
```yaml
# Full Qiskit prompt (recommended for fair comparison with IBM results)
system_prompt_type: "qiskit_humaneval"
custom_system_prompt: null

# Minimal prompt
system_prompt_type: "qiskit_humaneval_minimal"
custom_system_prompt: null

# No system prompt
system_prompt_type: null
custom_system_prompt: null

# Custom prompt
system_prompt_type: "custom"
custom_system_prompt: |
  You are a quantum computing expert specializing in Qiskit.
  Generate clean, well-documented code following Qiskit 2.0 standards.
```
Run with a config file:

```bash
python -m evaluate.cli run --config eval_config.yaml
```
Or pass the system prompt directly on the command line:

```bash
# Full Qiskit prompt
python -m evaluate.cli qiskit-humaneval \
  --dataset dataset.json \
  --model-url http://localhost:8000/v1 \
  --system-prompt qiskit

# Minimal prompt
python -m evaluate.cli qiskit-humaneval \
  --dataset dataset.json \
  --model-url http://localhost:8000/v1 \
  --system-prompt minimal

# Custom prompt
python -m evaluate.cli qiskit-humaneval \
  --dataset dataset.json \
  --model-url http://localhost:8000/v1 \
  --system-prompt "You are a Qiskit expert..."

# No system prompt
python -m evaluate.cli qiskit-humaneval \
  --dataset dataset.json \
  --model-url http://localhost:8000/v1
```
Or use the Python API directly:

```python
from evaluate.config.system_prompts import get_system_prompt
from evaluate.runners.qiskit_humaneval import QiskitHumanEvalRunner

# Get a predefined prompt
system_prompt = get_system_prompt("qiskit_humaneval")

# Or use a custom prompt
system_prompt = "Your custom prompt here"

# Or None for no system prompt
system_prompt = None

runner = QiskitHumanEvalRunner(...)
results = runner.evaluate(
    samples=samples,
    system_prompt=system_prompt,
    ...
)
```
For reproducible and fair comparisons, use the `qiskit_humaneval` prompt.

The full `qiskit_humaneval` prompt includes guidance on:
- `generate_preset_pass_manager` instead of `transpile`
- `SamplerV2` and `EstimatorV2` instead of `execute`
- `qiskit-ibm-runtime` instead of the deprecated `qiskit-ibmq-provider`
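As an illustration of the style this guidance pushes models toward, here is a small, self-contained Qiskit example using a preset pass manager and a V2 sampler primitive. The local `StatevectorSampler` is used purely for the sketch; the prompt itself targets the `qiskit-ibm-runtime` primitives.

```python
from qiskit import QuantumCircuit
from qiskit.primitives import StatevectorSampler  # local V2-style sampler, for illustration only
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager

# Bell-state circuit
qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)
qc.measure_all()

# Preset pass manager instead of the legacy transpile()-based flow
pm = generate_preset_pass_manager(optimization_level=2)
isa_circuit = pm.run(qc)

# V2 primitive interface instead of the removed execute()
sampler = StatevectorSampler()
result = sampler.run([isa_circuit], shots=1024).result()
print(result[0].data.meas.get_counts())
```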
{ "metadata": { "evaluation": { "system_prompt": "You are the Qiskit code assistant...", "system_prompt_type": "qiskit_humaneval", ... } } }
This ensures reproducibility and transparency in benchmarking.
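To check which prompt a given run used, this metadata can be read back from the results. A minimal sketch, assuming the output was written to a JSON file (the filename `results.json` is a placeholder, not a fixed name produced by the tool):

```python
import json

# Placeholder path; point this at wherever the runner wrote its results.
with open("results.json") as f:
    results = json.load(f)

meta = results["metadata"]["evaluation"]
print(meta["system_prompt_type"])    # e.g. "qiskit_humaneval"
print(meta["system_prompt"][:80])    # beginning of the exact prompt text used
```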