Nano Banana Pro
Agent skill for nano-banana-pro
**IMPORTANT**: When using the Plan tool or entering plan mode:
- Save the plan as a .md file in the .claude/tasks/ directory
- Name it using the format YYYY-MM-DD_HH-MM_task_description.md

Example workflow:
User requests → Enter plan mode → Create plan → Save to .claude/tasks/2024-XX-XX_XX-XX_task_name.md → Execute according to saved plan
This ensures all plans are documented, traceable, and can be reviewed or resumed later.
This project simulates distributed optimization algorithms on a single GPU. It is designed to test and analyze various distributed optimization strategies, particularly the PushPull algorithm, without requiring an actual distributed computing environment.
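The Push-Pull optimizer used by this project lives under training/ (see the structure below). Purely for orientation, here is a minimal NumPy sketch of the standard Push-Pull / gradient-tracking iteration on a toy quadratic problem; the toy objective, matrices, and names (`push_pull_step`, `grad_fn`) are illustrative assumptions, not the project's implementation:

```python
import numpy as np

def push_pull_step(X, Y, G_prev, A, B, lr, grad_fn):
    """One synchronous Push-Pull iteration (standard formulation, assumed here).

    X, Y, G_prev: (n, d) arrays of node iterates, gradient trackers, and previous gradients.
    A: row-stochastic mixing matrix; B: column-stochastic mixing matrix; lr: per-node step sizes (n,).
    """
    X_new = A @ (X - lr[:, None] * Y)   # "pull": mix the locally updated iterates
    G_new = grad_fn(X_new)              # local gradients at the new iterates
    Y_new = B @ Y + G_new - G_prev      # "push": gradient-tracking update
    return X_new, Y_new, G_new

# Toy fully connected topology and per-node quadratic objectives (illustrative only)
rng = np.random.default_rng(0)
n, d = 4, 3
A = np.full((n, n), 1.0 / n)            # trivially row-stochastic
B = np.full((n, n), 1.0 / n)            # trivially column-stochastic
targets = rng.normal(size=(n, d))
grad_fn = lambda X: X - targets         # gradient of f_i(x) = 0.5 * ||x - t_i||^2 per node

X = rng.normal(size=(n, d))
G = grad_fn(X)
Y = G.copy()                            # trackers start at the local gradients
lr = np.full(n, 0.5)
for _ in range(50):
    X, Y, G = push_pull_step(X, Y, G, A, B, lr, grad_fn)
print(np.linalg.norm(X - targets.mean(axis=0)))  # all nodes approach the minimizer of the average loss
```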
Project Structure:
```
new_push_pull/
├── data/                          # Training data directory (local: empty, server: contains MNIST/CIFAR10)
├── datasets/                      # Data processing and splitting scripts for distributed optimization
│   ├── __init__.py
│   └── prepare_data.py            # Handles data partitioning for heterogeneous distribution
├── models/                        # Model architectures
│   ├── __init__.py
│   ├── cnn.py                     # CNN model implementation
│   └── fully_connected.py         # Fully connected network implementation
├── training/                      # Training loops and optimizers
│   ├── optimizer_push_pull_grad_norm_track.py
│   ├── optimizer_push_pull_grad_norm_track_different_learning_rate.py
│   ├── training_track_grad_norm.py
│   └── training_track_grad_norm_different_learning_rate.py
├── utils/                         # Utility functions
│   ├── algebra_utils.py           # Matrix operations (Perron vectors, etc.)
│   ├── network_utils.py           # Graph/network topology utilities
│   └── train_utils.py             # Training helper functions
├── scripts_pushpull_differ_lr/    # Experiment execution scripts
│   ├── Nearest_neighbor_MNIST.py  # Example training script
│   └── network_utils.py           # Network generation utilities (project-specific)
└── NEW_PROJECT_20250717/          # Output directory for CSV results and analysis
```
`train_track_grad_norm_with_hetero_different_learning_rate()`: main training function with heterogeneous data distribution and different learning rates per node.
```python
from training import train_track_grad_norm_with_hetero_different_learning_rate

df = train_track_grad_norm_with_hetero_different_learning_rate(
    algorithm="PushPull",        # Algorithm type
    lr_list=[...],               # List of learning rates for each node
    A=A,                         # Row-stochastic mixing matrix
    B=B,                         # Column-stochastic mixing matrix
    dataset_name="MNIST",        # Dataset: "MNIST" or "CIFAR10"
    batch_size=128,              # Batch size
    num_epochs=500,              # Number of epochs
    remark="experiment_name",    # Experiment identifier
    alpha=1000,                  # Heterogeneity parameter (higher = more uniform)
    root="/path/to/output",      # Output directory
    use_hetero=True,             # Enable heterogeneous data distribution
    device="cuda:0",             # GPU device
    seed=42,                     # Random seed
)
```
`generate_nearest_neighbor_matrices()`: generates communication topology matrices for distributed optimization.
```python
from network_utils import generate_nearest_neighbor_matrices

A, B = generate_nearest_neighbor_matrices(
    n=16,      # Number of nodes
    k=3,       # Number of neighbors per node
    seed=42,   # Random seed
)
```
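After generating the matrices it can be worth verifying the stochasticity properties that the training function expects (see the notes at the end of this document). A minimal NumPy check, assuming A is row-stochastic and B is column-stochastic as documented above:

```python
import numpy as np

# A should be row-stochastic (each row sums to 1) and B column-stochastic
# (each column sums to 1); both should be nonnegative.
assert np.all(A >= 0) and np.all(B >= 0)
assert np.allclose(A.sum(axis=1), 1.0), "A is not row-stochastic"
assert np.allclose(B.sum(axis=0), 1.0), "B is not column-stochastic"
```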
```python
from utils import get_left_perron, get_right_perron

pi_a = get_left_perron(A)   # Left Perron vector of matrix A
pi_b = get_right_perron(B)  # Right Perron vector of matrix B
```
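For reference, the left Perron vector of a row-stochastic A satisfies pi_a @ A = pi_a, and the right Perron vector of a column-stochastic B satisfies B @ pi_b = pi_b, each normalized to sum to 1. The helpers above live in utils/algebra_utils.py; the snippet below is only an assumed NumPy equivalent (eigen-decomposition based), to show the definition in code, not the project's implementation:

```python
import numpy as np

def left_perron(M):
    """Left Perron vector: pi @ M = pi, normalized to sum to 1 (assumed definition)."""
    vals, vecs = np.linalg.eig(M.T)   # left eigenvectors of M are right eigenvectors of M.T
    idx = np.argmax(vals.real)        # Perron eigenvalue (equal to 1 for a stochastic matrix)
    pi = np.abs(vecs[:, idx].real)
    return pi / pi.sum()

def right_perron(M):
    """Right Perron vector: M @ pi = pi, normalized to sum to 1 (assumed definition)."""
    vals, vecs = np.linalg.eig(M)
    idx = np.argmax(vals.real)
    pi = np.abs(vecs[:, idx].real)
    return pi / pi.sum()
```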
alpha parameter: controls how heterogeneous the data split across nodes is. Higher values produce a more uniform (close to IID) partition; lower values produce a more skewed, heterogeneous partition.
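Heterogeneity parameters of this kind are commonly implemented as Dirichlet(alpha) label partitioning. Whether datasets/prepare_data.py does exactly this is an assumption, so the sketch below only illustrates the general idea, not the project's splitting code:

```python
import numpy as np

def dirichlet_label_split(labels, n_nodes, alpha, seed=42):
    """Illustrative Dirichlet(alpha) partition of sample indices by label.

    Higher alpha -> proportions close to uniform (near-IID split);
    lower alpha -> each node dominated by a few labels (heterogeneous split).
    This is an assumed, commonly used scheme, not necessarily what prepare_data.py implements.
    """
    rng = np.random.default_rng(seed)
    node_indices = [[] for _ in range(n_nodes)]
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(n_nodes))
        splits = np.split(idx, (np.cumsum(proportions)[:-1] * len(idx)).astype(int))
        for node, part in enumerate(splits):
            node_indices[node].extend(part.tolist())
    return node_indices
```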
Complete example script:

```python
import sys
import os
import numpy as np

# Add project root to path
current_dir = os.path.dirname(os.path.abspath(__file__))
project_root = os.path.abspath(os.path.join(current_dir, '..'))
if project_root not in sys.path:
    sys.path.insert(0, project_root)

from training import train_track_grad_norm_with_hetero_different_learning_rate
from utils import get_left_perron, get_right_perron
from network_utils import generate_nearest_neighbor_matrices

# Configuration
n = 16               # Number of nodes
lr_basic = 7e-3      # Base learning rate
num_epochs = 100     # Training epochs
batch_size = 128     # Batch size
alpha = 1000         # Heterogeneity parameter
device = "cuda:0"    # GPU device

# Generate communication topology
A, B = generate_nearest_neighbor_matrices(n=n, k=3, seed=42)

# Compute Perron vectors
pi_b = get_right_perron(B)

# Set up learning rates (uniform strategy)
lr_total = lr_basic * n
lr_list = [lr_total / n] * n  # Uniform distribution

# Train model
df = train_track_grad_norm_with_hetero_different_learning_rate(
    algorithm="PushPull",
    lr_list=lr_list,
    A=A,
    B=B,
    dataset_name="MNIST",
    batch_size=batch_size,
    num_epochs=num_epochs,
    remark="uniform_lr_experiment",
    alpha=alpha,
    root="./output",
    use_hetero=True,
    device=device,
    seed=0,
)

# Save results
df.to_csv("experiment_results.csv")
```
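The script above computes pi_b but leaves it unused under the uniform strategy. Since the training function accepts an arbitrary per-node lr_list, non-uniform assignments are also possible; the Perron-weighted variant below is only a hedged guess at how pi_b could be used, not a strategy confirmed by the source:

```python
# Hypothetical alternative: weight per-node learning rates by the right Perron vector of B,
# keeping the same total learning rate budget as the uniform strategy.
lr_list = [lr_total * float(w) for w in pi_b]  # pi_b sums to 1, so sum(lr_list) == lr_total
```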
For questions related to 合成数据_不使用cupy_最简单的版本/basic_test.py, or the directory 合成数据_不使用cupy_最简单的版本/ more generally, please refer to the detailed analysis document:
- **Local vs Server:** This is a local simulation project. Data paths are configured for server environments but also work locally with empty data directories.
- **GPU Memory:** When simulating many nodes, GPU memory usage scales with the number of nodes. Adjust the batch size and number of nodes accordingly.
- **Output Files:** Training results are saved as CSV files containing loss, gradient norms, and other metrics for each epoch.
- **Reproducibility:** Always set seeds for reproducible experiments (see the sketch after this list).
- **Network Topology:** The choice of communication graph (the A and B matrices) significantly affects convergence. Ensure the matrices satisfy the required properties (row/column stochasticity).
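As a hedged illustration of the reproducibility note above: besides passing seed= to the training function, seeding the relevant libraries at the top of a script is a common belt-and-suspenders measure. The exact set of calls depends on which backends the project actually uses; the torch calls below are an assumption based on this being a GPU training project:

```python
import random

import numpy as np
import torch

def set_global_seed(seed: int = 42) -> None:
    """Seed Python, NumPy, and PyTorch RNGs (assumed relevant backends for this project)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op when CUDA is unavailable

set_global_seed(42)
```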