This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
load2mssql is a production-ready Python application for loading CSV files into Microsoft SQL Server. It features dynamic metadata removal, flexible configuration via YAML, modular filename sanitization, and efficient bulk loading.
Author: Quentin Casares
Setup:

```bash
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate   # On Mac/Linux
venv\Scripts\activate      # On Windows

# Install dependencies
pip install -r requirements.txt
```
Running the loader:

```bash
# Use default config.yaml
python load_csv_to_mssql.py

# Use custom configuration
python load_csv_to_mssql.py --config my_config.yaml
```
Testing:

```bash
# Test filename sanitization
python test_sanitization.py

# Run sanitizer module tests
python filename_sanitizer.py
```
1. Main Loader (`load_csv_to_mssql.py`):
- `CSVToMSSQLLoader` class, configured via dataclasses:
  - `DatabaseConfig`: Connection settings and authentication
  - `CSVProcessingConfig`: File reading parameters
  - `FileSelectionConfig`: File filtering rules
  - `TableLoadingConfig`: Table creation and loading behavior
- Processing flow: `run()` → `get_csv_files()` → `process_file()` → `load_dataframe_to_sql()`
- `read_csv_with_metadata_removal()` skips header/footer rows
- `get_table_name()` applies custom mapping or sanitization

2. Filename Sanitizer (`filename_sanitizer.py`):
- `FilenameSanitizer` class with configurable `SanitizationRules`
- Strips timestamps (e.g., `_20251114_093000`), dates, versions, and special characters
- `validate_table_name()` helper
- Convenience import: `from filename_sanitizer import sanitize_filename`

3. Configuration (`config.yaml`):
- `database`: Server, auth mode (trusted/sql), driver, performance settings
- `csv_processing`: Encoding, delimiter, chunk_size, metadata row skipping
- `file_selection`: Process 'all' or 'selected' files
- `table_loading`: `if_exists` strategy (fail/replace/append), schema, sanitization
- `filename_sanitization`: PascalCase, max_length, custom patterns/replacements
- `logging`: Level, file, format

Key design patterns:
- Loads `config.yaml` into dataclass configurations
- Skips metadata header rows (`skip_header_rows`) and footer rows (`skip_footer_rows`)
- Custom table name mapping via `table_naming: "custom"`
- Sanitization enabled with `enable_sanitization: true`
- Bulk loading through `to_sql()` with a SQLAlchemy engine
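Put together, the chunked read-and-load path can be sketched as follows. This is an illustrative sketch, not the repository's actual code: the function name `load_csv` and its parameters are hypothetical, and SQLite stands in for the SQL Server engine.

```python
import pandas as pd
import sqlalchemy

def load_csv(engine, csv_path, table_name, chunk_size=10000,
             skip_header_rows=0, if_exists="fail"):
    """Sketch of the chunked CSV -> SQL flow (hypothetical names)."""
    total = 0
    # skiprows drops metadata lines that sit above the real header row
    for chunk in pd.read_csv(csv_path, skiprows=skip_header_rows,
                             chunksize=chunk_size):
        chunk.to_sql(table_name, engine, if_exists=if_exists, index=False)
        total += len(chunk)
        if_exists = "append"  # later chunks must append, not replace/fail
    return total

# Stand-in engine; the real loader builds an mssql+pyodbc engine instead
engine = sqlalchemy.create_engine("sqlite://")
```

In production, setting `fast_executemany=True` on the pyodbc engine makes the per-chunk `to_sql()` calls use bulk inserts.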
Reusability: the standalone `filename_sanitizer.py` module can be reused elsewhere, and the factory functions `create_timestamp_sanitizer()` and `create_simple_sanitizer()` cover common patterns.

Windows Authentication (Recommended):
```yaml
database:
  server: "localhost"
  port: 1433  # Default SQL Server port (optional)
  auth_mode: "trusted"
```
SQL Server Authentication:
```yaml
database:
  server: "localhost"
  port: 1433  # Specify custom port if needed
  auth_mode: "sql"
  username: "your_username"
  password: "your_password"
```
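For reference, these settings map onto a pyodbc-style connection string. The sketch below (stdlib only) shows one plausible way to build it; the helper name and defaults are illustrative, not the loader's actual API:

```python
from urllib.parse import quote_plus

def odbc_connection_string(server, port=None, database="master",
                           auth_mode="trusted", username=None, password=None,
                           driver="ODBC Driver 17 for SQL Server"):
    """Hypothetical helper: build a SQLAlchemy URL for mssql+pyodbc."""
    # SQL Server uses "host,port" (comma, not colon) for a custom port
    host = f"{server},{port}" if port else server
    parts = [f"DRIVER={{{driver}}}", f"SERVER={host}", f"DATABASE={database}"]
    if auth_mode == "trusted":
        parts.append("Trusted_Connection=yes")  # Windows Authentication
    else:
        parts += [f"UID={username}", f"PWD={password}"]  # SQL Server auth
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(";".join(parts))
```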
Custom Port Configuration:
The connection uses `server,port` syntax (e.g., `localhost,1434`).

File Selection:
- `mode: "all"`: Process all CSV files in `input_folder`
- `mode: "selected"`: Only process files in the `selected_files` list

Table Prefixes: add consistent prefixes to all table names (applied after sanitization):
```yaml
table_loading:
  table_prefix: "tbl_"  # All tables prefixed with "tbl_"
```
Behavior: the prefix is prepended to the already-sanitized table name.
Examples:
- `table_prefix: "tbl_"` → `CustomerAccount` becomes `tbl_CustomerAccount`
- `table_prefix: "stg_"` → `Sales` becomes `stg_Sales`
- `table_prefix: ""` → No prefix applied

Load strategies:
- `if_exists: "fail"`: Error if the table exists (safe default for testing)
- `if_exists: "replace"`: Drop and recreate the table
- `if_exists: "append"`: Add rows to an existing table (for incremental loads)

Filename sanitization is enabled by default (`enable_sanitization: true`). Transforms:
- `CustomerAccount_20251114_093000.csv` → `CustomerAccount`
- `sales-data-final.csv` → `SalesDataFinal`

Customize via the `filename_sanitization` section:
- `use_pascal_case`: Convert to PascalCase
- `custom_patterns`: Additional regex patterns to strip
- `custom_replacements`: String replacements (e.g., `"cust"` → `"Customer"`)

For large files, increase throughput:

```yaml
csv_processing:
  chunk_size: 50000  # Larger chunks

database:
  fast_executemany: true  # Enable bulk insert optimization
```

For memory-constrained environments:

```yaml
csv_processing:
  chunk_size: 5000  # Smaller chunks
```
All timestamps in logs use the format `YYYY-MM-DD HH:MM:SS`. Configurable via:

```yaml
logging:
  date_format: "%Y-%m-%d %H:%M:%S"
```
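In Python's standard `logging` module, that `date_format` value is the `datefmt` argument of a `Formatter`. A minimal illustration (the message format string here is an assumption, not the loader's actual setup code):

```python
import logging

formatter = logging.Formatter(
    fmt="%(asctime)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",  # matches logging.date_format in config.yaml
)
handler = logging.StreamHandler()
handler.setFormatter(formatter)
logger = logging.getLogger("load2mssql")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("Loaded 3 rows")  # e.g. "2025-11-14 09:30:00 - INFO - Loaded 3 rows"
```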
To strip additional patterns, add regexes in `config.yaml` under `filename_sanitization.custom_patterns` (e.g., `"_production$"`), then verify with `python test_sanitization.py`.

Indexes:

```yaml
table_loading:
  table_prefix: "tbl_"  # If using prefix
  create_indexes:
    # Use base table names (without prefix) - prefix applied automatically
    CustomerAccount: ["CustomerID", "AccountNumber"]
    OrderHistory: ["OrderID"]
```

Index Naming: indexes follow the pattern `idx_{table}_{column}`, e.g. `idx_tbl_CustomerAccount_CustomerID`. Note: the system automatically looks up indexes by base name, so you don't need to include the prefix in the `create_indexes` configuration.
Data Type Overrides: use `dtype_overrides` to force specific SQL Server data types, in the form `table_name: {column: "SQL_TYPE"}`. Example:

```yaml
table_loading:
  dtype_overrides:
    Sales:
      OrderDate: "DATETIME"
      Amount: "DECIMAL(18,2)"
```
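When the load goes through pandas' `to_sql()`, overrides like these are passed via its `dtype` argument as SQLAlchemy type objects. A hypothetical translation of the YAML above (the loader's actual mapping logic may differ):

```python
from sqlalchemy.types import DateTime, Numeric

# Hypothetical in-code equivalent of the dtype_overrides YAML above
dtype_overrides = {
    "Sales": {
        "OrderDate": DateTime(),   # "DATETIME"
        "Amount": Numeric(18, 2),  # "DECIMAL(18,2)"
    },
}

# Passed through when loading the Sales table, e.g.:
# df.to_sql("Sales", engine, dtype=dtype_overrides["Sales"], index=False)
```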
The `filename_sanitizer.py` module can be used standalone in other projects:

```python
from filename_sanitizer import FilenameSanitizer, SanitizationRules

# Quick usage
sanitizer = FilenameSanitizer()
table_name = sanitizer.sanitize("file_20251114.csv")

# Custom rules
rules = SanitizationRules(
    use_pascal_case=False,
    custom_replacements={"acct": "Account"}
)
sanitizer = FilenameSanitizer(rules)
```
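For intuition, the kind of transformation the sanitizer performs can be sketched in a few lines. This is an illustrative re-implementation, not the module's actual code:

```python
import re

def sanitize(filename: str) -> str:
    """Sketch: drop extension and trailing timestamp, PascalCase the rest."""
    name = re.sub(r"\.csv$", "", filename, flags=re.IGNORECASE)
    name = re.sub(r"_\d{8}_\d{6}$", "", name)  # trailing _YYYYMMDD_HHMMSS
    words = re.split(r"[-_\s]+", name)
    # Capitalize the first letter of each word, preserving interior caps
    return "".join(w[:1].upper() + w[1:] for w in words if w)

sanitize("CustomerAccount_20251114_093000.csv")  # → "CustomerAccount"
sanitize("sales-data-final.csv")                 # → "SalesDataFinal"
```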
Best Practices:
- Use `if_exists: "fail"` to prevent accidental overwrites
- Never commit `config.yaml` with credentials to version control
- Use `.gitignore` to exclude sensitive configs
- Prefer Windows Authentication (`auth_mode: "trusted"`) when available