Nano Banana Pro
Agent skill for nano-banana-pro
Do not modify the "hide: - toc" YAML front matter at the top of each chapter.
You are an internationally acclaimed technical author specializing in classical machine learning. Your mission is to write comprehensive, in-depth content for a specific section from "Classical Machine Learning: A Builder's Guide to Mastering Traditional Algorithms with scikit-learn."
Your Task: Write detailed, comprehensive content for the specific section I provide. Create content that progresses from intuitive understanding to rigorous implementation, ensuring both theoretical depth and practical applicability.
Content Requirements:
Learning Objectives:
Theoretical Foundation:
Implementation Excellence:
Mandatory Content Structure:
Quality Standards:
Audience Progression: Design content that serves beginners (curious learners with basic Python/programming knowledge), practitioners (seeking deeper understanding and practical skills), and advanced users (ready for mathematical rigor and expert-level implementation techniques).
Output Requirements: Provide only the requested section content without meta-commentary, section numbering, or table of contents references. Write as if this content will be directly inserted into the book. Ensure all code examples are complete and runnable.
Chapter 1: What Is Machine Learning?
1.1 Supervised vs Unsupervised Learning
1.2 Types of models (classification, regression, clustering)
1.3 Typical ML pipeline (sketch below)
1.4 Role of scikit-learn
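To make the typical pipeline of 1.3 concrete before later chapters dissect each step, here is a minimal sketch: load a built-in dataset, split it, fit a model, score it. The choice of Iris and LogisticRegression is illustrative only.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load data, split it, fit a model, and evaluate: the core pipeline steps.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y)

model = LogisticRegression(max_iter=1000)  # max_iter raised so the solver converges on raw features
model.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```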
Chapter 2: Anatomy of scikit-learn
2.1 How fit, predict, transform, score work (sketch below)
2.2 Pipelines and cross-validation
2.3 Hyperparameters vs parameters
2.4 API consistency
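A minimal sketch of the estimator contract from 2.1 and the pipeline-plus-cross-validation workflow from 2.2, assuming the Wine dataset and a StandardScaler/LogisticRegression pairing purely for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

# Transformers expose fit/transform; predictors expose fit/predict/score.
scaler = StandardScaler().fit(X)
print("First three scaled feature means:", scaler.transform(X).mean(axis=0).round(2)[:3])

# A Pipeline chains the steps so the whole chain behaves like a single estimator,
# which keeps cross-validation leak-free.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
print("5-fold CV accuracy:", round(cross_val_score(pipe, X, y, cv=5).mean(), 3))
```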
Each chapter includes:
- Math intuition + geometry
- Code walkthrough with real dataset
- Parameter explanations
- Model tuning + diagnostics
- Optional: Source code dissection
Chapter 3: Dummy Classifiers — The Baseline
3.1 Math Intuition: No math required; predictions come from random or majority-class voting.
3.2 Code Walkthrough: Implement on Iris dataset; compare strategies.
3.3 Parameter Explanations: Strategy options (most_frequent, stratified).
3.4 Model Tuning + Diagnostics: Use baseline scores as a sanity check before evaluating real models.
3.5 Source Code Dissection of DummyClassifier.
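A possible shape for the 3.2 walkthrough: both DummyClassifier strategies evaluated with cross-validation on Iris. The exact scores will vary; the point is that real models must beat them.

```python
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Compare two baseline strategies; any real model should clear these scores.
for strategy in ("most_frequent", "stratified"):
    dummy = DummyClassifier(strategy=strategy, random_state=0)
    score = cross_val_score(dummy, X, y, cv=5).mean()
    print(f"{strategy:>13}: {score:.3f}")
```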
Chapter 4: Logistic and Linear Regression
4.1 Math Intuition + Geometry: Sigmoid function, log-odds, decision boundary.
4.2 Code Walkthrough: Binary/multi-class on Wine dataset.
4.3 Parameter Explanations: C (regularization), solvers, multi_class.
4.4 Model Tuning + Diagnostics: Grid search C; check coefficients for interpretability.
4.5 Source Code Dissection of LogisticRegression.
4.6 Math Intuition + Geometry: Least squares, hyperplanes; Ridge/Lasso penalties.
4.7 Code Walkthrough: Predict house prices on the California Housing dataset (the Boston Housing data has been removed from scikit-learn); compare OLS vs Ridge.
4.8 Parameter Explanations: Alpha for regularization, degree for polynomial.
4.9 Model Tuning + Diagnostics: Cross-validate alpha; plot residuals.
4.10 Source Code Dissection of LinearRegression.
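One way the chapter's two walkthroughs might look in code. The grid of C values is arbitrary, and fetch_california_housing stands in for the removed Boston data.

```python
from sklearn.datasets import fetch_california_housing, load_wine
from sklearn.linear_model import LinearRegression, LogisticRegression, Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 4.2/4.4: multi-class logistic regression on Wine with a small grid over C.
X, y = load_wine(return_X_y=True)
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
grid = GridSearchCV(pipe, {"logisticregression__C": [0.01, 0.1, 1, 10]}, cv=5).fit(X, y)
print("Best C:", grid.best_params_, "| CV accuracy:", round(grid.best_score_, 3))

# 4.7: ordinary least squares vs Ridge on a housing dataset.
Xh, yh = fetch_california_housing(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(Xh, yh, random_state=0)
for reg in (LinearRegression(), Ridge(alpha=1.0)):
    print(type(reg).__name__, "test R^2:", round(reg.fit(Xtr, ytr).score(Xte, yte), 3))
```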
Chapter 5: K-Nearest Neighbors (KNN)
5.1 Math Intuition + Geometry: Distance metrics (Euclidean), voting in feature space.
5.2 Code Walkthrough: Classify on Iris dataset with varying k.
5.3 Parameter Explanations: n_neighbors, weights, metric.
5.4 Model Tuning + Diagnostics: Elbow plot for k; curse of dimensionality.
5.5 Source Code Dissection of KNeighborsClassifier.
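A compact sketch of the 5.2 walkthrough: cross-validated accuracy for several values of k on Iris, the raw material for the elbow plot suggested in 5.4.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Sweep k to see the bias/variance trade-off; plotting these scores against k
# gives the elbow plot from 5.4.
for k in (1, 3, 5, 11, 21):
    score = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"k={k:>2}: CV accuracy {score:.3f}")
```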
Chapter 6: Decision Trees
6.1 Math Intuition + Geometry: Entropy and Gini impurity, recursive splitting.
6.2 Code Walkthrough: Build on HAR dataset; visualize tree.
6.3 Parameter Explanations: max_depth, min_samples_split, criterion.
6.4 Model Tuning + Diagnostics: Prune with CV; feature importance.
6.5 Optional: Source Code Dissection of DecisionTreeClassifier.
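A minimal sketch of the chapter's tree-building steps. The HAR data requires a separate download, so Iris stands in here to keep the example self-contained; max_depth=3 is an arbitrary pruning choice.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Iris stands in for the HAR data so the sketch runs out of the box.
iris = load_iris()
X, y = iris.data, iris.target

# A shallow tree is easier to read; criterion and max_depth are the knobs from 6.3.
tree = DecisionTreeClassifier(max_depth=3, criterion="gini", random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(iris.feature_names)))
print("Feature importances:", tree.feature_importances_.round(2))
```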
Chapter 7: Support Vector Machines (SVM)
7.1 Math Intuition + Geometry: Margins, kernels, Lagrange multipliers.
7.2 Code Walkthrough: RBF SVM on HAR dataset with PCA.
7.3 Parameter Explanations: C, gamma, kernel types.
7.4 Model Tuning + Diagnostics: Grid search; plot decision boundaries.
7.5 Deep Dive: Advanced kernel math.
7.6 Optional: Source Code Dissection of SVC.
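A hedged sketch of the 7.2 pipeline (scaling, PCA, then an RBF SVM) with a small grid search as in 7.4. The Digits dataset stands in for HAR, and the grid values are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Digits stands in for the HAR data so the sketch runs without a download.
X, y = load_digits(return_X_y=True)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=30)),
    ("svm", SVC(kernel="rbf")),
])
param_grid = {"svm__C": [1, 10], "svm__gamma": ["scale", 0.01]}
grid = GridSearchCV(pipe, param_grid, cv=3, n_jobs=-1).fit(X, y)
print("Best params:", grid.best_params_, "| CV accuracy:", round(grid.best_score_, 3))
```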
Chapter 8: Naive Bayes Classifiers
8.1 Math Intuition + Geometry: Bayes theorem, conditional independence.
8.2 Code Walkthrough: Text classification on a simple dataset.
8.3 Parameter Explanations: Alpha (smoothing), priors.
8.4 Model Tuning + Diagnostics: Handle zero probabilities; compare variants.
8.5 Optional: Source Code Dissection of GaussianNB.
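One possible shape for the 8.2 text-classification walkthrough, using a tiny made-up corpus purely for illustration; alpha is the smoothing parameter discussed in 8.3.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# A made-up toy corpus; alpha is Laplace smoothing, which prevents zero
# probabilities for words unseen in a class (8.4).
texts = ["great movie loved it", "terrible plot boring",
         "loved the acting", "boring and terrible"]
labels = ["pos", "neg", "pos", "neg"]

clf = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
clf.fit(texts, labels)
print(clf.predict(["loved the plot", "boring movie"]))
```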
Chapter 9: Random Forests and Bagging
9.1 Math Intuition + Geometry: Bootstrap aggregating, ensemble voting.
9.2 Code Walkthrough: Random Forest on Wine dataset.
9.3 Parameter Explanations: n_estimators, max_features, bootstrap.
9.4 Model Tuning + Diagnostics: OOB score; feature importance.
9.5 Optional: Source Code Dissection of RandomForestClassifier.
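A sketch of the 9.2/9.4 workflow on the Wine dataset: a forest with out-of-bag scoring and a quick look at feature importances. Hyperparameter values are illustrative.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

data = load_wine()
X, y = data.data, data.target

# oob_score=True reuses the out-of-bag samples from bootstrapping as a free
# validation set, as described in 9.4.
forest = RandomForestClassifier(n_estimators=300, max_features="sqrt",
                                oob_score=True, random_state=0).fit(X, y)
print("OOB accuracy:", round(forest.oob_score_, 3))
top = sorted(zip(forest.feature_importances_, data.feature_names), reverse=True)[:3]
print("Top features:", top)
```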
Chapter 10: Gradient Boosting (HistGradientBoostingClassifier)
10.1 Math Intuition + Geometry: Gradient descent on residuals, additive trees.
10.2 Code Walkthrough: Boost on HAR dataset.
10.3 Parameter Explanations: learning_rate, max_depth, early_stopping.
10.4 Model Tuning + Diagnostics: Monitor loss; avoid overfitting.
10.5 Deep Dive: XGBoost comparison.
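A minimal sketch of boosting with early stopping; the Digits dataset again stands in for HAR so the example runs without a download.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Early stopping holds out a validation fraction and halts when the loss stops
# improving, which is the overfitting guard from 10.3/10.4.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

boost = HistGradientBoostingClassifier(learning_rate=0.1, early_stopping=True,
                                       validation_fraction=0.1,
                                       random_state=0).fit(X_train, y_train)
print("Boosting rounds used:", boost.n_iter_)
print("Test accuracy:", round(boost.score(X_test, y_test), 3))
```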
Chapter 11: Model Evaluation Metrics
11.1 Accuracy, precision, recall, F1 (sketch below)
11.2 Confusion matrix, ROC, PR curves
11.3 When metrics disagree
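A short sketch showing how the chapter's metrics are computed in scikit-learn, assuming the Breast Cancer dataset and a scaled logistic regression purely as a vehicle.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Per-class precision/recall/F1, the raw confusion matrix, and ROC AUC from scores.
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print("ROC AUC:", round(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]), 3))
```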
Chapter 12: Cross-Validation & StratifiedKFold
12.1 Why we need cross-validation
12.2 KFold vs StratifiedKFold
12.3 cross_validate, GridSearchCV, RandomizedSearchCV (sketch below)
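A sketch pairing cross_validate with an explicit StratifiedKFold, then reusing the same splitter inside GridSearchCV; the SVM pipeline and the grid of C values are placeholders.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
pipe = make_pipeline(StandardScaler(), SVC())

# cross_validate reports per-fold scores; GridSearchCV nests the same CV inside a search.
results = cross_validate(pipe, X, y, cv=cv, scoring="accuracy")
print("Fold accuracies:", results["test_score"].round(3))

grid = GridSearchCV(pipe, {"svc__C": [0.1, 1, 10]}, cv=cv).fit(X, y)
print("Best C:", grid.best_params_)
```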
Chapter 13: Hyperparameter Tuning
13.1 Grid search vs random search
13.2 Search space design
13.3 Practical examples with SVM and RF (sketch below)
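One way the random-search half of 13.3 might look, assuming a Random Forest and a deliberately loose search space; the sampling distributions come from scipy.stats.

```python
from scipy.stats import randint
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_wine(return_X_y=True)

# Random search samples from distributions instead of enumerating a full grid,
# which scales better as the search space grows (13.1/13.2).
param_dist = {"n_estimators": randint(100, 500),
              "max_depth": [None, 4, 8, 16],
              "max_features": ["sqrt", "log2", None]}
search = RandomizedSearchCV(RandomForestClassifier(random_state=0), param_dist,
                            n_iter=20, cv=5, random_state=0, n_jobs=-1).fit(X, y)
print("Best params:", search.best_params_, "| CV accuracy:", round(search.best_score_, 3))
```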
Chapter 14: Probability Calibration
14.1 Why predicted probabilities can lie
14.2 Platt scaling (sigmoid), isotonic regression
14.3 CalibratedClassifierCV explained (sketch below)
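A minimal sketch of CalibratedClassifierCV wrapping an SVM with Platt scaling (method="sigmoid"); swapping in method="isotonic" gives the non-parametric variant. The Brier score is one simple way to compare calibration quality.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Wrap an uncalibrated SVM with sigmoid (Platt) calibration fitted via cross-validation.
# A lower Brier score means better-calibrated probabilities.
calibrated = CalibratedClassifierCV(SVC(), method="sigmoid", cv=5).fit(X_train, y_train)
probs = calibrated.predict_proba(X_test)[:, 1]
print("Brier score:", round(brier_score_loss(y_test, probs), 3))
```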
Chapter 15: Choosing Decision Thresholds
15.1 Predicting probabilities vs predicting classes
15.2 Optimizing for F1, cost-sensitive thresholds
15.3 Manual threshold tuning with plots (sketch below)
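A sketch of manual threshold tuning for F1 as described in 15.2/15.3. In practice the sweep should run on a validation split rather than the test set; the dataset and threshold grid here are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X_train, y_train)
probs = clf.predict_proba(X_test)[:, 1]

# Sweep candidate thresholds and keep the one with the best F1 rather than
# accepting the default 0.5 cut-off.
thresholds = np.linspace(0.1, 0.9, 17)
f1s = [f1_score(y_test, (probs >= t).astype(int)) for t in thresholds]
best = thresholds[int(np.argmax(f1s))]
print(f"Best threshold {best:.2f} with F1 {max(f1s):.3f}")
```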
Chapter 16: Feature Scaling and Transformation
16.1 StandardScaler, MinMaxScaler
16.2 When to scale and why
16.3 Scaling inside pipelines (sketch below)
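A small sketch contrasting no scaling, StandardScaler, and MinMaxScaler for a scale-sensitive model, with the scaler placed inside the pipeline as 16.3 recommends.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = load_wine(return_X_y=True)

# Distance-based models such as KNN are highly scale-sensitive, so the effect
# of scaling is easy to see here.
for scaler in (None, StandardScaler(), MinMaxScaler()):
    steps = [s for s in (scaler, KNeighborsClassifier()) if s is not None]
    score = cross_val_score(make_pipeline(*steps), X, y, cv=5).mean()
    name = type(scaler).__name__ if scaler else "no scaling"
    print(f"{name:>14}: {score:.3f}")
```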
Chapter 17: Dimensionality Reduction
17.1 PCA: math and scikit-learn usage
17.2 Using PCA with pipelines (sketch below)
17.3 Visualization
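A sketch of 17.1/17.2: PCA inside a pipeline, with the explained-variance ratio and a 2-D projection that would feed the visualization in 17.3.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)

# Fit PCA inside a pipeline so scaling and projection stay coupled; the
# explained-variance ratio shows how much structure each component keeps.
pca_pipe = make_pipeline(StandardScaler(), PCA(n_components=2)).fit(X)
X_2d = pca_pipe.transform(X)
pca = pca_pipe.named_steps["pca"]
print("Explained variance ratio:", pca.explained_variance_ratio_.round(3))
print("Projected shape:", X_2d.shape)  # points for a 2-D scatter plot of the classes
```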
Chapter 18: Dealing with Imbalanced Datasets
18.1 What is imbalance?
18.2 SMOTE and oversampling
18.3 Class weights vs resampling (sketch below)
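A sketch of the class-weight route from 18.3 on a synthetic 90/10 problem. SMOTE lives in the separate imbalanced-learn package, so it is not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# A synthetic imbalanced problem; class_weight="balanced" reweights the loss
# instead of resampling the data.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

for weight in (None, "balanced"):
    clf = LogisticRegression(class_weight=weight, max_iter=1000)
    f1 = cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
    print(f"class_weight={weight!s:>8}: F1 {f1:.3f}")
```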
Chapter 19: Pipelines and Workflows
19.1 Building maintainable ML pipelines
19.2 Pipeline, ColumnTransformer, custom steps (sketch below)
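A minimal sketch of Pipeline plus ColumnTransformer on a tiny made-up DataFrame; the column names and final model are placeholders for whatever the real workflow needs.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# A made-up frame just to show the wiring: numeric columns are scaled,
# categorical columns are one-hot encoded, and everything feeds one estimator.
df = pd.DataFrame({"age": [25, 32, 47, 51, 38, 29],
                   "income": [30_000, 52_000, 88_000, 61_000, 45_000, 39_000],
                   "city": ["a", "b", "a", "c", "b", "c"]})
y = [0, 0, 1, 1, 0, 1]

pre = ColumnTransformer([
    ("num", StandardScaler(), ["age", "income"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])
pipe = Pipeline([("prep", pre), ("clf", LogisticRegression())]).fit(df, y)
print(pipe.predict(df))
```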
Chapter 20: Under the Hood of scikit-learn
20.1 How fit is structured (sketch below)
20.2 Estimator base classes
20.3 Digging into the source (optional but powerful)
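To ground 20.1 and 20.2, here is a toy estimator following the conventions scikit-learn's own classes use: inherit from BaseEstimator and a mixin, validate inputs in fit, store learned attributes with a trailing underscore, and return self. This is a sketch of the convention, not a reproduction of any library class.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.utils.validation import check_array, check_X_y


class MajorityClassifier(BaseEstimator, ClassifierMixin):
    """Toy estimator showing what fit() relies on: validate inputs, learn
    attributes ending in '_', and return self."""

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        classes, counts = np.unique(y, return_counts=True)
        self.classes_ = classes
        self.majority_ = classes[np.argmax(counts)]
        return self

    def predict(self, X):
        X = check_array(X)
        return np.full(X.shape[0], self.majority_)


# Because it follows the conventions, the toy class works with the usual tooling.
X, y = load_iris(return_X_y=True)
print(cross_val_score(MajorityClassifier(), X, y, cv=5).round(3))
```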