@DSamuelHodge
Last active February 17, 2025 21:37

WeightWatcher advanced features, organized into logical categories so that specific functionality is easy to find when you need it.

WeightWatcher Advanced Usage Cheatsheet

🔍 Basic Analysis

| Feature | Command | Description |
|---|---|---|
| Analyze Model Layers | `watcher.analyze()` | Analyze model layers for generalization, spectral properties, and overtraining. |
| Describe Model | `watcher.describe(model=model)` | Get model details without analyzing it. |
| Plot and Fit ESD | `watcher.analyze(plot=True)` | Plot the Empirical Spectral Density (ESD) of model layers and apply fits. |
| Generate Summary Statistics | `summary = watcher.get_summary()` | Generate summary statistics from analysis results to compare models. |
| Retrieve Model Metrics | `watcher.get_details()` | Retrieve layer-specific metrics useful for hyperparameter tuning. |
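
The commands above are usually chained into one short workflow. A minimal sketch, assuming the `weightwatcher` package is installed and `model` is an already-loaded model; the helper name `run_basic_analysis` is hypothetical and not part of the library:

```python
def run_basic_analysis(model):
    """Run the basic WeightWatcher workflow on an already-loaded model.

    `run_basic_analysis` is a hypothetical helper name used for illustration.
    """
    # Imported lazily so the helper can be defined without the package installed.
    import weightwatcher as ww

    watcher = ww.WeightWatcher(model=model)
    details = watcher.analyze()             # per-layer metrics (pandas DataFrame)
    summary = watcher.get_summary(details)  # model-level summary (dict)
    return details, summary
```

Keeping `details` around is useful because the later rows in this cheatsheet index into it by column name.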

🔬 Advanced Analysis Options

| Feature | Command | Description |
|---|---|---|
| Filter Layers by Type | `watcher.analyze(layers=[ww.LAYER_TYPE.CONV2D])` | Filter model analysis by layer types such as Dense or Conv2D. |
| Filter Layers by ID/Name | `watcher.analyze(layers=[20])` | Analyze only specified layer IDs or names. |
| Set Min/Max Eigenvalues | `watcher.analyze(min_evals=50, max_evals=500)` | Restrict analysis to layers whose eigenvalue count falls within a range. |
| Specify Power Law Fit | `watcher.analyze(fit='PL')` | Select the power-law fitting method: `'PL'`, `'TPL'`, or `'E_TPL'`. |
| Fit ESDs to MP Distribution | `watcher.analyze(mp_fit=True, plot=True)` | Fit layer ESDs to a Marchenko-Pastur (MP) distribution for deeper analysis. |
| Fetch ESD for Specific Layer | `esd = watcher.get_ESD(layers=[20])` | Retrieve ESD for a specific layer for custom visualization or analysis. |
| Analyze Large Models Efficiently | `watcher.analyze(min_evals=100, max_evals=5000)` | Process large-scale models efficiently by limiting eigenvalue computations. |
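
To make the power-law fit less opaque, here is a rough sketch of the standard maximum-likelihood estimate of the exponent for a continuous power law. It assumes the tail cutoff `xmin` is already known and fixed; it is an illustration of the idea behind `alpha`, not WeightWatcher's own fitting code, and the function name `powerlaw_alpha` is made up for this example:

```python
import math

def powerlaw_alpha(eigenvalues, xmin):
    """MLE of alpha for p(x) ~ x**(-alpha), x >= xmin (xmin assumed given)."""
    tail = [x for x in eigenvalues if x >= xmin]
    if len(tail) < 2:
        raise ValueError("need at least two eigenvalues at or above xmin")
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("all tail eigenvalues equal xmin; alpha is undefined")
    # Standard continuous power-law estimator: alpha = 1 + n / sum(ln(x_i / xmin))
    return 1.0 + len(tail) / log_sum
```

A real fit also has to choose `xmin`, which this sketch deliberately leaves to the caller.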

📊 Visualization and Export

| Feature | Command | Description |
|---|---|---|
| Save Model Figures | `watcher.analyze(plot=True, savefig='/plot_dir')` | Save the ESD plot of each layer to a directory for inspection. |
| Visualize Eigenvalue Distribution | `watcher.analyze(plot=True)` | Generate visualizations of eigenvalue distributions for each model layer. |
| Export Analysis to CSV | `watcher.analyze().to_csv('analysis_results.csv')` | Export analysis results as CSV for further processing. |

🔄 Model Comparison and Evaluation

| Feature | Command | Description |
|---|---|---|
| Compare Two Models | `watcher.distances(model_1, model_2)` | Compute distances between two models, useful for tracking training progress. |
| Calculate Distance Between Models | `watcher.analyze(model="fine-tuned", base="pretrained")` | Measure norm-based distances between initial and trained model weights. |
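
As a rough illustration of what a norm-based distance between two weight matrices means, the sketch below computes the Frobenius norm of their difference for matrices given as nested lists. This is an assumption about the general idea only; it is not how `watcher.distances` is implemented, and `frobenius_distance` is a hypothetical name:

```python
import math

def frobenius_distance(w_a, w_b):
    """Frobenius norm of (w_a - w_b) for two same-shaped matrices (lists of rows)."""
    same_shape = len(w_a) == len(w_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(w_a, w_b)
    )
    if not same_shape:
        raise ValueError("matrices must have the same shape")
    squared = sum(
        (a - b) ** 2
        for row_a, row_b in zip(w_a, w_b)
        for a, b in zip(row_a, row_b)
    )
    return math.sqrt(squared)
```

Applied layer by layer to an initial and a trained checkpoint, a number like this gives a crude sense of how far each layer has moved during training.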

🚨 Overfitting and Training Diagnostics

| Feature | Command | Description |
|---|---|---|
| Detect Overfitting (Correlation Traps) | `watcher.analyze(randomize=True, plot=True)` | Detect overfitting using correlation traps in randomized ESDs. |
| Apply Early Stopping Detection | `watcher.analyze()`, then check `summary['alpha']` | Identify early stopping points using the alpha metric. |
| Detect Randomness in Layers | `watcher.analyze(mp_fit=True, plot=True)` | Detect how random a layer is by comparing its spectral density to the MP distribution. |
| Detect Spikes in Spectral Density | `watcher.analyze()['num_spikes']` | Detect the number of spikes in spectral density, indicating overfitting. |
| Estimate Random Noise Influence | `watcher.analyze()['max_rand_eval']` | Estimate the influence of random noise in the layer weight matrix. |
| Detect Under-Training in Model Layers | `watcher.analyze()['rand_distance'] > threshold` | Detect under-trained layers that are too close to random matrices. |
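
The MP comparison and the spike count can be pictured with the textbook Marchenko-Pastur bulk edges. The sketch below assumes the eigenvalues come from the normalized correlation matrix (1/N) WᵀW of an N×M layer and that the noise scale `sigma` is known; the library performs its own fit, so treat this as a conceptual aid with hypothetical function names:

```python
import math

def mp_bulk_edges(n_rows, n_cols, sigma=1.0):
    """Lower and upper edges of the Marchenko-Pastur bulk for an N x M matrix."""
    n, m = max(n_rows, n_cols), min(n_rows, n_cols)
    ratio = math.sqrt(m / n)  # equals 1 / sqrt(Q) with Q = N / M >= 1
    variance = sigma ** 2
    return variance * (1 - ratio) ** 2, variance * (1 + ratio) ** 2

def count_spikes(eigenvalues, n_rows, n_cols, sigma=1.0):
    """Count eigenvalues lying strictly above the MP upper bulk edge."""
    _, upper = mp_bulk_edges(n_rows, n_cols, sigma)
    return sum(1 for ev in eigenvalues if ev > upper)
```

A layer whose eigenvalues all sit inside the bulk looks like a random matrix, while eigenvalues above the upper edge are the spikes that carry learned structure.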

🧠 Special Model Types and Modifications

| Feature | Command | Description |
|---|---|---|
| Analyze PEFT/LoRA Models | `watcher.analyze(peft=True)` | Analyze PEFT/LoRA fine-tuned models, including base and delta layers. |
| Apply SVD Sharpness for Layer Correction | `sharp_model = watcher.SVDSharpness(model=model)` | Apply SVD-based sharpening to correct overfitting in weight matrices. |

📏 Advanced Metrics

| Feature | Command | Description |
|---|---|---|
| Compute Stable Rank | `watcher.get_summary(details)['stable_rank']` | Compute the stable rank metric, a norm-adjusted measure of layer scale. |
| Measure Soft Rank of Layers | `watcher.get_summary(details)['mp_softrank']` | Measure soft rank of layers, which indicates parameter efficiency. |
| Identify Over-Parameterized Layers | `watcher.analyze()['alpha'] > threshold` | Identify extremely over-parameterized layers for pruning or compression. |
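
For intuition about the stable rank, the usual definition is the squared Frobenius norm divided by the squared spectral norm, which in terms of the eigenvalues of WᵀW is their sum divided by their maximum. A minimal sketch under that assumption (the helper name `stable_rank` mirrors the metric but is written here for illustration and may differ from the library's exact computation):

```python
def stable_rank(eigenvalues):
    """Sum of eigenvalues of W^T W divided by the largest one."""
    evals = list(eigenvalues)
    if not evals or max(evals) <= 0:
        raise ValueError("need at least one positive eigenvalue")
    return sum(evals) / max(evals)
```

The value ranges from 1, when a single eigenvalue dominates, up to the number of eigenvalues, when they are all equal.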