@sir-wabbit
Created August 25, 2025 20:11
You want what actually works for learning hard things—methods with causal evidence, not folklore. Here’s the short list that consistently moves the needle, with RCTs/meta‑analyses behind it and notes on where each shines (and where it backfires).


What works (ranked by strength + transferability)

  1. Retrieval practice (“testing effect”), ideally with feedback
    What it is: Actively recall (without notes), get the answer, then check/correct.
    Evidence: Big, durable gains across hundreds of experiments and classroom trials; practice tests outperform restudy and other controls. Effects are larger when feedback is provided and when retrieval is spaced. The meta‑analyses include college‑level materials. Use for: all levels; concepts, proofs, derivations, formulas. SAGE Journals; education.wsu.edu
    Upgrade: Successive relearning. Practice until you retrieve correctly within a session, then re‑achieve correct retrieval in later, spaced sessions. This combines retrieval with spacing and beats either alone in college classes. Use for: core definitions, theorems, canonical derivations, algorithm steps. SAGE Journals; Ovid
  2. Spacing with calibrated intervals (not cramming)
    What it is: Distribute study of the same material over time.
    Evidence: A large synthesis plus a long‑horizon RCT: the optimal gap before the first review is ≈ 10–20% of the time until your target exam/usage, with the fraction shrinking as horizons get longer (e.g., a first review 3–6 days after initial study for an exam 30 days out; one to a few weeks for multi‑month targets). This beats both massed study and poorly timed spacing. Use for: everything you want to retain beyond next week. Learning Attention and Perception Lab; Augmenting Cognition
  3. Interleaving problem types (not blocking)
    What it is: Mix problem types (e.g., eigenvalue problems, perturbation, variational methods; or ML optimization, generalization, generative modeling) rather than doing long runs of the same kind.
    Evidence: Multilevel meta‑analysis shows a moderate overall benefit; strongest for discriminating similar categories; small‑to‑moderate for mathematics. Interleaving helps learners identify the deep feature that dictates the method. Use for: problem sets where the solver must choose the right tool. psychologie.uni-wuerzburg.de
  4. Worked examples → example‑problem pairs → fading
    What it is: Study fully worked solutions with self‑explanations, then do a similar problem yourself; gradually remove steps (“fading”).
    Evidence: Meta‑analyses show clear gains in mathematics and other structured domains, especially for novices; adding self‑explanation prompts beats examples alone. Use for: heavy‑symbolic or procedural topics (tensor derivations, backprop variants, KKT conditions). Dana Miller-Cotto, PhD; ScienceDirect; mrbartonmaths.com
  5. Self‑explanation prompts (explain each step to yourself)
    What it is: While reading examples or proofs, generate “why” and “how” explanations in your own words (not summaries).
    Evidence: Meta‑analyses (math and beyond) show small‑to‑moderate, reliable gains, especially for procedural + conceptual understanding; quality of explanations matters. Use for: proofs, derivations, algorithm internals. ERIC; Gwern
  6. Active learning (do things during study, not just watch/read)
    What it is: Frequent checks for understanding, clicker‑style questions, group problem‑solving, short writing, etc.
    Evidence: Massive STEM meta‑analysis: exam scores up and failure rates down (21.8% with active learning vs. 33.8% under traditional lecture). Caveat: it can feel worse; an RCT shows students think they learn less while actually learning more. Use for: any course or self‑study session; build in an activity every 8–12 minutes. PubMed; PNAS+1
  7. Pretesting / errorful generation (then feedback)
    What it is: Try to answer before instruction/study; you will be wrong, then get correction.
    Evidence: Review + multi‑experiment papers: pretests boost later learning even with many initial errors, provided corrective feedback follows. Use for: new chapters/papers; before lectures; before reading proofs. SpringerLink; SC Pan; PMC
  8. Analogical comparison / contrasting cases
    What it is: Compare 2–3 solved problems that share deep structure but differ on surface details; explicitly list what’s invariant.
    Evidence: Meta‑analysis: case comparisons improve learning and transfer; “time‑for‑telling” studies show problem‑solving/contrasting cases before instruction can prime deeper understanding (with the right guidance). Use for: mapping Schrödinger ↔ diffusion analogies; maximum‑entropy ↔ regularization; different variational formulations. lrdc.pitt.edu; AAALab
  9. Generative visuals (drawing / multiple representations)
    What it is: Sketch systems, diagrams, data flows, state spaces, energy landscapes; translate among equations, graphs, code.
    Evidence: Syntheses show drawing‑to‑learn yields sizable gains when guided; broad reviews highlight benefits for sense‑making. Use for: circuit diagrams ↔ Hamiltonians; model graphs ↔ loss surfaces. SpringerLink; SAGE Journals
  10. Self‑regulated learning (SRL) training (planning, monitoring, error‑logging)
    What it is: Short, explicit training in setting goals, planning spaced/retrieval cycles, monitoring calibration, and analyzing errors.
    Evidence: Meta‑analyses in higher ed (including online/blended) show moderate gains in achievement and in SRL behaviors themselves. Use for: keeping a complex study plan on the rails. ScienceDirect; Taylor & Francis Online

Nuances that matter (or the effect flips)

  • Expertise reversal effect: Heavy guidance (worked steps, dense scaffolds) helps novices, but the same scaffolds can hinder intermediates/experts because they add redundancy and extra cognitive load. Fade scaffolds as mastery rises; increase open‑ended problem solving. Taylor & Francis Online; ScienceDirect
  • Feedback isn’t one-size-fits-all: Formative, task‑focused feedback is generally beneficial; overly generic or ego‑focused feedback can be useless or harmful. Immediate and delayed feedback can both be effective; recent syntheses suggest a slight advantage for delayed feedback in some contexts. What matters most is specificity, consistency, and corrective information. Andy Matuschak; mrbartonmaths.com; SpringerLink
  • Interleaving scope: Biggest benefits when problems are confusable (similar but require different methods). If items are unrelated, interleaving adds noise; if items are trivially distinct, blocking may suffice. psychologie.uni-wuerzburg.de
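
One way to build a problem set along these lines is to group confusable problems by method and then order them so that neighbours rarely share a method. The following is a minimal sketch, not a validated procedure: the function name `interleave`, the greedy "fullest pool first" rule, and the sample labels are illustrative assumptions.

```python
import random

def interleave(problems_by_type, seed=None):
    """Order problems so consecutive items differ in type whenever possible.

    problems_by_type maps a type label (hypothetical, chosen by you) to a
    list of problems. Returns a list of (type, problem) pairs.
    """
    rng = random.Random(seed)
    # Shuffle a copy of each pool so the caller's lists are left untouched
    pools = {t: rng.sample(ps, len(ps)) for t, ps in problems_by_type.items() if ps}
    order, last = [], None
    while pools:
        # Prefer any type other than the previous one; fall back if only one is left
        candidates = [t for t in pools if t != last] or list(pools)
        # Draw from the fullest pool to avoid a long same-type run at the end
        top = max(len(pools[t]) for t in candidates)
        choice = rng.choice([t for t in candidates if len(pools[t]) == top])
        order.append((choice, pools[choice].pop()))
        if not pools[choice]:
            del pools[choice]
        last = choice
    return order
```

When working the set, look only at the problem text and keep the type label hidden until you have committed to a method, since choosing the method is the skill being trained.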

What doesn’t help much (or is a myth)

  • Rereading/highlighting as a primary strategy: lower utility compared to retrieval/spacing. PubMed
  • “Learning styles” tailoring (visual/auditory/etc.) has no causal support for improving outcomes. Use the representation that fits the content (e.g., visuals for spatial/structural info), not a self‑label. PubMed
  • Speed reading claims (triple speed, same comprehension) don’t survive scrutiny; there’s a speed–accuracy tradeoff. Skim for triage; read normally for mastery. faculty.cas.usf.edu

A concrete playbook for quantum mechanics or advanced ML

Weekly cadence (repeat):

  1. Pretest (10–15 min). Before a new unit/paper, answer 5–8 concept questions cold; commit to guesses; then reveal answers. Log misconceptions. (Pretesting + feedback.) SpringerLink
  2. Example study with self‑explanations (30–45 min).
    • QM: Work through 1–2 fully worked examples (e.g., bound state in a finite well; time‑independent perturbation) and write a one‑sentence “why” for each step.
    • ML: Do the same for a derivation (e.g., Adam’s update, ELBO derivation); explain each algebraic move and assumption. (Worked examples + self‑explanation.) Dana Miller-Cotto, PhD; ERIC
  3. Example‑problem pairs (30–60 min). Immediately solve a near‑isomorphic problem without the worked steps; check against solution; annotate your error log. (Fading support.) Dana Miller-Cotto, PhD
  4. Interleave (30–60 min). Mix 3–5 problems spanning confusable types:
    • QM: boundary‑value vs. scattering vs. perturbative;
    • ML: convex vs. non‑convex optimization, classification vs. regression variants, generative vs. discriminative. (Interleaving.) psychologie.uni-wuerzburg.de
  5. Retrieval sessions (10–20 min), 3–4×/week.
    • Build cards or a checklist that require output: state the theorem/definition, re‑derive a key step, name assumptions, sketch a diagram.
    • Successive relearning rule: practice to correct today; re‑achieve correct after a gap; retire only after two spaced, correct retrievals. (Retrieval + successive relearning.) SAGE Journals
  6. Spacing math: If your exam/goal is 30 days away, first review at ~3–6 days; if 180 days, first review at ~1–3 weeks; lengthen thereafter. (Ridgeline result: optimal gap is ~10–20% of the retention interval, shrinking with longer horizons.) Learning Attention and Perception Lab
  7. Analogical comparison (15–20 min). For each topic, compare two solved cases and explicitly list: deep structure, key invariants, and what changes.
    • QM: harmonic oscillator ↔ small‑oscillation classical system;
    • ML: L2‑regularized logistic regression ↔ MAP with Gaussian prior. (Analogical learning.) lrdc.pitt.edu
  8. Draw it (10–15 min). Sketch potentials, wavefunctions, loss surfaces, computational graphs, or causal DAGs; label. (Generative drawing.) SAGE Journals
  9. Active learning checkpoint (5–10 min). Micro‑quizzes, clicker‑style prompts, or quick proofs/derivations—no notes. (Active learning.) PubMed
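
The spacing arithmetic in step 6 can be turned into a rough calculator. This is a minimal sketch that assumes the flat 10–20% rule of thumb quoted above; the function names and default fractions are illustrative, and the sketch does not model the fraction shrinking at long horizons, so for multi‑month targets you would pass smaller fractions yourself.

```python
from datetime import date, timedelta

def first_review_window(days_until_target, low=0.10, high=0.20):
    """Return (earliest, latest) gap in days before the first review.

    low/high are the assumed fractions of the retention interval.
    """
    if days_until_target <= 0:
        raise ValueError("days_until_target must be positive")
    earliest = max(1, round(days_until_target * low))
    latest = max(earliest, round(days_until_target * high))
    return earliest, latest

def first_review_dates(study_day, target_day, low=0.10, high=0.20):
    """Convert the gap window into calendar dates."""
    earliest, latest = first_review_window((target_day - study_day).days, low, high)
    return study_day + timedelta(days=earliest), study_day + timedelta(days=latest)

# Exam 30 days after the first study session: review between day 3 and day 6
print(first_review_window(30))
```

For later reviews the text only says to lengthen the gap; how fast to lengthen it is left to you and is not encoded here.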

Novice → intermediate → advanced transitions (avoid the flip):

  • Start heavy on worked examples; fade steps weekly.
  • By mid‑course, shift to more open problems and fewer scaffolds; if performance collapses, re‑insert minimal scaffolds. (Expertise reversal effect.) Taylor & Francis Online

How to implement without ceremony

  • Build a retrieval deck with derivation prompts, not trivia. Example (ML): “Re‑derive Adam’s bias‑corrected updates; state the exact moments corrected and why.” Example (QM): “Derive ⟨x⟩ and ⟨p⟩ time‑evolution for a Gaussian wavepacket; state approximations.” (Retrieval.) SAGE Journals
  • Interleave by decision, not topic. Make sets where the first move is to choose the method. If that choice is obvious, your set is poorly designed. (Interleaving.) psychologie.uni-wuerzburg.de
  • Error log > solution manual. After each session, write why your error made sense at the time and the cue that would have prevented it; revisit these during spaced reviews. (SRL + feedback best practices.) ScienceDirect; Andy Matuschak
  • Compare solutions you’d confuse. Put two derivations side‑by‑side and mark the invariant principle; force yourself to articulate the discriminant. (Analogical comparison.) lrdc.pitt.edu
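
The retrieval deck and the successive relearning rule (practice to correct, re‑achieve correct after a gap, retire after two spaced correct retrievals) fit in a few lines of bookkeeping. This is a minimal sketch under stated assumptions: the `Card` fields, the fixed 3‑day gap, and the function names are hypothetical choices for illustration, not part of any cited study or tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Card:
    prompt: str                   # a derivation prompt, not trivia
    sessions: int = 0             # sessions in which the card reached "correct"
    spaced_correct: int = 0       # correct retrievals achieved after a gap
    due: Optional[date] = None    # None means "new" or "retired"
    retired: bool = False

def finish_session(card, today, gap_days=3, retire_after=2):
    """Call once the card has been retrieved correctly in today's session."""
    if card.sessions > 0:
        card.spaced_correct += 1  # correct again after a gap counts toward retiring
    card.sessions += 1
    if card.spaced_correct >= retire_after:
        card.retired, card.due = True, None
    else:
        card.due = today + timedelta(days=gap_days)  # assumed fixed gap
    return card

def due_cards(deck, today):
    """Cards to practise today: new cards plus anything whose gap has elapsed."""
    return [c for c in deck if not c.retired and (c.due is None or c.due <= today)]
```

Within a session, keep retrying a card (with feedback) until you get it right, and only then call `finish_session`; a card therefore needs one initial session plus two later ones before it is retired.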

Common blind spots (translated to action)

  • “It feels worse, so it’s not working.” Active methods feel harder yet produce more learning. Trust delayed performance on criterion tasks, not vibes. Track weekly no‑notes quizzes. PNAS
  • “I’ll just keep reading until it clicks.” That’s fluency illusion. Replace with retrieve→check→space cycles. PubMed
  • “More hints can’t hurt.” Past a point and with rising expertise, scaffolds can backfire. Fade them. Taylor & Francis Online
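
To track the weekly no‑notes quizzes against how the material felt, it can help to write down a predicted score before each quiz and compare it with the actual score. This is a minimal sketch; the function name `calibration_report` and the two summary numbers are illustrative assumptions, not a standard metric from the cited work.

```python
def calibration_report(quizzes):
    """quizzes: list of (predicted, actual) scores, each a fraction in [0, 1]."""
    if not quizzes:
        raise ValueError("need at least one quiz")
    gaps = [predicted - actual for predicted, actual in quizzes]
    bias = sum(gaps) / len(gaps)                      # > 0 means overconfident on average
    mean_abs = sum(abs(g) for g in gaps) / len(gaps)  # typical size of the miss
    return {"bias": bias, "mean_abs_error": mean_abs}

# Three weekly quizzes: predicted 90/80/70%, scored 60/70/70%
report = calibration_report([(0.9, 0.6), (0.8, 0.7), (0.7, 0.7)])
```

A persistently positive bias is the fluency illusion showing up in the numbers; a bias near zero suggests your sense of mastery is tracking delayed performance.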

If you want effect sizes at a glance (order‑of‑magnitude, domain‑dependent)

  • Practice testing / retrieval: medium–large; classroom meta shows robust advantages vs. restudy. Confidence: high. SAGE Journals
  • Spacing (with calibrated gaps): medium–large for long‑term retention. Confidence: high. Learning Attention and Perception Lab
  • Worked examples (+ self‑explanations): small–to–moderate (bigger for novices). Confidence: high. Dana Miller-Cotto, PhD
  • Interleaving: small–to–moderate overall; larger in confusable categories. Confidence: medium‑high. psychologie.uni-wuerzburg.de
  • Active learning in STEM: failure rates down markedly; exam gains. Confidence: high. PubMed
  • Pretesting: small–to–moderate, reliably positive with feedback. Confidence: medium‑high. SpringerLink
  • Generative drawing / multiple representations: small–to–moderate with guidance. Confidence: medium. SAGE Journals
  • SRL training: moderate in higher ed/online. Confidence: medium‑high. ScienceDirect

Minimal starter kit (tonight)

If you want me to turn this into a one‑page, fill‑in‑the‑blanks template for your current topic (QM or ML), say so and I’ll draft it. Otherwise, go do the unglamorous thing that actually works: retrieve, space, interleave, compare, and explain—then repeat.

—Jeeves

(No‑nonsense footnotes):
Active learning meta (failure rates, exam gains): Freeman et al., 2014; perception vs learning: Deslauriers et al., 2019. PubMed; PNAS
Retrieval practice meta: Adesope et al., 2017. SAGE Journals
Spacing “ridgeline” (10–20% rule of thumb): Cepeda et al., 2008. Learning Attention and Perception Lab
Worked examples meta (math; with self‑explanation): Barbieri et al., 2023; Renkl et al., 2003. Dana Miller-Cotto, PhD; mrbartonmaths.com
Interleaving meta: Brunmair & Richter, 2019. psychologie.uni-wuerzburg.de
Pretesting review: Pan, 2023; experiments: Pan & Sana, 2021. SpringerLink; SC Pan
Analogical comparison meta: Alfieri et al., 2013. lrdc.pitt.edu
Drawing/generative visuals: Ainsworth, 2021; Cromley et al., 2020. SAGE Journals; SpringerLink
SRL meta: Theobald, 2021; Xu et al., 2023. ScienceDirect; Taylor & Francis Online
Expertise reversal: Kalyuga, 2003; meta‑analysis 2025. Taylor & Francis Online; ScienceDirect
Feedback syntheses: Shute, 2008; Kluger & DeNisi, 1996; recent timing evidence. Andy Matuschak; mrbartonmaths.com; SpringerLink
Low-utility + myths: Dunlosky et al., 2013; Pashler et al., 2008; Rayner et al., 2016. PubMed+1; faculty.cas.usf.edu
