rndmcnlly/npc_mcmc_demo.html

## npc_mcmc_demo.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Bayesian NPC Generator</title>
    <style>
        * {
            box-sizing: border-box;
        }

        body {
            font-family: 'Segoe UI', system-ui, sans-serif;
            max-width: 900px;
            margin: 0 auto;
            padding: 20px;
            background: #1a1a2e;
            color: #eee;
            line-height: 1.6;
        }

        h1 {
            color: #7b68ee;
            border-bottom: 2px solid #7b68ee;
            padding-bottom: 10px;
        }

        h2 {
            color: #9370db;
            margin-top: 30px;
        }

        .controls {
            background: #16213e;
            padding: 20px;
            border-radius: 10px;
            margin-bottom: 20px;
        }

        .control-group {
            display: flex;
            flex-wrap: wrap;
            gap: 15px;
            margin-bottom: 15px;
        }

        .control-item {
            flex: 1;
            min-width: 150px;
        }

        label {
            display: block;
            margin-bottom: 5px;
            color: #aaa;
            font-size: 0.9em;
        }

        select, input {
            width: 100%;
            padding: 8px 12px;
            border: 1px solid #444;
            border-radius: 5px;
            background: #0f0f23;
            color: #eee;
            font-size: 1em;
        }

        select:focus, input:focus {
            outline: none;
            border-color: #7b68ee;
        }

        button {
            background: #7b68ee;
            color: white;
            border: none;
            padding: 12px 30px;
            border-radius: 5px;
            cursor: pointer;
            font-size: 1em;
            margin-right: 10px;
            margin-top: 10px;
            transition: background 0.2s;
        }

        button:hover {
            background: #6a5acd;
        }

        button.secondary {
            background: #444;
        }

        button.secondary:hover {
            background: #555;
        }

        .npc-card {
            background: linear-gradient(135deg, #1e3a5f 0%, #16213e 100%);
            border: 1px solid #2a4a7f;
            border-radius: 10px;
            padding: 25px;
            margin-bottom: 20px;
        }

        .npc-name {
            font-size: 1.8em;
            font-weight: bold;
            color: #7b68ee;
            margin-bottom: 5px;
        }

        .npc-demo {
            color: #aaa;
            margin-bottom: 15px;
            font-style: italic;
        }

        .npc-stats {
            display: grid;
            grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
            gap: 10px;
        }

        .stat {
            background: rgba(0,0,0,0.2);
            padding: 10px 15px;
            border-radius: 5px;
        }

        .stat-label {
            color: #888;
            font-size: 0.85em;
        }

        .stat-value {
            color: #fff;
            font-weight: 500;
        }

        .stats-panel {
            background: #16213e;
            padding: 20px;
            border-radius: 10px;
            margin-top: 20px;
        }

        .stats-grid {
            display: grid;
            grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
            gap: 15px;
        }

        .stat-box {
            background: rgba(0,0,0,0.2);
            padding: 15px;
            border-radius: 5px;
        }

        .stat-box h4 {
            margin: 0 0 10px 0;
            color: #7b68ee;
        }

        .bar-chart {
            margin-top: 10px;
        }

        .bar-row {
            display: flex;
            align-items: center;
            margin-bottom: 5px;
            font-size: 0.85em;
        }

        .bar-label {
            width: 100px;
            flex-shrink: 0;
        }

        .bar-container {
            flex: 1;
            height: 18px;
            background: rgba(0,0,0,0.3);
            border-radius: 3px;
            overflow: hidden;
        }

        .bar-fill {
            height: 100%;
            background: linear-gradient(90deg, #7b68ee, #9370db);
            transition: width 0.3s;
        }

        .bar-value {
            width: 50px;
            text-align: right;
            margin-left: 10px;
            color: #aaa;
        }

        .method-comparison {
            display: grid;
            grid-template-columns: 1fr 1fr;
            gap: 20px;
            margin-top: 20px;
        }

        @media (max-width: 600px) {
            .method-comparison {
                grid-template-columns: 1fr;
            }
        }

        .method-box {
            background: #16213e;
            padding: 20px;
            border-radius: 10px;
        }

        .method-box h3 {
            margin-top: 0;
            color: #7b68ee;
        }

        .attempt-counter {
            font-size: 2em;
            color: #7b68ee;
            text-align: center;
            padding: 20px;
        }

        .info-box {
            background: rgba(123, 104, 238, 0.1);
            border-left: 3px solid #7b68ee;
            padding: 15px;
            margin: 20px 0;
            border-radius: 0 5px 5px 0;
        }

        code {
            background: rgba(0,0,0,0.3);
            padding: 2px 6px;
            border-radius: 3px;
            font-family: 'Consolas', monospace;
        }
    </style>
</head>
<body>
    <h1>🎲 Bayesian NPC Generator</h1>

    <div class="info-box">
        <strong>About:</strong> This demonstrates the difference between <em>rejection sampling</em>
        (as used in the paper) and <em>MCMC-style sampling</em> (as PyMC would use) for generating
        demographically-plausible characters. Set constraints below and see how each method handles them.
    </div>

    <div class="controls">
        <h3>🎯 Constraints (leave as "Any" for no constraint)</h3>
        <div class="control-group">
            <div class="control-item">
                <label>Demographic Group</label>
                <select id="group">
                    <option value="">Any</option>
                    <option value="ca_resident">CA Resident</option>
                    <option value="sfpd">SF Police</option>
                    <option value="stem">STEM Worker</option>
                    <option value="unhoused">Unhoused</option>
                </select>
            </div>
            <div class="control-item">
                <label>Gender</label>
                <select id="gender">
                    <option value="">Any</option>
                    <option value="male">Male</option>
                    <option value="female">Female</option>
                </select>
            </div>
            <div class="control-item">
                <label>Ethnicity</label>
                <select id="ethnicity">
                    <option value="">Any</option>
                    <option value="white">White</option>
                    <option value="black">Black</option>
                    <option value="hispanic">Hispanic/Latino</option>
                    <option value="asian">Asian</option>
                </select>
            </div>
        </div>
        <div class="control-group">
            <div class="control-item">
                <label>Eye Color</label>
                <select id="eyes">
                    <option value="">Any</option>
                    <option value="brown">Brown</option>
                    <option value="blue">Blue</option>
                    <option value="green">Green</option>
                    <option value="hazel">Hazel</option>
                    <option value="grey">Grey</option>
                </select>
            </div>
            <div class="control-item">
                <label>Hair Color</label>
                <select id="hair">
                    <option value="">Any</option>
                    <option value="black">Black</option>
                    <option value="brown">Brown</option>
                    <option value="blonde">Blonde</option>
                    <option value="red">Red</option>
                </select>
            </div>
            <div class="control-item">
                <label>Min Height (cm)</label>
                <input type="number" id="minHeight" placeholder="e.g., 180">
            </div>
            <div class="control-item">
                <label>Max BMI</label>
                <input type="number" id="maxBmi" placeholder="e.g., 25">
            </div>
        </div>
        <div>
            <button onclick="generateNPC()">🎲 Generate NPC</button>
            <button onclick="compareMethod()" class="secondary">📊 Compare Methods (100 samples)</button>
            <button onclick="resetConstraints()" class="secondary">🔄 Reset</button>
        </div>
    </div>

    <div id="npcResult"></div>

    <div id="comparison"></div>

    <h2>📈 Current Probability Distributions</h2>
    <div class="stats-panel">
        <div class="stats-grid" id="distributions"></div>
    </div>

    <script>
    // ============================================
    // DEMOGRAPHIC DATA (simplified from paper)
    // ============================================

    const DEMOGRAPHICS = {
        // P(ethnicity | group) - from paper's sources
        ethnicity: {
            ca_resident: { white: 0.37, hispanic: 0.39, asian: 0.15, black: 0.06 },
            sfpd: { white: 0.45, hispanic: 0.20, asian: 0.25, black: 0.10 },
            stem: { white: 0.63, hispanic: 0.08, asian: 0.21, black: 0.07 },
            unhoused: { white: 0.35, hispanic: 0.22, asian: 0.06, black: 0.35 }
        },

        // P(gender | group)
        gender: {
            ca_resident: { male: 0.49, female: 0.51 },
            sfpd: { male: 0.82, female: 0.18 },
            stem: { male: 0.73, female: 0.27 },
            unhoused: { male: 0.70, female: 0.30 }
        },

        // P(eye_color | ethnicity) - with gating from paper
        eyeColor: {
            white: { brown: 0.45, blue: 0.27, grey: 0.10, hazel: 0.09, green: 0.09 },
            black: { brown: 0.85, hazel: 0.10, grey: 0.05 },
            hispanic: { brown: 0.80, hazel: 0.12, grey: 0.08 },
            asian: { brown: 0.90, hazel: 0.07, grey: 0.03 }
        },

        // P(hair_color | ethnicity) - with gating from paper
        hairColor: {
            white: { brown: 0.45, black: 0.15, blonde: 0.25, red: 0.15 },
            black: { black: 0.85, brown: 0.15 },
            hispanic: { black: 0.70, brown: 0.30 },
            asian: { black: 0.90, brown: 0.10 }
        },

        // Height parameters (mean, std) by gender in cm
        height: {
            male: { mean: 175, std: 7 },
            female: { mean: 162, std: 6.5 }
        },

        // BMI parameters by gender
        bmi: {
            male: { mean: 26.5, std: 5 },
            female: { mean: 26.0, std: 5.5 }
        }
    };

    // Name data (simplified)
    const NAMES = {
        firstName: {
            male: {
                white: ['James', 'John', 'Robert', 'Michael', 'William', 'David', 'Joseph', 'Thomas', 'Daniel', 'Matthew', 'Donald', 'Steven', 'Brian', 'Kevin', 'Eric'],
                black: ['Michael', 'James', 'Anthony', 'Marcus', 'Terrence', 'Darnell', 'Jamal', 'DeShawn', 'Malik', 'Andre', 'Tyrone', 'Lamar', 'Cedric', 'Darius', 'Jerome'],
                hispanic: ['Jose', 'Carlos', 'Miguel', 'Juan', 'Luis', 'Jorge', 'Ricardo', 'Eduardo', 'Francisco', 'Antonio', 'Manuel', 'Rafael', 'Diego', 'Alejandro', 'Fernando'],
                asian: ['David', 'Michael', 'Kevin', 'Jason', 'Brian', 'William', 'Daniel', 'Andrew', 'James', 'Ryan', 'Wei', 'Chen', 'Hiroshi', 'Kenji', 'Min']
            },
            female: {
                white: ['Mary', 'Patricia', 'Jennifer', 'Linda', 'Elizabeth', 'Susan', 'Jessica', 'Sarah', 'Karen', 'Nancy', 'Emily', 'Amanda', 'Megan', 'Lauren', 'Ashley'],
                black: ['Michelle', 'Tiffany', 'Crystal', 'Ebony', 'Jasmine', 'Latoya', 'Keisha', 'Tamika', 'Aaliyah', 'Destiny', 'Shaniqua', 'Tanisha', 'Alicia', 'Briana', 'Imani'],
                hispanic: ['Maria', 'Carmen', 'Rosa', 'Guadalupe', 'Ana', 'Sofia', 'Isabella', 'Valentina', 'Camila', 'Lucia', 'Elena', 'Gabriela', 'Mariana', 'Daniela', 'Paula'],
                asian: ['Jennifer', 'Michelle', 'Jessica', 'Lisa', 'Amy', 'Christine', 'Angela', 'Stephanie', 'Emily', 'Grace', 'Mei', 'Yuki', 'Sakura', 'Lin', 'Anh']
            }
        },
        lastName: {
            white: ['Smith', 'Johnson', 'Williams', 'Brown', 'Jones', 'Miller', 'Davis', 'Wilson', 'Anderson', 'Taylor', 'Thomas', 'Moore', 'Martin', 'Thompson', 'White', 'Harris', 'Clark', 'Lewis', 'Robinson', 'Walker', 'Hall', 'Young', 'King', 'Wright', 'Hill', 'Scott', 'Adams', 'Baker', 'Nelson', 'Carter', "O'Brien", "O'Connor", 'Murphy', 'Kelly', 'Sullivan', 'Schmidt', 'Mueller', 'Weber', 'Meyer', 'Wagner'],
            black: ['Williams', 'Johnson', 'Smith', 'Jones', 'Brown', 'Jackson', 'Davis', 'Thomas', 'Harris', 'Robinson', 'Taylor', 'Moore', 'Martin', 'Anderson', 'Thompson', 'White', 'Washington', 'Jefferson', 'Freeman', 'Banks'],
            hispanic: ['Garcia', 'Rodriguez', 'Martinez', 'Hernandez', 'Lopez', 'Gonzalez', 'Perez', 'Sanchez', 'Ramirez', 'Torres', 'Flores', 'Rivera', 'Gomez', 'Diaz', 'Reyes', 'Morales', 'Cruz', 'Ortiz', 'Gutierrez', 'Chavez'],
            asian: ['Wang', 'Li', 'Zhang', 'Liu', 'Chen', 'Yang', 'Huang', 'Wu', 'Kim', 'Park', 'Lee', 'Choi', 'Nguyen', 'Tran', 'Pham', 'Tanaka', 'Yamamoto', 'Suzuki', 'Patel', 'Shah']
        }
    };

    // ============================================
    // SAMPLING UTILITIES
    // ============================================

    function weightedChoice(distribution) {
        const entries = Object.entries(distribution);
        const total = entries.reduce((sum, [_, p]) => sum + p, 0);
        let r = Math.random() * total;

        for (const [value, prob] of entries) {
            r -= prob;
            if (r <= 0) return value;
        }
        return entries[entries.length - 1][0];
    }

    function randomChoice(arr) {
        return arr[Math.floor(Math.random() * arr.length)];
    }

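    function normalRandom(mean, std) {
        // Box-Muller transform. Math.random() returns values in [0, 1), so use
        // 1 - Math.random() to keep u1 in (0, 1] and avoid Math.log(0).
        const u1 = 1 - Math.random();
        const u2 = Math.random();
        const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
        return mean + z * std;
    }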

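    function truncatedNormal(mean, std, min, max, maxTries = 1000) {
        // Redraw until the value lands in [min, max], but give up after
        // maxTries so that an empty or far-tail interval (e.g. a Min Height
        // above the 210 cm cap) cannot freeze the page in an endless loop.
        for (let i = 0; i < maxTries; i++) {
            const value = normalRandom(mean, std);
            if (value >= min && value <= max) return value;
        }
        // Fallback: clamp the mean into the interval. This is not a true draw;
        // for an empty interval (min > max) it returns max.
        return Math.min(Math.max(mean, min), max);
    }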

    // ============================================
    // REJECTION SAMPLING (Paper's approach)
    // ============================================

    function generateUnconstrainedNPC(group) {
        group = group || 'ca_resident';

        // Sample from prior distributions following the Bayes net
        const ethnicity = weightedChoice(DEMOGRAPHICS.ethnicity[group]);
        const gender = weightedChoice(DEMOGRAPHICS.gender[group]);

        const eyeColor = weightedChoice(DEMOGRAPHICS.eyeColor[ethnicity]);
        const hairColor = weightedChoice(DEMOGRAPHICS.hairColor[ethnicity]);

        const heightParams = DEMOGRAPHICS.height[gender];
        const height = truncatedNormal(heightParams.mean, heightParams.std, 140, 210);

        const bmiParams = DEMOGRAPHICS.bmi[gender];
        const bmi = truncatedNormal(bmiParams.mean, bmiParams.std, 16, 45);

        const weight = 0.0001 * bmi * height * height;

        const age = Math.floor(18 + Math.random() * 62);

        const firstName = randomChoice(NAMES.firstName[gender][ethnicity]);
        const lastName = randomChoice(NAMES.lastName[ethnicity]);

        return {
            group, ethnicity, gender, eyeColor, hairColor,
            height, bmi, weight, age, firstName, lastName
        };
    }

    function rejectionSample(constraints, maxAttempts = 10000) {
        let attempts = 0;

        while (attempts < maxAttempts) {
            attempts++;
            const npc = generateUnconstrainedNPC(constraints.group || null);

            // Check all constraints
            let valid = true;

            if (constraints.gender && npc.gender !== constraints.gender) valid = false;
            if (constraints.ethnicity && npc.ethnicity !== constraints.ethnicity) valid = false;
            if (constraints.eyes && npc.eyeColor !== constraints.eyes) valid = false;
            if (constraints.hair && npc.hairColor !== constraints.hair) valid = false;
            if (constraints.minHeight && npc.height < constraints.minHeight) valid = false;
            if (constraints.maxBmi && npc.bmi > constraints.maxBmi) valid = false;

            if (valid) {
                return { npc, attempts };
            }
        }

        return { npc: null, attempts };
    }

    // ============================================
    // MCMC-STYLE SAMPLING (PyMC approach)
    // ============================================

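    // Note: despite the name, no Markov chain is run here. The network is small
    // enough that the posterior over ethnicity and gender can be computed
    // exactly with Bayes' rule; the other variables are then sampled from it.
    // Compute the posterior over ethnicity and gender given the constraints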
    function computePosterior(constraints) {
        const group = constraints.group || 'ca_resident';

        // Start with priors
        let ethnicityDist = { ...DEMOGRAPHICS.ethnicity[group] };
        let genderDist = { ...DEMOGRAPHICS.gender[group] };

        // If ethnicity is constrained, set it deterministically
        if (constraints.ethnicity) {
            ethnicityDist = { [constraints.ethnicity]: 1.0 };
        }

        // If gender is constrained, set it deterministically
        if (constraints.gender) {
            genderDist = { [constraints.gender]: 1.0 };
        }

        // If eye color is constrained, update ethnicity posterior
        // P(ethnicity | eye_color) ∝ P(eye_color | ethnicity) * P(ethnicity)
        if (constraints.eyes && !constraints.ethnicity) {
            const newEthDist = {};
            let total = 0;

            for (const [eth, prior] of Object.entries(ethnicityDist)) {
                const eyeDist = DEMOGRAPHICS.eyeColor[eth];
                const likelihood = eyeDist[constraints.eyes] || 0;
                newEthDist[eth] = likelihood * prior;
                total += newEthDist[eth];
            }

            // Normalize
            if (total > 0) {
                for (const eth of Object.keys(newEthDist)) {
                    newEthDist[eth] /= total;
                }
                ethnicityDist = newEthDist;
            }
        }

        // Similar update for hair color constraint
        if (constraints.hair && !constraints.ethnicity) {
            const newEthDist = {};
            let total = 0;

            for (const [eth, prior] of Object.entries(ethnicityDist)) {
                const hairDist = DEMOGRAPHICS.hairColor[eth];
                const likelihood = hairDist[constraints.hair] || 0;
                newEthDist[eth] = likelihood * prior;
                total += newEthDist[eth];
            }

            if (total > 0) {
                for (const eth of Object.keys(newEthDist)) {
                    newEthDist[eth] /= total;
                }
                ethnicityDist = newEthDist;
            }
        }

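        // Height and BMI constraints both affect the gender posterior, because
        // both variables are sampled conditionally on gender:
        // P(gender | evidence) ∝ P(height >= minHeight | gender)
        //                        * P(bmi <= maxBmi | gender) * P(gender)
        // The likelihoods use the untruncated normal CDF as an approximation.
        if ((constraints.minHeight || constraints.maxBmi) && !constraints.gender) {
            const newGenderDist = {};
            let total = 0;

            for (const [g, prior] of Object.entries(genderDist)) {
                let likelihood = 1;
                if (constraints.minHeight) {
                    // Taller people are more likely male
                    const h = DEMOGRAPHICS.height[g];
                    likelihood *= 1 - normalCDF((constraints.minHeight - h.mean) / h.std);
                }
                if (constraints.maxBmi) {
                    const b = DEMOGRAPHICS.bmi[g];
                    likelihood *= normalCDF((constraints.maxBmi - b.mean) / b.std);
                }
                newGenderDist[g] = likelihood * prior;
                total += newGenderDist[g];
            }

            if (total > 0) {
                for (const g of Object.keys(newGenderDist)) {
                    newGenderDist[g] /= total;
                }
                genderDist = newGenderDist;
            }
        }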

        return { ethnicityDist, genderDist, group };
    }

    function normalCDF(z) {
        // Approximation of standard normal CDF
        const a1 =  0.254829592;
        const a2 = -0.284496736;
        const a3 =  1.421413741;
        const a4 = -1.453152027;
        const a5 =  1.061405429;
        const p  =  0.3275911;

        const sign = z < 0 ? -1 : 1;
        z = Math.abs(z) / Math.sqrt(2);

        const t = 1.0 / (1.0 + p * z);
        const y = 1.0 - (((((a5 * t + a4) * t) + a3) * t + a2) * t + a1) * t * Math.exp(-z * z);

        return 0.5 * (1.0 + sign * y);
    }

    function mcmcSample(constraints) {
        const posterior = computePosterior(constraints);

        // Sample from posterior distributions
        const ethnicity = constraints.ethnicity || weightedChoice(posterior.ethnicityDist);
        const gender = constraints.gender || weightedChoice(posterior.genderDist);

        // Sample eye color given ethnicity (and constraint if any)
        let eyeColor;
        if (constraints.eyes) {
            eyeColor = constraints.eyes;
        } else {
            eyeColor = weightedChoice(DEMOGRAPHICS.eyeColor[ethnicity]);
        }

        // Sample hair color given ethnicity (and constraint if any)
        let hairColor;
        if (constraints.hair) {
            hairColor = constraints.hair;
        } else {
            hairColor = weightedChoice(DEMOGRAPHICS.hairColor[ethnicity]);
        }

        // Sample height given gender (with constraint if any)
        const heightParams = DEMOGRAPHICS.height[gender];
        let height;
        if (constraints.minHeight) {
            // Truncated normal above minHeight
            height = truncatedNormal(heightParams.mean, heightParams.std, constraints.minHeight, 210);
        } else {
            height = truncatedNormal(heightParams.mean, heightParams.std, 140, 210);
        }

        // Sample BMI given gender (with constraint if any)
        const bmiParams = DEMOGRAPHICS.bmi[gender];
        let bmi;
        if (constraints.maxBmi) {
            bmi = truncatedNormal(bmiParams.mean, bmiParams.std, 16, constraints.maxBmi);
        } else {
            bmi = truncatedNormal(bmiParams.mean, bmiParams.std, 16, 45);
        }

        const weight = 0.0001 * bmi * height * height;
        const age = Math.floor(18 + Math.random() * 62);

        const firstName = randomChoice(NAMES.firstName[gender][ethnicity]);
        const lastName = randomChoice(NAMES.lastName[ethnicity]);

        return {
            npc: {
                group: posterior.group,
                ethnicity, gender, eyeColor, hairColor,
                height, bmi, weight, age, firstName, lastName
            },
            attempts: 1,
            posterior
        };
    }

    // ============================================
    // UI FUNCTIONS
    // ============================================

    function getConstraints() {
        return {
            group: document.getElementById('group').value || null,
            gender: document.getElementById('gender').value || null,
            ethnicity: document.getElementById('ethnicity').value || null,
            eyes: document.getElementById('eyes').value || null,
            hair: document.getElementById('hair').value || null,
            minHeight: parseFloat(document.getElementById('minHeight').value) || null,
            maxBmi: parseFloat(document.getElementById('maxBmi').value) || null
        };
    }

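    function formatHeight(cm) {
        // Round to whole inches first so that 11.6" rolls over to the next
        // foot instead of rendering as 5'12".
        const totalInches = Math.round(cm / 2.54);
        const feet = Math.floor(totalInches / 12);
        const remainingInches = totalInches % 12;
        return `${feet}'${remainingInches}" (${Math.round(cm)} cm)`;
    }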

    function formatWeight(kg) {
        const lbs = kg * 2.205;
        return `${Math.round(lbs)} lbs (${Math.round(kg)} kg)`;
    }

    function renderNPC(npc, attempts, method) {
        if (!npc) {
            return `
                <div class="npc-card">
                    <div class="npc-name">❌ Failed to generate</div>
                    <div class="npc-demo">Could not find valid NPC after ${attempts} attempts</div>
                    <p>Try relaxing some constraints!</p>
                </div>
            `;
        }

        return `
            <div class="npc-card">
                <div class="npc-name">${npc.firstName} ${npc.lastName}</div>
                <div class="npc-demo">${npc.ethnicity} ${npc.gender}, Age ${npc.age}</div>
                <div class="npc-stats">
                    <div class="stat">
                        <div class="stat-label">Height</div>
                        <div class="stat-value">${formatHeight(npc.height)}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">Weight</div>
                        <div class="stat-value">${formatWeight(npc.weight)}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">BMI</div>
                        <div class="stat-value">${npc.bmi.toFixed(1)}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">Eyes</div>
                        <div class="stat-value">${npc.eyeColor}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">Hair</div>
                        <div class="stat-value">${npc.hairColor}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">Group</div>
                        <div class="stat-value">${npc.group}</div>
                    </div>
                </div>
                <p style="margin-top: 15px; color: #888;">
                    Generated via <strong>${method}</strong> in <strong>${attempts}</strong> attempt(s)
                </p>
            </div>
        `;
    }

    function renderDistribution(title, dist) {
        const entries = Object.entries(dist).sort((a, b) => b[1] - a[1]);
        const bars = entries.map(([label, prob]) => `
            <div class="bar-row">
                <div class="bar-label">${label}</div>
                <div class="bar-container">
                    <div class="bar-fill" style="width: ${prob * 100}%"></div>
                </div>
                <div class="bar-value">${(prob * 100).toFixed(1)}%</div>
            </div>
        `).join('');

        return `
            <div class="stat-box">
                <h4>${title}</h4>
                <div class="bar-chart">${bars}</div>
            </div>
        `;
    }

    function generateNPC() {
        const constraints = getConstraints();

        // Use MCMC-style sampling
        const result = mcmcSample(constraints);

        document.getElementById('npcResult').innerHTML = renderNPC(result.npc, result.attempts, 'MCMC-style Posterior Sampling');

        updateDistributions(constraints);
    }

    function updateDistributions(constraints) {
        constraints = constraints || getConstraints();
        const posterior = computePosterior(constraints);
        const group = constraints.group || 'ca_resident';

        // Get eye color distribution given current ethnicity posterior
        const eyeDist = {};
        for (const [eth, ethProb] of Object.entries(posterior.ethnicityDist)) {
            for (const [eye, eyeProb] of Object.entries(DEMOGRAPHICS.eyeColor[eth])) {
                eyeDist[eye] = (eyeDist[eye] || 0) + eyeProb * ethProb;
            }
        }

        // Get hair color distribution
        const hairDist = {};
        for (const [eth, ethProb] of Object.entries(posterior.ethnicityDist)) {
            for (const [hair, hairProb] of Object.entries(DEMOGRAPHICS.hairColor[eth])) {
                hairDist[hair] = (hairDist[hair] || 0) + hairProb * ethProb;
            }
        }

        document.getElementById('distributions').innerHTML = `
            ${renderDistribution('P(Ethnicity | constraints)', posterior.ethnicityDist)}
            ${renderDistribution('P(Gender | constraints)', posterior.genderDist)}
            ${renderDistribution('P(Eye Color | constraints)', eyeDist)}
            ${renderDistribution('P(Hair Color | constraints)', hairDist)}
        `;
    }

    function compareMethod() {
        const constraints = getConstraints();
        const numSamples = 100;

        // Rejection sampling
        let rejectionAttempts = 0;
        let rejectionSuccesses = 0;
        const rejectionEthCounts = {};
        const rejectionGenderCounts = {};

        for (let i = 0; i < numSamples; i++) {
            const result = rejectionSample(constraints, 1000);
            rejectionAttempts += result.attempts;
            if (result.npc) {
                rejectionSuccesses++;
                rejectionEthCounts[result.npc.ethnicity] = (rejectionEthCounts[result.npc.ethnicity] || 0) + 1;
                rejectionGenderCounts[result.npc.gender] = (rejectionGenderCounts[result.npc.gender] || 0) + 1;
            }
        }

        // MCMC sampling
        const mcmcEthCounts = {};
        const mcmcGenderCounts = {};

        for (let i = 0; i < numSamples; i++) {
            const result = mcmcSample(constraints);
            mcmcEthCounts[result.npc.ethnicity] = (mcmcEthCounts[result.npc.ethnicity] || 0) + 1;
            mcmcGenderCounts[result.npc.gender] = (mcmcGenderCounts[result.npc.gender] || 0) + 1;
        }

        // Normalize to distributions
        const normalize = (counts) => {
            const total = Object.values(counts).reduce((a, b) => a + b, 0);
            const dist = {};
            for (const [k, v] of Object.entries(counts)) {
                dist[k] = v / total;
            }
            return dist;
        };

        document.getElementById('comparison').innerHTML = `
            <h2>📊 Method Comparison (${numSamples} samples each)</h2>
            <div class="method-comparison">
                <div class="method-box">
                    <h3>❌ Rejection Sampling (Paper's Method)</h3>
                    <div class="attempt-counter">${rejectionAttempts} total attempts</div>
                    <p>Success rate: ${rejectionSuccesses}/${numSamples}</p>
                    <p>Avg attempts per success: ${rejectionSuccesses > 0 ? (rejectionAttempts / rejectionSuccesses).toFixed(1) : 'n/a (no successes)'}</p>
                    ${renderDistribution('Ethnicity Distribution', normalize(rejectionEthCounts))}
                    ${renderDistribution('Gender Distribution', normalize(rejectionGenderCounts))}
                </div>
                <div class="method-box">
                    <h3>✅ MCMC-style Sampling (PyMC approach)</h3>
                    <div class="attempt-counter">${numSamples} total attempts</div>
                    <p>Success rate: 100%</p>
                    <p>Avg attempts per success: 1.0</p>
                    ${renderDistribution('Ethnicity Distribution', normalize(mcmcEthCounts))}
                    ${renderDistribution('Gender Distribution', normalize(mcmcGenderCounts))}
                </div>
            </div>
            <div class="info-box" style="margin-top: 20px;">
                <strong>Key insight:</strong> Try setting <code>Eye Color = blue</code> and/or <code>Hair = blonde</code>.
                In this demo's data these occur only in the white population, so rejection sampling discards
                most candidates, while MCMC-style sampling updates the ethnicity posterior (concentrating it
                on white) and returns a sample in a single attempt. The paper notes a related problem: constraining
                continuous variables (height/BMI) is "effectively impossible" with rejection sampling.
            </div>
        `;
    }

    function resetConstraints() {
        document.getElementById('group').value = '';
        document.getElementById('gender').value = '';
        document.getElementById('ethnicity').value = '';
        document.getElementById('eyes').value = '';
        document.getElementById('hair').value = '';
        document.getElementById('minHeight').value = '';
        document.getElementById('maxBmi').value = '';

        document.getElementById('npcResult').innerHTML = '';
        document.getElementById('comparison').innerHTML = '';
        updateDistributions();
    }

    // Initialize
    updateDistributions();
    </script>
</body>
</html>

## transcript.md

      
    USER

Get this paper and then walk me through how PyMC would be used to get a principled sampling procedure.
https://ceur-ws.org/Vol-4090/paper9.pdf
ASSISTANT


Tool Executed

## Paper Summary
This is a fascinating paper by Ian Horswill about generating demographically-plausible NPCs (non-player characters) for a tabletop RPG set in present-day San Francisco. The system uses stochastic logic programming to sample character features (name, ethnicity, gender, height, weight, hair color, eye color, age) from a Bayes net that encodes real-world demographic statistics from US Census data and other sources.
The key sampling technique used is rejection sampling - generate candidates according to prior distributions, then reject any that don't match user-specified constraints.
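
As a rough illustration of that loop, here is a minimal Python sketch of rejection sampling over a two-node toy net. The probabilities and the names `P_ETHNICITY`, `P_BLUE_EYES`, `sample_prior`, and `rejection_sample` are hypothetical stand-ins, not the paper's data or code.

```python
import random

# Hypothetical toy priors for illustration only (not the paper's data)
P_ETHNICITY = {"white": 0.4, "other": 0.6}
P_BLUE_EYES = {"white": 0.25, "other": 0.0}

def sample_prior(rng):
    """Forward-sample one candidate from the toy Bayes net."""
    ethnicity = "white" if rng.random() < P_ETHNICITY["white"] else "other"
    eyes = "blue" if rng.random() < P_BLUE_EYES[ethnicity] else "brown"
    return {"ethnicity": ethnicity, "eyes": eyes}

def rejection_sample(constraints, rng, max_attempts=10_000):
    """Draw from the prior until a candidate satisfies every constraint."""
    for attempt in range(1, max_attempts + 1):
        candidate = sample_prior(rng)
        if all(candidate[key] == value for key, value in constraints.items()):
            return candidate, attempt
    return None, max_attempts

rng = random.Random(0)
npc, attempts = rejection_sample({"eyes": "blue"}, rng)
print(npc, attempts)
```

Every accepted candidate here is necessarily `white`, because blue eyes have zero probability in the other branch. The rarer the constraint, the more candidates are thrown away before one is accepted.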

## How PyMC Could Provide a More Principled Sampling Procedure

The paper's rejection sampling approach has known limitations: it is inefficient for low-probability constraints, and it cannot practically condition on exact values of continuous variables. PyMC offers several advantages for this kind of problem. Let me walk you through how you'd implement this:
### 1. Define the Bayesian Graphical Model

PyMC lets you directly encode the Bayes net from Figure 1:
import numpy as np
import pymc as pm
import pytensor.tensor as pt

with pm.Model() as npc_model:
    # Root nodes (priors)
    
    # Demographic group - categorical prior
    group = pm.Categorical('group',
        p=[0.7, 0.1, 0.15, 0.05]  # CA resident, SFPD, STEM, unhoused
    )
    
    # Age - could be uniform or use demographic data
    age = pm.Uniform('age', lower=18, upper=85)
    
    # Ethnicity conditioned on demographic group
    # P(ethnicity | group): one row per group, selected by indexing with `group`
    # (pm.math.switch only chooses between two branches, so it cannot pick among four groups)
    ethnicity_table = pt.as_tensor_variable(np.array([
        [0.37, 0.06, 0.15, 0.39, 0.01, 0.02],  # CA resident
        [0.45, 0.10, 0.20, 0.20, 0.02, 0.03],  # SFPD (example)
        # ... other groups
    ]))
    ethnicity = pm.Categorical('ethnicity', p=ethnicity_table[group])
    
    # Gender conditioned on group
    gender = pm.Bernoulli('gender', p=0.49)  # 0=female, 1=male
    
    # Height conditioned on age, gender, ethnicity (CDC data)
    # Using truncated normal with parameters from CDC percentiles
    height_mu = pm.math.switch(gender, 175.0, 162.0)  # cm, simplified
    height_sigma = 7.0
    height = pm.TruncatedNormal('height', mu=height_mu, sigma=height_sigma, 
                                 lower=140, upper=210)
    
    # BMI conditioned on age, gender, ethnicity  
    bmi = pm.TruncatedNormal('bmi', mu=26, sigma=5, lower=15, upper=50)
    
    # Weight is deterministic given height and BMI
    weight = pm.Deterministic('weight', 0.0001 * bmi * height**2)
    
    # Eye color conditioned on ethnicity
    # Gating certain colors for non-white ethnicities
    eye_color = pm.Categorical('eye_color', 
        p=get_eye_probs(ethnicity))  # function encoding the logic
    
    # Hair color similarly
    hair_color = pm.Categorical('hair_color',
        p=get_hair_probs(ethnicity))
    
    # Names conditioned on ethnicity and gender
    # This would be a large categorical over name indices
    first_name = pm.Categorical('first_name', 
        p=get_first_name_probs(ethnicity, gender))
    last_name = pm.Categorical('last_name',
        p=get_last_name_probs(ethnicity))
### 2. Use Observations for Constraints

The key advantage of PyMC: instead of rejection sampling, you can condition on observed values directly:
with npc_model:
    # User wants a blue-eyed, 190cm tall character
    # Set these as observed data
    
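    # For discrete: use pm.Potential to add a hard constraint
    # (log-probability 0 when satisfied, -inf otherwise; BLUE_INDEX is the
    # integer code assigned to blue eyes)
    pm.Potential('eye_constraint',
        pm.math.switch(pm.math.eq(eye_color, BLUE_INDEX), 0.0, -np.inf))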
    
    # For continuous: observe directly
    height_obs = pm.Normal('height_obs', mu=height, sigma=0.1, observed=190)
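
To see what conditioning does to an upstream variable, the update can be worked by hand with Bayes' rule. The sketch below uses the simplified `ca_resident` ethnicity prior and hazel-eye probabilities from the HTML demo later in this transcript (the prior there sums to 0.97, so the helper normalizes). The `posterior` function is an illustrative helper, not part of PyMC.

```python
def posterior(prior, likelihood):
    """Return P(h | evidence), proportional to P(evidence | h) * P(h)."""
    unnormalized = {h: prior[h] * likelihood.get(h, 0.0) for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under the prior")
    return {h: weight / total for h, weight in unnormalized.items()}

# Simplified demo values: P(ethnicity | ca_resident) and P(hazel eyes | ethnicity)
prior_eth = {"white": 0.37, "hispanic": 0.39, "asian": 0.15, "black": 0.06}
p_hazel_given_eth = {"white": 0.09, "hispanic": 0.12, "asian": 0.07, "black": 0.10}

post = posterior(prior_eth, p_hazel_given_eth)
for eth, prob in post.items():
    print(f"P({eth} | hazel eyes) = {prob:.3f}")
```

A sampler that conditions properly draws ethnicity from this updated distribution directly, instead of drawing from the prior and discarding mismatches.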
### 3. Sample with MCMC Instead of Rejection Sampling

This is the crucial improvement:
with npc_model:
    # Use NUTS (No U-Turn Sampler) for continuous variables
    # Use Metropolis for discrete variables
    
    trace = pm.sample(
        draws=1000,
        tune=500,
        cores=4,
        return_inferencedata=True
    )
    
    # Get a single sample for an NPC
    sample = trace.posterior.isel(chain=0, draw=np.random.randint(1000))
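
For intuition about what an MCMC sampler does under the hood, here is a minimal random-walk Metropolis sketch in plain Python. It is far simpler than PyMC's NUTS and is only meant to show the propose/accept loop. The target (a normal height distribution restricted to 185 cm and above), the step size, and the function names are assumptions chosen for illustration.

```python
import math
import random

def log_target(height, mu=175.0, sigma=7.0, lower=185.0):
    """Unnormalized log-density of Normal(mu, sigma) restricted to height >= lower."""
    if height < lower:
        return -math.inf
    return -0.5 * ((height - mu) / sigma) ** 2

def metropolis(log_p, start, steps, step_size, rng):
    """Random-walk Metropolis: propose a nearby point, accept with prob min(1, p_new / p_old)."""
    x, lp = start, log_p(start)
    samples = []
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        lp_new = log_p(proposal)
        # 1 - random() lies in (0, 1], so the log is always defined
        if math.log(1.0 - rng.random()) < lp_new - lp:
            x, lp = proposal, lp_new
        samples.append(x)
    return samples

draws = metropolis(log_target, start=186.0, steps=5000, step_size=3.0, rng=random.Random(1))
kept = draws[1000:]  # discard warmup draws
mean_height = sum(kept) / len(kept)
print(f"mean height given height >= 185: {mean_height:.1f} cm")
```

Unlike rejection sampling, the chain never leaves the constrained region once it starts there, so no draws are wasted on candidates that violate the constraint outright.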
### 4. Why This is More Principled


| Aspect | Paper's Rejection Sampling | PyMC's MCMC |
|---|---|---|
| Efficiency | O(1/p) where p is probability of constraints | Adapts to target distribution |
| Continuous variables | Essentially impossible to condition on exact values | Handles naturally via gradients |
| Complex constraints | Must restart from beginning | Explores constrained space efficiently |
| Inference | Only forward sampling | Can do both prior and posterior inference |
| Correlated features | Hard to capture covariance (paper notes hair+eye color issue) | Naturally captures via joint posterior |
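
The O(1/p) entry follows from the geometric distribution: if each candidate is accepted independently with probability p, the expected number of draws until the first acceptance is 1/p. A small simulation can check this. The value of `p` below is a hypothetical example, and the function names are illustrative.

```python
import random

def expected_attempts(p_accept):
    """Mean of a geometric distribution: average draws until the first acceptance."""
    return 1.0 / p_accept

def simulate_attempts(p_accept, trials, rng):
    """Empirical average number of draws until a candidate is accepted."""
    total = 0
    for _ in range(trials):
        attempts = 1
        while rng.random() >= p_accept:
            attempts += 1
        total += attempts
    return total / trials

p = 0.4 * 0.25  # hypothetical: P(white) * P(blue eyes | white)
simulated = simulate_attempts(p, trials=20_000, rng=random.Random(42))
print(f"theory: {expected_attempts(p):.1f}, simulated: {simulated:.2f}")
```

Stacking several independent constraints multiplies their probabilities, so the expected cost grows quickly as constraints are added.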


### 5. Handling the Continuous Variable Problem

The paper explicitly notes that conditioning on exact floating-point values is "effectively impossible" with rejection sampling. PyMC solves this elegantly:
with pm.Model() as constrained_model:
    # ... define model ...
    
    # Condition on height being ~190cm (with small tolerance)
    pm.Normal('height_constraint', mu=height, sigma=1.0, observed=190)
    
    # Or use a potential for a hard constraint
    pm.Potential('height_range', 
        pm.math.switch((height > 185) & (height < 195), 0, -np.inf))
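
The soft constraint has a closed form when the prior is an ordinary (untruncated) normal, which makes it easy to see how the tolerance `sigma` controls the result. The sketch below applies the standard normal-normal conjugate update. It ignores the truncation bounds used in the model above, and `normal_posterior` is an illustrative helper name.

```python
import math

def normal_posterior(prior_mu, prior_sigma, observed, obs_sigma):
    """Posterior of x when x ~ Normal(prior_mu, prior_sigma) and observed ~ Normal(x, obs_sigma)."""
    prior_precision = 1.0 / prior_sigma ** 2
    obs_precision = 1.0 / obs_sigma ** 2
    post_var = 1.0 / (prior_precision + obs_precision)
    post_mu = post_var * (prior_precision * prior_mu + obs_precision * observed)
    return post_mu, math.sqrt(post_var)

# Male height prior from the model (mu=175, sigma=7), conditioned on "about 190 cm"
loose_mu, loose_sigma = normal_posterior(175.0, 7.0, observed=190.0, obs_sigma=1.0)
tight_mu, tight_sigma = normal_posterior(175.0, 7.0, observed=190.0, obs_sigma=0.1)
print(f"sigma=1.0 -> mean {loose_mu:.2f}, sd {loose_sigma:.2f}")
print(f"sigma=0.1 -> mean {tight_mu:.2f}, sd {tight_sigma:.2f}")
```

A smaller observation `sigma` pins the height closer to the requested value, while a larger one lets the prior pull the result back toward the population mean.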
### 6. For Real-Time Use (Games)

If you need fast sampling (without paying MCMC's warmup time), you can:

- Pre-compute samples for common constraint combinations
- Use variational inference for approximate but fast posteriors (ADVI needs continuous latent variables, so the discrete nodes would have to be marginalized out first):

with npc_model:
    approx = pm.fit(n=10000, method='advi')
    # Now sampling is instant:
    samples = approx.sample(100)

- Use PyMC's prior predictive for unconditional generation (fast):

with npc_model:
    prior_samples = pm.sample_prior_predictive(samples=1000)
### 7. Capturing Covariance the Paper Couldn't

The paper laments that hair and eye color are treated as independent. In PyMC, you can model their correlation:
with pm.Model():
    # Joint distribution over (hair, eye) with correlation
    # Using a multivariate approach or copula
    
    hair_eye_joint = pm.Categorical('hair_eye_joint',
        p=joint_hair_eye_probs)  # 2D table flattened
    
    # Or model with latent "pigmentation" variable
    pigmentation = pm.Beta('pigmentation', alpha=2, beta=2)
    
    hair_dark_prob = pigmentation
    eye_dark_prob = 0.8 * pigmentation + 0.1  # correlated
    
    hair_is_dark = pm.Bernoulli('hair_dark', p=hair_dark_prob)
    eye_is_dark = pm.Bernoulli('eye_dark', p=eye_dark_prob)
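
To check that a shared latent variable really does induce correlation, the same structure can be forward-simulated with the standard library. This sketch mirrors the Beta(2, 2) pigmentation variable and the two Bernoulli draws above. The function names and the sample size are illustrative choices.

```python
import random

def sample_hair_eye(rng):
    """Draw (hair_is_dark, eye_is_dark) from the shared-pigmentation model."""
    pigmentation = rng.betavariate(2, 2)
    hair_dark = rng.random() < pigmentation
    eye_dark = rng.random() < 0.8 * pigmentation + 0.1
    return hair_dark, eye_dark

def correlation(pairs):
    """Pearson correlation of paired values (booleans count as 0/1)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(7)
pairs = [sample_hair_eye(rng) for _ in range(20_000)]
corr = correlation(pairs)
print(f"correlation between dark hair and dark eyes: {corr:.2f}")
```

The two traits are never linked directly. The positive correlation comes entirely from both depending on `pigmentation`.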

## Summary

PyMC would improve on the paper's approach by:

- Proper conditioning on continuous variables (height, BMI, weight)
- Efficient sampling even for low-probability constraints via MCMC
- Capturing correlations between features that the paper treats as independent
- Principled uncertainty quantification: you get a full posterior, not just point samples
- Flexibility: easy to add new features or change the model structure

The trade-off is that MCMC has startup costs (model compilation and tuning), so for a real-time game you'd want to use variational inference or pre-compute samples. For the paper's use case (a tabletop RPG game master tool where brief latency is acceptable), PyMC's MCMC should work well and be more statistically rigorous, provided the compilation and tuning time (typically seconds rather than milliseconds) is tolerable or paid once up front.
USER

Uh, I don't have Python, so just do HTML and JS to give me an example of this idea in a way that runs right here.
ASSISTANT

Here's a complete HTML/JS implementation that demonstrates the Bayesian NPC generator concept with proper conditioning:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Bayesian NPC Generator</title>
    <style>
        * {
            box-sizing: border-box;
        }
        
        body {
            font-family: 'Segoe UI', system-ui, sans-serif;
            max-width: 900px;
            margin: 0 auto;
            padding: 20px;
            background: #1a1a2e;
            color: #eee;
            line-height: 1.6;
        }
        
        h1 {
            color: #7b68ee;
            border-bottom: 2px solid #7b68ee;
            padding-bottom: 10px;
        }
        
        h2 {
            color: #9370db;
            margin-top: 30px;
        }
        
        .controls {
            background: #16213e;
            padding: 20px;
            border-radius: 10px;
            margin-bottom: 20px;
        }
        
        .control-group {
            display: flex;
            flex-wrap: wrap;
            gap: 15px;
            margin-bottom: 15px;
        }
        
        .control-item {
            flex: 1;
            min-width: 150px;
        }
        
        label {
            display: block;
            margin-bottom: 5px;
            color: #aaa;
            font-size: 0.9em;
        }
        
        select, input {
            width: 100%;
            padding: 8px 12px;
            border: 1px solid #444;
            border-radius: 5px;
            background: #0f0f23;
            color: #eee;
            font-size: 1em;
        }
        
        select:focus, input:focus {
            outline: none;
            border-color: #7b68ee;
        }
        
        button {
            background: #7b68ee;
            color: white;
            border: none;
            padding: 12px 30px;
            border-radius: 5px;
            cursor: pointer;
            font-size: 1em;
            margin-right: 10px;
            margin-top: 10px;
            transition: background 0.2s;
        }
        
        button:hover {
            background: #6a5acd;
        }
        
        button.secondary {
            background: #444;
        }
        
        button.secondary:hover {
            background: #555;
        }
        
        .npc-card {
            background: linear-gradient(135deg, #1e3a5f 0%, #16213e 100%);
            border: 1px solid #2a4a7f;
            border-radius: 10px;
            padding: 25px;
            margin-bottom: 20px;
        }
        
        .npc-name {
            font-size: 1.8em;
            font-weight: bold;
            color: #7b68ee;
            margin-bottom: 5px;
        }
        
        .npc-demo {
            color: #aaa;
            margin-bottom: 15px;
            font-style: italic;
        }
        
        .npc-stats {
            display: grid;
            grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
            gap: 10px;
        }
        
        .stat {
            background: rgba(0,0,0,0.2);
            padding: 10px 15px;
            border-radius: 5px;
        }
        
        .stat-label {
            color: #888;
            font-size: 0.85em;
        }
        
        .stat-value {
            color: #fff;
            font-weight: 500;
        }
        
        .stats-panel {
            background: #16213e;
            padding: 20px;
            border-radius: 10px;
            margin-top: 20px;
        }
        
        .stats-grid {
            display: grid;
            grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
            gap: 15px;
        }
        
        .stat-box {
            background: rgba(0,0,0,0.2);
            padding: 15px;
            border-radius: 5px;
        }
        
        .stat-box h4 {
            margin: 0 0 10px 0;
            color: #7b68ee;
        }
        
        .bar-chart {
            margin-top: 10px;
        }
        
        .bar-row {
            display: flex;
            align-items: center;
            margin-bottom: 5px;
            font-size: 0.85em;
        }
        
        .bar-label {
            width: 100px;
            flex-shrink: 0;
        }
        
        .bar-container {
            flex: 1;
            height: 18px;
            background: rgba(0,0,0,0.3);
            border-radius: 3px;
            overflow: hidden;
        }
        
        .bar-fill {
            height: 100%;
            background: linear-gradient(90deg, #7b68ee, #9370db);
            transition: width 0.3s;
        }
        
        .bar-value {
            width: 50px;
            text-align: right;
            margin-left: 10px;
            color: #aaa;
        }
        
        .method-comparison {
            display: grid;
            grid-template-columns: 1fr 1fr;
            gap: 20px;
            margin-top: 20px;
        }
        
        @media (max-width: 600px) {
            .method-comparison {
                grid-template-columns: 1fr;
            }
        }
        
        .method-box {
            background: #16213e;
            padding: 20px;
            border-radius: 10px;
        }
        
        .method-box h3 {
            margin-top: 0;
            color: #7b68ee;
        }
        
        .attempt-counter {
            font-size: 2em;
            color: #7b68ee;
            text-align: center;
            padding: 20px;
        }
        
        .info-box {
            background: rgba(123, 104, 238, 0.1);
            border-left: 3px solid #7b68ee;
            padding: 15px;
            margin: 20px 0;
            border-radius: 0 5px 5px 0;
        }
        
        code {
            background: rgba(0,0,0,0.3);
            padding: 2px 6px;
            border-radius: 3px;
            font-family: 'Consolas', monospace;
        }
    </style>
</head>
<body>
    <h1>🎲 Bayesian NPC Generator</h1>
    
    <div class="info-box">
        <strong>About:</strong> This demonstrates the difference between <em>rejection sampling</em> 
        (as used in the paper) and <em>MCMC-style sampling</em> (as PyMC would use) for generating 
        demographically-plausible characters. Set constraints below and see how each method handles them.
    </div>

    <div class="controls">
        <h3>🎯 Constraints (leave as "Any" for no constraint)</h3>
        <div class="control-group">
            <div class="control-item">
                <label>Demographic Group</label>
                <select id="group">
                    <option value="">Any</option>
                    <option value="ca_resident">CA Resident</option>
                    <option value="sfpd">SF Police</option>
                    <option value="stem">STEM Worker</option>
                    <option value="unhoused">Unhoused</option>
                </select>
            </div>
            <div class="control-item">
                <label>Gender</label>
                <select id="gender">
                    <option value="">Any</option>
                    <option value="male">Male</option>
                    <option value="female">Female</option>
                </select>
            </div>
            <div class="control-item">
                <label>Ethnicity</label>
                <select id="ethnicity">
                    <option value="">Any</option>
                    <option value="white">White</option>
                    <option value="black">Black</option>
                    <option value="hispanic">Hispanic/Latino</option>
                    <option value="asian">Asian</option>
                </select>
            </div>
        </div>
        <div class="control-group">
            <div class="control-item">
                <label>Eye Color</label>
                <select id="eyes">
                    <option value="">Any</option>
                    <option value="brown">Brown</option>
                    <option value="blue">Blue</option>
                    <option value="green">Green</option>
                    <option value="hazel">Hazel</option>
                    <option value="grey">Grey</option>
                </select>
            </div>
            <div class="control-item">
                <label>Hair Color</label>
                <select id="hair">
                    <option value="">Any</option>
                    <option value="black">Black</option>
                    <option value="brown">Brown</option>
                    <option value="blonde">Blonde</option>
                    <option value="red">Red</option>
                </select>
            </div>
            <div class="control-item">
                <label>Min Height (cm)</label>
                <input type="number" id="minHeight" placeholder="e.g., 180">
            </div>
            <div class="control-item">
                <label>Max BMI</label>
                <input type="number" id="maxBmi" placeholder="e.g., 25">
            </div>
        </div>
        <div>
            <button onclick="generateNPC()">🎲 Generate NPC</button>
            <button onclick="compareMethod()" class="secondary">📊 Compare Methods (100 samples)</button>
            <button onclick="resetConstraints()" class="secondary">🔄 Reset</button>
        </div>
    </div>

    <div id="npcResult"></div>
    
    <div id="comparison"></div>

    <h2>📈 Current Probability Distributions</h2>
    <div class="stats-panel">
        <div class="stats-grid" id="distributions"></div>
    </div>

    <script>
    // ============================================
    // DEMOGRAPHIC DATA (simplified from paper)
    // ============================================
    
    const DEMOGRAPHICS = {
        // P(ethnicity | group) - from paper's sources
        ethnicity: {
            ca_resident: { white: 0.37, hispanic: 0.39, asian: 0.15, black: 0.06 },
            sfpd: { white: 0.45, hispanic: 0.20, asian: 0.25, black: 0.10 },
            stem: { white: 0.63, hispanic: 0.08, asian: 0.21, black: 0.07 },
            unhoused: { white: 0.35, hispanic: 0.22, asian: 0.06, black: 0.35 }
        },
        
        // P(gender | group)
        gender: {
            ca_resident: { male: 0.49, female: 0.51 },
            sfpd: { male: 0.82, female: 0.18 },
            stem: { male: 0.73, female: 0.27 },
            unhoused: { male: 0.70, female: 0.30 }
        },
        
        // P(eye_color | ethnicity) - with gating from paper
        eyeColor: {
            white: { brown: 0.45, blue: 0.27, grey: 0.10, hazel: 0.09, green: 0.09 },
            black: { brown: 0.85, hazel: 0.10, grey: 0.05 },
            hispanic: { brown: 0.80, hazel: 0.12, grey: 0.08 },
            asian: { brown: 0.90, hazel: 0.07, grey: 0.03 }
        },
        
        // P(hair_color | ethnicity) - with gating from paper
        hairColor: {
            white: { brown: 0.45, black: 0.15, blonde: 0.25, red: 0.15 },
            black: { black: 0.85, brown: 0.15 },
            hispanic: { black: 0.70, brown: 0.30 },
            asian: { black: 0.90, brown: 0.10 }
        },
        
        // Height parameters (mean, std) by gender in cm
        height: {
            male: { mean: 175, std: 7 },
            female: { mean: 162, std: 6.5 }
        },
        
        // BMI parameters by gender
        bmi: {
            male: { mean: 26.5, std: 5 },
            female: { mean: 26.0, std: 5.5 }
        }
    };
    
    // Name data (simplified)
    const NAMES = {
        firstName: {
            male: {
                white: ['James', 'John', 'Robert', 'Michael', 'William', 'David', 'Joseph', 'Thomas', 'Daniel', 'Matthew', 'Donald', 'Steven', 'Brian', 'Kevin', 'Eric'],
                black: ['Michael', 'James', 'Anthony', 'Marcus', 'Terrence', 'Darnell', 'Jamal', 'DeShawn', 'Malik', 'Andre', 'Tyrone', 'Lamar', 'Cedric', 'Darius', 'Jerome'],
                hispanic: ['Jose', 'Carlos', 'Miguel', 'Juan', 'Luis', 'Jorge', 'Ricardo', 'Eduardo', 'Francisco', 'Antonio', 'Manuel', 'Rafael', 'Diego', 'Alejandro', 'Fernando'],
                asian: ['David', 'Michael', 'Kevin', 'Jason', 'Brian', 'William', 'Daniel', 'Andrew', 'James', 'Ryan', 'Wei', 'Chen', 'Hiroshi', 'Kenji', 'Min']
            },
            female: {
                white: ['Mary', 'Patricia', 'Jennifer', 'Linda', 'Elizabeth', 'Susan', 'Jessica', 'Sarah', 'Karen', 'Nancy', 'Emily', 'Amanda', 'Megan', 'Lauren', 'Ashley'],
                black: ['Michelle', 'Tiffany', 'Crystal', 'Ebony', 'Jasmine', 'Latoya', 'Keisha', 'Tamika', 'Aaliyah', 'Destiny', 'Shaniqua', 'Tanisha', 'Alicia', 'Briana', 'Imani'],
                hispanic: ['Maria', 'Carmen', 'Rosa', 'Guadalupe', 'Ana', 'Sofia', 'Isabella', 'Valentina', 'Camila', 'Lucia', 'Elena', 'Gabriela', 'Mariana', 'Daniela', 'Paula'],
                asian: ['Jennifer', 'Michelle', 'Jessica', 'Lisa', 'Amy', 'Christine', 'Angela', 'Stephanie', 'Emily', 'Grace', 'Mei', 'Yuki', 'Sakura', 'Lin', 'Anh']
            }
        },
        lastName: {
            white: ['Smith', 'Johnson', 'Williams', 'Brown', 'Jones', 'Miller', 'Davis', 'Wilson', 'Anderson', 'Taylor', 'Thomas', 'Moore', 'Martin', 'Thompson', 'White', 'Harris', 'Clark', 'Lewis', 'Robinson', 'Walker', 'Hall', 'Young', 'King', 'Wright', 'Hill', 'Scott', 'Adams', 'Baker', 'Nelson', 'Carter', "O'Brien", "O'Connor", 'Murphy', 'Kelly', 'Sullivan', 'Schmidt', 'Mueller', 'Weber', 'Meyer', 'Wagner'],
            black: ['Williams', 'Johnson', 'Smith', 'Jones', 'Brown', 'Jackson', 'Davis', 'Thomas', 'Harris', 'Robinson', 'Taylor', 'Moore', 'Martin', 'Anderson', 'Thompson', 'White', 'Washington', 'Jefferson', 'Freeman', 'Banks'],
            hispanic: ['Garcia', 'Rodriguez', 'Martinez', 'Hernandez', 'Lopez', 'Gonzalez', 'Perez', 'Sanchez', 'Ramirez', 'Torres', 'Flores', 'Rivera', 'Gomez', 'Diaz', 'Reyes', 'Morales', 'Cruz', 'Ortiz', 'Gutierrez', 'Chavez'],
            asian: ['Wang', 'Li', 'Zhang', 'Liu', 'Chen', 'Yang', 'Huang', 'Wu', 'Kim', 'Park', 'Lee', 'Choi', 'Nguyen', 'Tran', 'Pham', 'Tanaka', 'Yamamoto', 'Suzuki', 'Patel', 'Shah']
        }
    };

    // ============================================
    // SAMPLING UTILITIES
    // ============================================
    
    function weightedChoice(distribution) {
        const entries = Object.entries(distribution);
        const total = entries.reduce((sum, [_, p]) => sum + p, 0);
        let r = Math.random() * total;
        
        for (const [value, prob] of entries) {
            r -= prob;
            if (r <= 0) return value;
        }
        return entries[entries.length - 1][0];
    }
    
    function randomChoice(arr) {
        return arr[Math.floor(Math.random() * arr.length)];
    }
    
    function normalRandom(mean, std) {
        // Box-Muller transform
        const u1 = 1 - Math.random();  // shift to (0, 1] so Math.log never receives 0
        const u2 = Math.random();
        const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
        return mean + z * std;
    }
    
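    function truncatedNormal(mean, std, min, max) {
        // Naive rejection loop; cap the iterations so an empty or far-tail
        // interval (e.g. a Min Height above the 210 cm ceiling) cannot hang the page
        for (let i = 0; i < 10000; i++) {
            const value = normalRandom(mean, std);
            if (value >= min && value <= max) return value;
        }
        // Fallback: clamp the mean to the nearest bound (max wins if min > max)
        return Math.min(Math.max(mean, min), max);
    }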

    // ============================================
    // REJECTION SAMPLING (Paper's approach)
    // ============================================
    
    function generateUnconstrainedNPC(group) {
        group = group || 'ca_resident';
        
        // Sample from prior distributions following the Bayes net
        const ethnicity = weightedChoice(DEMOGRAPHICS.ethnicity[group]);
        const gender = weightedChoice(DEMOGRAPHICS.gender[group]);
        
        const eyeColor = weightedChoice(DEMOGRAPHICS.eyeColor[ethnicity]);
        const hairColor = weightedChoice(DEMOGRAPHICS.hairColor[ethnicity]);
        
        const heightParams = DEMOGRAPHICS.height[gender];
        const height = truncatedNormal(heightParams.mean, heightParams.std, 140, 210);
        
        const bmiParams = DEMOGRAPHICS.bmi[gender];
        const bmi = truncatedNormal(bmiParams.mean, bmiParams.std, 16, 45);
        
        const weight = 0.0001 * bmi * height * height;
        
        const age = Math.floor(18 + Math.random() * 62);
        
        const firstName = randomChoice(NAMES.firstName[gender][ethnicity]);
        const lastName = randomChoice(NAMES.lastName[ethnicity]);
        
        return {
            group, ethnicity, gender, eyeColor, hairColor,
            height, bmi, weight, age, firstName, lastName
        };
    }
    
    function rejectionSample(constraints, maxAttempts = 10000) {
        let attempts = 0;
        
        while (attempts < maxAttempts) {
            attempts++;
            const npc = generateUnconstrainedNPC(constraints.group || null);
            
            // Check all constraints
            let valid = true;
            
            if (constraints.gender && npc.gender !== constraints.gender) valid = false;
            if (constraints.ethnicity && npc.ethnicity !== constraints.ethnicity) valid = false;
            if (constraints.eyes && npc.eyeColor !== constraints.eyes) valid = false;
            if (constraints.hair && npc.hairColor !== constraints.hair) valid = false;
            if (constraints.minHeight && npc.height < constraints.minHeight) valid = false;
            if (constraints.maxBmi && npc.bmi > constraints.maxBmi) valid = false;
            
            if (valid) {
                return { npc, attempts };
            }
        }
        
        return { npc: null, attempts };
    }

    // ============================================
    // MCMC-STYLE SAMPLING (PyMC approach)
    // ============================================
    
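    // Compute the exact posterior over ethnicity and gender given the constraints
    // using Bayes' rule (this demo enumerates instead of running a Markov chain)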
    function computePosterior(constraints) {
        const group = constraints.group || 'ca_resident';
        
        // Start with priors
        let ethnicityDist = { ...DEMOGRAPHICS.ethnicity[group] };
        let genderDist = { ...DEMOGRAPHICS.gender[group] };
        
        // If ethnicity is constrained, set it deterministically
        if (constraints.ethnicity) {
            ethnicityDist = { [constraints.ethnicity]: 1.0 };
        }
        
        // If gender is constrained, set it deterministically
        if (constraints.gender) {
            genderDist = { [constraints.gender]: 1.0 };
        }
        
        // If eye color is constrained, update ethnicity posterior
        // P(ethnicity | eye_color) ∝ P(eye_color | ethnicity) * P(ethnicity)
        if (constraints.eyes && !constraints.ethnicity) {
            const newEthDist = {};
            let total = 0;
            
            for (const [eth, prior] of Object.entries(ethnicityDist)) {
                const eyeDist = DEMOGRAPHICS.eyeColor[eth];
                const likelihood = eyeDist[constraints.eyes] || 0;
                newEthDist[eth] = likelihood * prior;
                total += newEthDist[eth];
            }
            
            // Normalize
            if (total > 0) {
                for (const eth of Object.keys(newEthDist)) {
                    newEthDist[eth] /= total;
                }
                ethnicityDist = newEthDist;
            }
        }
        
        // Similar update for hair color constraint
        if (constraints.hair && !constraints.ethnicity) {
            const newEthDist = {};
            let total = 0;
            
            for (const [eth, prior] of Object.entries(ethnicityDist)) {
                const hairDist = DEMOGRAPHICS.hairColor[eth];
                const likelihood = hairDist[constraints.hair] || 0;
                newEthDist[eth] = likelihood * prior;
                total += newEthDist[eth];
            }
            
            if (total > 0) {
                for (const eth of Object.keys(newEthDist)) {
                    newEthDist[eth] /= total;
                }
                ethnicityDist = newEthDist;
            }
        }
        
        // Height constraint affects gender posterior
        // Taller people more likely male
        if (constraints.minHeight && !constraints.gender) {
            const newGenderDist = {};
            let total = 0;
            
            for (const [g, prior] of Object.entries(genderDist)) {
                const params = DEMOGRAPHICS.height[g];
                // P(height > minHeight | gender) using normal CDF approximation
                const z = (constraints.minHeight - params.mean) / params.std;
                const likelihood = 1 - normalCDF(z);
                newGenderDist[g] = likelihood * prior;
                total += newGenderDist[g];
            }
            
            if (total > 0) {
                for (const g of Object.keys(newGenderDist)) {
                    newGenderDist[g] /= total;
                }
                genderDist = newGenderDist;
            }
        }
        
        return { ethnicityDist, genderDist, group };
    }
    
    function normalCDF(z) {
        // Approximation of standard normal CDF
        const a1 =  0.254829592;
        const a2 = -0.284496736;
        const a3 =  1.421413741;
        const a4 = -1.453152027;
        const a5 =  1.061405429;
        const p  =  0.3275911;
        
        const sign = z < 0 ? -1 : 1;
        z = Math.abs(z) / Math.sqrt(2);
        
        const t = 1.0 / (1.0 + p * z);
        const y = 1.0 - (((((a5 * t + a4) * t) + a3) * t + a2) * t + a1) * t * Math.exp(-z * z);
        
        return 0.5 * (1.0 + sign * y);
    }
    
    function mcmcSample(constraints) {
        const posterior = computePosterior(constraints);
        
        // Sample from posterior distributions
        const ethnicity = constraints.ethnicity || weightedChoice(posterior.ethnicityDist);
        const gender = constraints.gender || weightedChoice(posterior.genderDist);
        
        // Sample eye color given ethnicity (and constraint if any)
        let eyeColor;
        if (constraints.eyes) {
            eyeColor = constraints.eyes;
        } else {
            eyeColor = weightedChoice(DEMOGRAPHICS.eyeColor[ethnicity]);
        }
        
        // Sample hair color given ethnicity (and constraint if any)
        let hairColor;
        if (constraints.hair) {
            hairColor = constraints.hair;
        } else {
            hairColor = weightedChoice(DEMOGRAPHICS.hairColor[ethnicity]);
        }
        
        // Sample height given gender (with constraint if any)
        const heightParams = DEMOGRAPHICS.height[gender];
        let height;
        if (constraints.minHeight) {
            // Truncated normal above minHeight
            height = truncatedNormal(heightParams.mean, heightParams.std, constraints.minHeight, 210);
        } else {
            height = truncatedNormal(heightParams.mean, heightParams.std, 140, 210);
        }
        
        // Sample BMI given gender (with constraint if any)
        const bmiParams = DEMOGRAPHICS.bmi[gender];
        let bmi;
        if (constraints.maxBmi) {
            bmi = truncatedNormal(bmiParams.mean, bmiParams.std, 16, constraints.maxBmi);
        } else {
            bmi = truncatedNormal(bmiParams.mean, bmiParams.std, 16, 45);
        }
        
        const weight = 0.0001 * bmi * height * height;
        const age = Math.floor(18 + Math.random() * 62);
        
        const firstName = randomChoice(NAMES.firstName[gender][ethnicity]);
        const lastName = randomChoice(NAMES.lastName[ethnicity]);
        
        return {
            npc: {
                group: posterior.group,
                ethnicity, gender, eyeColor, hairColor,
                height, bmi, weight, age, firstName, lastName
            },
            attempts: 1,
            posterior
        };
    }

    // ============================================
    // UI FUNCTIONS
    // ============================================
    
    function getConstraints() {
        return {
            group: document.getElementById('group').value || null,
            gender: document.getElementById('gender').value || null,
            ethnicity: document.getElementById('ethnicity').value || null,
            eyes: document.getElementById('eyes').value || null,
            hair: document.getElementById('hair').value || null,
            minHeight: parseFloat(document.getElementById('minHeight').value) || null,
            maxBmi: parseFloat(document.getElementById('maxBmi').value) || null
        };
    }
    
    function formatHeight(cm) {
        // Round the total first so the remainder can never display as 12 (e.g. 5'12")
        const totalInches = Math.round(cm / 2.54);
        const feet = Math.floor(totalInches / 12);
        const remainingInches = totalInches % 12;
        return `${feet}'${remainingInches}" (${Math.round(cm)} cm)`;
    }
    
    function formatWeight(kg) {
        const lbs = kg * 2.205;
        return `${Math.round(lbs)} lbs (${Math.round(kg)} kg)`;
    }
    
    function renderNPC(npc, attempts, method) {
        if (!npc) {
            return `
                <div class="npc-card">
                    <div class="npc-name">❌ Failed to generate</div>
                    <div class="npc-demo">Could not find valid NPC after ${attempts} attempts</div>
                    <p>Try relaxing some constraints!</p>
                </div>
            `;
        }
        
        return `
            <div class="npc-card">
                <div class="npc-name">${npc.firstName} ${npc.lastName}</div>
                <div class="npc-demo">${npc.ethnicity} ${npc.gender}, Age ${npc.age}</div>
                <div class="npc-stats">
                    <div class="stat">
                        <div class="stat-label">Height</div>
                        <div class="stat-value">${formatHeight(npc.height)}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">Weight</div>
                        <div class="stat-value">${formatWeight(npc.weight)}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">BMI</div>
                        <div class="stat-value">${npc.bmi.toFixed(1)}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">Eyes</div>
                        <div class="stat-value">${npc.eyeColor}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">Hair</div>
                        <div class="stat-value">${npc.hairColor}</div>
                    </div>
                    <div class="stat">
                        <div class="stat-label">Group</div>
                        <div class="stat-value">${npc.group}</div>
                    </div>
                </div>
                <p style="margin-top: 15px; color: #888;">
                    Generated via <strong>${method}</strong> in <strong>${attempts}</strong> attempt(s)
                </p>
            </div>
        `;
    }
    
    function renderDistribution(title, dist) {
        const entries = Object.entries(dist).sort((a, b) => b[1] - a[1]);
        const bars = entries.map(([label, prob]) => `
            <div class="bar-row">
                <div class="bar-label">${label}</div>
                <div class="bar-container">
                    <div class="bar-fill" style="width: ${prob * 100}%"></div>
                </div>
                <div class="bar-value">${(prob * 100).toFixed(1)}%</div>
            </div>
        `).join('');
        
        return `
            <div class="stat-box">
                <h4>${title}</h4>
                <div class="bar-chart">${bars}</div>
            </div>
        `;
    }
    
    function generateNPC() {
        const constraints = getConstraints();
        
        // Use MCMC-style sampling
        const result = mcmcSample(constraints);
        
        document.getElementById('npcResult').innerHTML = renderNPC(result.npc, result.attempts, 'MCMC-style Posterior Sampling');
        
        updateDistributions(constraints);
    }
    
    function updateDistributions(constraints) {
        constraints = constraints || getConstraints();
        const posterior = computePosterior(constraints);
        
        // Get eye color distribution given current ethnicity posterior
        const eyeDist = {};
        for (const [eth, ethProb] of Object.entries(posterior.ethnicityDist)) {
            for (const [eye, eyeProb] of Object.entries(DEMOGRAPHICS.eyeColor[eth])) {
                eyeDist[eye] = (eyeDist[eye] || 0) + eyeProb * ethProb;
            }
        }
        
        // Get hair color distribution
        const hairDist = {};
        for (const [eth, ethProb] of Object.entries(posterior.ethnicityDist)) {
            for (const [hair, hairProb] of Object.entries(DEMOGRAPHICS.hairColor[eth])) {
                hairDist[hair] = (hairDist[hair] || 0) + hairProb * ethProb;
            }
        }
        
        document.getElementById('distributions').innerHTML = `
            ${renderDistribution('P(Ethnicity | constraints)', posterior.ethnicityDist)}
            ${renderDistribution('P(Gender | constraints)', posterior.genderDist)}
            ${renderDistribution('P(Eye Color | constraints)', eyeDist)}
            ${renderDistribution('P(Hair Color | constraints)', hairDist)}
        `;
    }
    
    function compareMethod() {
        const constraints = getConstraints();
        const numSamples = 100;
        
        // Rejection sampling
        let rejectionAttempts = 0;
        let rejectionSuccesses = 0;
        const rejectionEthCounts = {};
        const rejectionGenderCounts = {};
        
        for (let i = 0; i < numSamples; i++) {
            const result = rejectionSample(constraints, 1000);
            rejectionAttempts += result.attempts;
            if (result.npc) {
                rejectionSuccesses++;
                rejectionEthCounts[result.npc.ethnicity] = (rejectionEthCounts[result.npc.ethnicity] || 0) + 1;
                rejectionGenderCounts[result.npc.gender] = (rejectionGenderCounts[result.npc.gender] || 0) + 1;
            }
        }
        
        // MCMC sampling
        const mcmcEthCounts = {};
        const mcmcGenderCounts = {};
        
        for (let i = 0; i < numSamples; i++) {
            const result = mcmcSample(constraints);
            mcmcEthCounts[result.npc.ethnicity] = (mcmcEthCounts[result.npc.ethnicity] || 0) + 1;
            mcmcGenderCounts[result.npc.gender] = (mcmcGenderCounts[result.npc.gender] || 0) + 1;
        }
        
        // Normalize to distributions
        const normalize = (counts) => {
            const total = Object.values(counts).reduce((a, b) => a + b, 0);
            const dist = {};
            for (const [k, v] of Object.entries(counts)) {
                dist[k] = v / total;
            }
            return dist;
        };
        
        document.getElementById('comparison').innerHTML = `
            <h2>📊 Method Comparison (${numSamples} samples each)</h2>
            <div class="method-comparison">
                <div class="method-box">
                    <h3>❌ Rejection Sampling (Paper's Method)</h3>
                    <div class="attempt-counter">${rejectionAttempts} total attempts</div>
                    <p>Success rate: ${rejectionSuccesses}/${numSamples}</p>
                    <p>Avg attempts per success: ${rejectionSuccesses > 0 ? (rejectionAttempts / rejectionSuccesses).toFixed(1) : 'n/a'}</p>
                    ${renderDistribution('Ethnicity Distribution', normalize(rejectionEthCounts))}
                    ${renderDistribution('Gender Distribution', normalize(rejectionGenderCounts))}
                </div>
                <div class="method-box">
                    <h3>✅ MCMC-style Sampling (PyMC approach)</h3>
                    <div class="attempt-counter">${numSamples} total attempts</div>
                    <p>Success rate: 100%</p>
                    <p>Avg attempts per success: 1.0</p>
                    ${renderDistribution('Ethnicity Distribution', normalize(mcmcEthCounts))}
                    ${renderDistribution('Gender Distribution', normalize(mcmcGenderCounts))}
                </div>
            </div>
            <div class="info-box" style="margin-top: 20px;">
                <strong>Key insight:</strong> Try setting <code>Eye Color = blue</code> and/or <code>Hair = blonde</code>. 
                These are rare in non-white populations, so rejection sampling needs many attempts, while 
                MCMC-style sampling correctly updates the ethnicity posterior (making white more likely) 
                and samples efficiently. The paper itself notes that constraining continuous variables 
                (height/BMI) is "effectively impossible" with rejection sampling.
            </div>
        `;
    }
    
    function resetConstraints() {
        document.getElementById('group').value = '';
        document.getElementById('gender').value = '';
        document.getElementById('ethnicity').value = '';
        document.getElementById('eyes').value = '';
        document.getElementById('hair').value = '';
        document.getElementById('minHeight').value = '';
        document.getElementById('maxBmi').value = '';
        
        document.getElementById('npcResult').innerHTML = '';
        document.getElementById('comparison').innerHTML = '';
        updateDistributions();
    }
    
    // Initialize
    updateDistributions();
    </script>
</body>
</html>
What This Demonstrates

Try these experiments:

No constraints - Both methods work equally well
Set Eye Color = Blue - Watch rejection sampling struggle (needs ~10x more attempts) while MCMC instantly updates the ethnicity posterior to favor white
Set Hair = Blonde - Same effect
Set Min Height = 185cm - MCMC correctly shifts gender posterior toward male
Combine constraints (Blue eyes + Blonde hair + Height > 185) - Rejection sampling might need 100+ attempts; MCMC stays at 1

Key differences shown:


| Aspect | Rejection Sampling | MCMC-style |
| --- | --- | --- |
| Attempts needed | Scales with 1/P(constraints) | Always 1 |
| Posterior updates | None (just filters) | Computes P(ethnicity\|eye_color) etc. |
| Continuous constraints | Nearly impossible | Uses truncated distributions |
| Efficiency | Poor for rare combos | Constant |


The "Compare Methods" button runs 100 samples with each method so you can see the efficiency difference quantitatively!
USER

too much code to read, just explain the model behind what you implemented there
ASSISTANT

The Bayesian Model

The paper describes a Bayes net (directed graphical model) where character features depend on each other in a tree structure:
        Demographic Group
               │
       ┌───────┴───────┐
       ▼               ▼
   Ethnicity         Gender
       │               │
   ┌───┴───┐       ┌───┴───┐
   ▼       ▼       ▼       ▼
Eye Color Hair   Height   BMI
                   │       │
                   └───┬───┘
                       ▼
            Weight (deterministic)

Each arrow represents a conditional probability:

P(Ethnicity | Group) — SF police are 45% white, CA residents are 37% white, etc.
P(Gender | Group) — SF police are 82% male, STEM workers 73% male, etc.
P(Eye Color | Ethnicity) — white people can have blue/green eyes, others mostly brown
P(Hair Color | Ethnicity) — blonde/red hair gated to white ethnicity
P(Height | Gender) — males ~175cm, females ~162cm (normal distributions)
P(BMI | Gender) — similar normal distributions
Weight = 0.0001 × BMI × Height² — deterministic formula
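
The conditional structure above amounts to forward (ancestral) sampling: draw each parent before its children. Here is a minimal sketch under stated assumptions. The `TOY` tables and their numbers are hypothetical placeholders rather than the demo's `DEMOGRAPHICS` data (only the 175/162 cm means come from the list above), and `pickWeighted`, `normal`, and `forwardSample` are illustrative names.

```javascript
// Hypothetical toy tables; the probabilities and std values are placeholders.
const TOY = {
    ethnicity: { white: 0.4, asian: 0.3, hispanic: 0.3 },
    gender: { male: 0.5, female: 0.5 },
    eyeColor: {
        white: { brown: 0.5, blue: 0.3, green: 0.2 },
        asian: { brown: 1.0 },
        hispanic: { brown: 0.9, green: 0.1 },
    },
    height: { male: { mean: 175, std: 7 }, female: { mean: 162, std: 7 } },
    bmi: { male: { mean: 27, std: 4 }, female: { mean: 26, std: 5 } },
};

// Pick a key from a {label: probability} table using one uniform draw.
function pickWeighted(dist, rng = Math.random) {
    let u = rng();
    const labels = Object.keys(dist);
    for (const label of labels) {
        u -= dist[label];
        if (u < 0) return label;
    }
    return labels[labels.length - 1]; // guard against rounding error
}

// Box-Muller draw from Normal(mean, std).
function normal(mean, std, rng = Math.random) {
    const u1 = 1 - rng(); // in (0, 1], avoids log(0)
    const u2 = rng();
    return mean + std * Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Parents first, then children, then the deterministic weight node.
function forwardSample(rng = Math.random) {
    const ethnicity = pickWeighted(TOY.ethnicity, rng);
    const gender = pickWeighted(TOY.gender, rng);
    const eyeColor = pickWeighted(TOY.eyeColor[ethnicity], rng);
    const height = normal(TOY.height[gender].mean, TOY.height[gender].std, rng);
    const bmi = normal(TOY.bmi[gender].mean, TOY.bmi[gender].std, rng);
    const weight = 0.0001 * bmi * height * height; // kg, with height in cm
    return { ethnicity, gender, eyeColor, height, bmi, weight };
}
```

As a check on the weight formula: a BMI of 25 at 180 cm gives 0.0001 × 25 × 180² = 81 kg.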


The Two Sampling Approaches

Rejection Sampling (Paper's Method)


Sample forward through the net: pick ethnicity, then gender, then eye color given ethnicity, etc.
Check if result matches user's constraints
If no, throw it away and start over

Problem: If the user wants blue eyes (rare overall, ~10% of the population), you reject ~90% of samples. If they want blue eyes AND blonde hair AND a tall character, you might reject 99%+ of samples.
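
The loop described above can be sketched as follows. `sampleOnce` and `matches` are hypothetical stand-ins for the demo's forward sampler and constraint check (the demo's own `rejectionSample` is not reproduced here), and the toy sampler simply returns blue eyes 10% of the time.

```javascript
// Generic rejection loop: draw until a sample satisfies the predicate
// or the attempt budget runs out.
function rejectionLoop(sampleOnce, matches, maxAttempts = 1000) {
    for (let attempts = 1; attempts <= maxAttempts; attempts++) {
        const candidate = sampleOnce();
        if (matches(candidate)) return { npc: candidate, attempts };
    }
    return { npc: null, attempts: maxAttempts }; // gave up
}

// Toy forward sampler (hypothetical): blue eyes with probability 0.1.
const toySample = (rng = Math.random) => ({ eyes: rng() < 0.1 ? 'blue' : 'brown' });

// Average attempts over many runs; this should hover near 1 / 0.1 = 10.
function averageAttempts(runs = 2000) {
    let total = 0;
    for (let i = 0; i < runs; i++) {
        total += rejectionLoop(() => toySample(), (npc) => npc.eyes === 'blue').attempts;
    }
    return total / runs;
}
```

The number of attempts follows a geometric distribution, so its mean is 1/p for acceptance probability p. That is where the "scales with 1/P(constraints)" behavior comes from.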

MCMC/Posterior Sampling (PyMC Approach)

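Instead of sampling and then rejecting, compute the posterior first. (In this demo the posterior is computed in closed form and sampled directly; no Markov chain is actually run, hence "MCMC-style".)
If the user specifies blue eyes, use Bayes' theorem: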
P(Ethnicity | blue eyes) ∝ P(blue eyes | Ethnicity) × P(Ethnicity)

This shifts the ethnicity distribution toward white (since P(blue eyes | white) >> P(blue eyes | asian)).
Then sample ethnicity from this updated distribution. Every sample is valid — no rejection needed.
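
To make the update concrete, here is a small sketch of that Bayes step. The function name `posteriorGivenEye` and all of the probabilities are hypothetical, chosen only for illustration; the demo's own `computePosterior` is not reproduced here.

```javascript
// Posterior over ethnicity given an observed eye color:
// P(eth | eye) = P(eye | eth) * P(eth) / sum over eth' of P(eye | eth') * P(eth')
function posteriorGivenEye(prior, eyeGivenEth, eye) {
    const unnormalized = {};
    let evidence = 0; // P(eye), the normalizing constant
    for (const [eth, pEth] of Object.entries(prior)) {
        const likelihood = (eyeGivenEth[eth] && eyeGivenEth[eth][eye]) || 0;
        unnormalized[eth] = likelihood * pEth;
        evidence += unnormalized[eth];
    }
    if (evidence === 0) return { posterior: null, evidence }; // constraint is impossible
    const posterior = {};
    for (const [eth, value] of Object.entries(unnormalized)) {
        posterior[eth] = value / evidence;
    }
    return { posterior, evidence };
}

// Hypothetical numbers for illustration only.
const prior = { white: 0.4, asian: 0.3, hispanic: 0.3 };
const eyeGivenEth = {
    white: { brown: 0.5, blue: 0.3, green: 0.2 },
    asian: { brown: 0.99, blue: 0.01 },
    hispanic: { brown: 0.9, blue: 0.02, green: 0.08 },
};
const { posterior, evidence } = posteriorGivenEye(prior, eyeGivenEth, 'blue');
```

With these made-up numbers the evidence is 0.4 × 0.3 + 0.3 × 0.01 + 0.3 × 0.02 = 0.129, so the posterior puts about 93% on white. The same 0.129 is the acceptance probability rejection sampling would face, roughly 1/0.129 ≈ 7.8 attempts per NPC.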
For continuous variables like height, instead of hoping a forward sample happens to land at or above 185cm, you sample from a truncated normal that is already constrained to be ≥185cm.
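
One way to draw from a truncated normal in a single step is inverse-CDF sampling. The sketch below is an assumption about how this could be done, not the demo's own `truncatedNormal` (which is not reproduced here). It uses the Abramowitz-Stegun polynomial approximation for the error function and bisection to invert the CDF, and `normalCdf` and `sampleTruncatedNormal` are illustrative names.

```javascript
// Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation
// (absolute error around 1.5e-7, so far tails are only approximate).
function normalCdf(z) {
    const x = Math.abs(z) / Math.SQRT2;
    const t = 1 / (1 + 0.3275911 * x);
    const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
        + t * (-1.453152027 + t * 1.061405429))));
    const erf = 1 - poly * Math.exp(-x * x);
    return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Draw from Normal(mean, std) restricted to [lo, hi]:
// pick a uniform point between CDF(lo) and CDF(hi), then invert by bisection.
function sampleTruncatedNormal(mean, std, lo, hi, rng = Math.random) {
    const cdfLo = normalCdf((lo - mean) / std);
    const cdfHi = normalCdf((hi - mean) / std);
    const target = cdfLo + rng() * (cdfHi - cdfLo);
    let a = lo;
    let b = hi;
    for (let i = 0; i < 60; i++) {
        const mid = (a + b) / 2;
        if (normalCdf((mid - mean) / std) < target) a = mid;
        else b = mid;
    }
    return (a + b) / 2; // always inside [lo, hi]
}

// Usage: a height of at least 185 cm from a Normal(175, 7) prior (std is hypothetical).
const tallHeight = sampleTruncatedNormal(175, 7, 185, 210);
```

Because the search never leaves `[lo, hi]`, every draw satisfies the constraint, which is the "always 1 attempt" property for continuous variables.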

Why This Matters


| Constraint | Rejection Attempts | MCMC Attempts |
| --- | --- | --- |
| None | 1 | 1 |
| Blue eyes | ~10 | 1 |
| Blue eyes + blonde | ~50 | 1 |
| Height = 185.0 (exact) | ∞ (impossible) | 1 |


The paper explicitly admits the continuous variable problem — PyMC's approach solves it by design.
USER

what does the “compare methods” button do in the demo?
ASSISTANT

What "Compare Methods" Does

It runs 100 NPC generations using each method with your current constraints, then shows you side-by-side:


Total attempts needed — Rejection sampling might need 500+ attempts to get 100 valid NPCs, while MCMC always needs exactly 100


Success rate — Rejection sampling can fail entirely if constraints are too tight (it gives up after 1000 attempts per NPC)


The resulting distributions — Shows what ethnicity/gender mix you actually got from each method
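
The bookkeeping behind those three numbers can be sketched like this. `summarize` is a hypothetical helper, and `sampler` is assumed to be a zero-argument function returning `{ npc, attempts }`, the same shape the demo's samplers return (with `npc` set to `null` when a sampler gives up).

```javascript
// Run a sampler numSamples times, summing attempts and tallying one NPC field.
function summarize(sampler, field, numSamples = 100) {
    let totalAttempts = 0;
    let successes = 0;
    const counts = {};
    for (let i = 0; i < numSamples; i++) {
        const { npc, attempts } = sampler();
        totalAttempts += attempts;
        if (!npc) continue; // the sampler gave up on this NPC
        successes++;
        counts[npc[field]] = (counts[npc[field]] || 0) + 1;
    }
    // Turn raw counts into an empirical distribution over successful samples.
    const dist = {};
    for (const [label, n] of Object.entries(counts)) dist[label] = n / successes;
    return {
        totalAttempts,
        successes,
        avgAttempts: successes > 0 ? totalAttempts / successes : null,
        dist,
    };
}
```

Returning `null` for `avgAttempts` when nothing succeeded avoids dividing by zero under very tight constraints.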


The Key Insight

Try this experiment:

Set Eye Color = Blue
Click "Compare Methods"

You'll see something like:


| Metric | Rejection | MCMC |
| --- | --- | --- |
| Total attempts | ~800 | 100 |
| Ethnicity dist | ~90% white | ~90% white |


Both methods give similar distributions (that's correct — blue-eyed people are mostly white), but rejection sampling needed 8x more work to get there.

Now try Blue eyes + Blonde hair + Min Height 185cm:

| Metric | Rejection | MCMC |
| --- | --- | --- |
| Total attempts | ~5000+ | 100 |
| Failures | Some | None |


Rejection sampling is burning tons of CPU generating NPCs that get thrown away. MCMC computes the right posterior upfront and every sample counts.

| Aspect | Paper's Rejection Sampling | PyMC's MCMC |
| --- | --- | --- |
| Efficiency | O(1/p) where p is probability of constraints | Adapts to target distribution |
| Continuous variables | Essentially impossible to condition on exact values | Handles naturally via gradients |
| Complex constraints | Must restart from beginning | Explores constrained space efficiently |
| Inference | Only forward sampling | Can do both prior and posterior inference |
| Correlated features | Hard to capture covariance (paper notes hair+eye color issue) | Naturally captures via joint posterior |