Automated Bluesky Tech Digest with Claude Code

An automated system that generates a daily tech digest from your Bluesky timeline using Claude Code. It runs every weekday morning at 5am, which also opens your Claude API quota window early in the day.

Overview

This system fetches your Bluesky timeline, uses Claude to identify and categorize tech-related posts, and generates a beautiful HTML digest that automatically opens in your browser.

Architecture

Three-stage pipeline for reliability and efficiency:

  1. Fetch & Parse (fetch-bsky-posts.py): Fetches 500 posts from the Bluesky MCP server, filters out replies and very short posts, and parses the XML into clean JSON (~180KB)
  2. Analyze (Claude): Reads JSON, identifies tech themes, outputs structured analysis JSON (~3KB)
  3. Generate HTML (generate-digest-html.py): Transforms analysis JSON into beautiful HTML digest (~24KB)
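
Each stage can also be run by hand for debugging (script interfaces as defined below; the file names here are only examples):

    python3 ~/bin/fetch-bsky-posts.py /tmp/bsky-posts-test.json
    # have Claude write /tmp/bsky-analysis-test.json (see the prompt in bsky-digest.sh)
    python3 ~/bin/generate-digest-html.py /tmp/bsky-analysis-test.json /tmp/bsky-posts-test.json /tmp/bsky-digest-test.html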

Why separate analysis from HTML generation?

  • ✅ Claude outputs small JSON instead of large HTML (8x smaller = more reliable, lower cost)
  • ✅ Python ensures accurate post links and consistent formatting
  • ✅ Easier to debug and test each stage independently

Setup

Prerequisites

  • macOS (scheduling relies on launchd and pmset)
  • Claude Code CLI (the scripts expect it at /opt/homebrew/bin/claude)
  • A Bluesky MCP server configured in Claude Code with read access to your timeline
  • Python 3

Installation

  1. Copy scripts to ~/bin/:

    • fetch-bsky-posts.py - Fetches posts from Bluesky
    • generate-digest-html.py - Generates HTML from analysis
    • bsky-digest.sh - Main orchestration script
  2. Make scripts executable:

    chmod +x ~/bin/{fetch-bsky-posts.py,generate-digest-html.py,bsky-digest.sh}
  3. Install launchd plist:

    cp com.lizf.bsky-digest.plist ~/Library/LaunchAgents/
    launchctl load ~/Library/LaunchAgents/com.lizf.bsky-digest.plist
  4. Configure system wake (so your Mac wakes up for the 5am job):

    sudo pmset repeat wake MTWRF 04:59:00
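
To confirm the agent is loaded and the wake schedule took effect:

    launchctl list | grep com.lizf.bsky-digest
    pmset -g sched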

Testing

Run manually to test:

~/bin/bsky-digest.sh

Check logs:

tail -f /tmp/bsky-digest.log
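
When the job runs from launchd rather than your shell, stdout and stderr also land in the paths configured in the plist:

    tail /tmp/bsky-digest-launchd.stdout /tmp/bsky-digest-launchd.stderr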

Security Features

Defense in depth:

  • ✅ Working directory sandboxed to /tmp
  • ✅ Only Read,Write tools allowed (no Bash)
  • ✅ Read-only Bluesky MCP access (no posting/liking)
  • ✅ No dynamic code execution from Claude
  • ✅ Pre-vetted Python scripts only
  • ✅ --permission-mode dontAsk for headless execution

Why this matters:

  • Prevents "lethal triad" (Write + Bash + untrusted data)
  • Minimal prompt injection risk from malicious Bluesky posts
  • Predictable, auditable behavior

Output

Files generated daily:

  • /tmp/bsky-posts-YYYY-MM-DD.json - Raw posts (180KB)
  • /tmp/bsky-analysis-YYYY-MM-DD.json - Claude's analysis (3KB)
  • /tmp/bsky-tech-digest-YYYY-MM-DD.html - Final digest (24KB)
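
To spot-check a run's analysis output (a sketch assuming jq is installed; note the script stamps files with yesterday's date via date -v-1d):

    jq '.themes[] | {name, posts: (.posts | length)}' "/tmp/bsky-analysis-$(date -v-1d +%Y-%m-%d).json"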

HTML features:

  • Gradient header with stats
  • Posts organized by theme
  • Theme summaries
  • Direct links to original posts
  • Responsive design

Cost & Performance

Typical run:

  • Fetches 500 posts, of which ~400 survive parsing and filtering
  • Claude identifies 20-30 tech posts
  • Groups into 5-8 themes
  • Takes ~30 seconds total

Claude usage:

  • Input: ~180KB JSON (posts)
  • Output: ~3KB JSON (analysis)
  • Model: Haiku or Sonnet via claude -p

Use Cases

Why run at 5am?

  • Starts your Claude API quota usage window early in the day
  • Fresh tech news digest with your morning coffee
  • Automated workflow - no manual intervention needed

Customization:

  • Edit tech keywords in the prompt
  • Adjust theme count (5-8 default)
  • Change schedule in launchd plist
  • Modify HTML styling in generate-digest-html.py
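
After changing the schedule, reload the agent so launchd picks up the edit:

    launchctl unload ~/Library/LaunchAgents/com.lizf.bsky-digest.plist
    launchctl load ~/Library/LaunchAgents/com.lizf.bsky-digest.plist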

Technical Details

MCP Integration: Uses claude --mcp-cli call bluesky/get-timeline-posts to fetch posts
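
For reference, the underlying call (the same arguments fetch-bsky-posts.py passes via subprocess) looks like:

    claude --mcp-cli call bluesky/get-timeline-posts '{"count": 500, "type": "posts"}'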

XML Parsing: The Bluesky MCP server returns XML-formatted posts that are not always well-formed, so the fetch script parses them with regex rather than a strict XML parser

Claude Invocation:

echo "$PROMPT" | claude -p \
    --permission-mode dontAsk \
    --allowed-tools "Read,Write"

Credits

Created by @lizthegrey.com

Powered by Claude Code and the Bluesky MCP server

License

MIT

#!/bin/bash
# Bluesky Tech Digest Generator
# 1. Fetch posts via Python script
# 2. Call Claude to analyze and output structured JSON
# 3. Generate HTML from Claude's analysis via Python script
set -euo pipefail
# Configuration
OUTPUT_DIR="/tmp"
DATE=$(date -v-1d +%Y-%m-%d)  # digest covers yesterday (BSD/macOS date syntax)
OUTPUT_FILE="${OUTPUT_DIR}/bsky-tech-digest-${DATE}.html"
POSTS_JSON="${OUTPUT_DIR}/bsky-posts-${DATE}.json"
ANALYSIS_JSON="${OUTPUT_DIR}/bsky-analysis-${DATE}.json"
LOG_FILE="${OUTPUT_DIR}/bsky-digest.log"
FETCH_SCRIPT="/Users/lizf/bin/fetch-bsky-posts.py"
HTML_GENERATOR="/Users/lizf/bin/generate-digest-html.py"
CLAUDE_CMD="/opt/homebrew/bin/claude"
# Ensure output directory exists
mkdir -p "${OUTPUT_DIR}"
# Log execution
echo "=== Bluesky Digest Generation Started at $(date) ===" >> "${LOG_FILE}"
# Step 1: Fetch and parse posts
if [ -f "${FETCH_SCRIPT}" ]; then
echo "Fetching Bluesky posts..." | tee -a "${LOG_FILE}"
python3 "${FETCH_SCRIPT}" "${POSTS_JSON}" 2>&1 | tee -a "${LOG_FILE}"
else
echo "✗ Fetch script not found: ${FETCH_SCRIPT}" | tee -a "${LOG_FILE}"
exit 1
fi
# Check if posts were fetched
if [ ! -f "${POSTS_JSON}" ]; then
echo "✗ Failed to fetch posts" | tee -a "${LOG_FILE}"
exit 1
fi
# Step 2: Call Claude to analyze and output structured JSON
echo "Calling Claude to analyze posts..." | tee -a "${LOG_FILE}"
# Change to /tmp for sandboxing
cd "${OUTPUT_DIR}"
PROMPT="You are analyzing Bluesky posts to identify tech themes.
INPUT: Read \`$(basename ${POSTS_JSON})\` which contains ~400 posts from my Bluesky timeline.
TASK:
1. Identify posts about: technology, software engineering, infrastructure, observability, DevOps, AI/ML, databases, cloud, security
2. Group tech posts by major themes (e.g., 'LLM Security Concerns', 'Kubernetes Deployment Issues', 'Database Performance Patterns')
3. For each theme, select 3-10 most interesting posts (prioritize: high engagement, substantive content, known experts)
4. Write a structured JSON file to \`$(basename ${ANALYSIS_JSON})\`
JSON FORMAT:
{
  \"total_posts\": <number>,
  \"tech_posts\": <number>,
  \"themes\": [
    {
      \"name\": \"Theme Name\",
      \"summary\": \"1-2 sentence summary of what this theme is about\",
      \"posts\": [\"<bsky_url_1>\", \"<bsky_url_2>\", ...]
    }
  ]
}
CONSTRAINTS:
- Working directory is ${OUTPUT_DIR}
- Use Read to load posts JSON
- Output ONLY valid JSON to $(basename ${ANALYSIS_JSON})
- Include bsky_url values from the posts JSON (these are the direct links)
- Focus on signal over noise: quality > quantity
- Limit to 5-8 themes maximum
OUTPUT: Write structured JSON to \`$(basename ${ANALYSIS_JSON})\`"
echo "${PROMPT}" | "${CLAUDE_CMD}" -p \
--permission-mode dontAsk \
--allowed-tools "Read,Write" \
2>&1 | tee -a "${LOG_FILE}"
# Check if analysis was created
if [ ! -f "${ANALYSIS_JSON}" ]; then
echo "✗ Failed to create analysis JSON" | tee -a "${LOG_FILE}"
exit 1
fi
# Step 3: Generate HTML from structured analysis
echo "Generating HTML digest..." | tee -a "${LOG_FILE}"
if [ -f "${HTML_GENERATOR}" ]; then
python3 "${HTML_GENERATOR}" "${ANALYSIS_JSON}" "${POSTS_JSON}" "${OUTPUT_FILE}" 2>&1 | tee -a "${LOG_FILE}"
else
echo "✗ HTML generator not found: ${HTML_GENERATOR}" | tee -a "${LOG_FILE}"
exit 1
fi
# Check if HTML was created
if [ -f "${OUTPUT_FILE}" ]; then
echo "✓ Digest created successfully at ${OUTPUT_FILE}" | tee -a "${LOG_FILE}"
# Open in Chrome
open -a "Google Chrome" "${OUTPUT_FILE}" 2>&1 | tee -a "${LOG_FILE}"
echo "✓ Opened in Chrome" | tee -a "${LOG_FILE}"
else
echo "✗ Failed to create digest file" | tee -a "${LOG_FILE}"
exit 1
fi
echo "=== Bluesky Digest Generation Completed at $(date) ===" >> "${LOG_FILE}"
echo "" >> "${LOG_FILE}"
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.lizf.bsky-digest</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/lizf/bin/bsky-digest.sh</string>
  </array>
  <key>StartCalendarInterval</key>
  <array>
    <dict>
      <key>Weekday</key>
      <integer>1</integer>
      <key>Hour</key>
      <integer>5</integer>
      <key>Minute</key>
      <integer>0</integer>
    </dict>
    <dict>
      <key>Weekday</key>
      <integer>2</integer>
      <key>Hour</key>
      <integer>5</integer>
      <key>Minute</key>
      <integer>0</integer>
    </dict>
    <dict>
      <key>Weekday</key>
      <integer>3</integer>
      <key>Hour</key>
      <integer>5</integer>
      <key>Minute</key>
      <integer>0</integer>
    </dict>
    <dict>
      <key>Weekday</key>
      <integer>4</integer>
      <key>Hour</key>
      <integer>5</integer>
      <key>Minute</key>
      <integer>0</integer>
    </dict>
    <dict>
      <key>Weekday</key>
      <integer>5</integer>
      <key>Hour</key>
      <integer>5</integer>
      <key>Minute</key>
      <integer>0</integer>
    </dict>
  </array>
  <key>StandardOutPath</key>
  <string>/tmp/bsky-digest-launchd.stdout</string>
  <key>StandardErrorPath</key>
  <string>/tmp/bsky-digest-launchd.stderr</string>
  <key>RunAtLoad</key>
  <false/>
  <key>EnvironmentVariables</key>
  <dict>
    <key>PATH</key>
    <string>/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/homebrew/bin</string>
    <key>HOME</key>
    <string>/Users/lizf</string>
  </dict>
</dict>
</plist>
#!/usr/bin/env python3
"""
Fetch Bluesky posts and parse into clean JSON
This script just does data fetching/parsing, Claude does the analysis
"""
import json
import subprocess
import sys
import re
from pathlib import Path

def run_mcp_command(tool_name, args='{}'):
    """Run an MCP tool via claude --mcp-cli"""
    try:
        result = subprocess.run(
            ['/opt/homebrew/bin/claude', '--mcp-cli', 'call', f'bluesky/{tool_name}', args],
            capture_output=True,
            text=True,
            timeout=60
        )
        if result.returncode != 0:
            print(f"Error calling {tool_name}: {result.stderr}", file=sys.stderr)
            return None
        return json.loads(result.stdout)
    except Exception as e:
        print(f"Exception calling {tool_name}: {e}", file=sys.stderr)
        return None


def clean_text(text):
    """Clean and normalize text content"""
    if not text:
        return ""
    # Remove extra whitespace while preserving line breaks
    lines = [line.strip() for line in text.strip().split('\n')]
    return '\n'.join(line for line in lines if line)


def parse_xml_posts(xml_text):
    """Parse posts from XML using regex (more robust than XML parsing)"""
    posts = []
    try:
        # Pattern for top-level posts (not replies, not nested in reposts)
        # Match: <post type="..." uri="..." bsky_url="..." author_name="..." author_handle="..." posted_at="...">
        post_pattern = r'<post\s+type="([^"]*)"[^>]*bsky_url="([^"]*)"[^>]*author_name="([^"]*)"[^>]*author_handle="([^"]*)"[^>]*posted_at="([^"]*)"[^>]*>\s*<content>\s*(.*?)\s*</content>'
        # Pattern for reposts containing posts
        repost_pattern = r'<repost[^>]*author_handle="([^"]*)"[^>]*>\s*<post\s+type="([^"]*)"[^>]*bsky_url="([^"]*)"[^>]*author_name="([^"]*)"[^>]*author_handle="([^"]*)"[^>]*posted_at="([^"]*)"[^>]*>\s*<content>\s*(.*?)\s*</content>'
        # Find all non-reply posts
        for match in re.finditer(post_pattern, xml_text, re.DOTALL):
            post_type, bsky_url, author_name, author_handle, posted_at, content = match.groups()
            # Skip replies
            if 'reply' in post_type:
                continue
            # Clean content
            content = clean_text(content)
            if content and len(content) > 10:  # Ignore very short posts
                posts.append({
                    'type': post_type,
                    'bsky_url': bsky_url,
                    'author_name': author_name,
                    'author_handle': author_handle,
                    'posted_at': posted_at,
                    'content': content,
                })
        # Find reposted posts
        for match in re.finditer(repost_pattern, xml_text, re.DOTALL):
            reposted_by, post_type, bsky_url, author_name, author_handle, posted_at, content = match.groups()
            # Skip replies
            if 'reply' in post_type:
                continue
            # Clean content
            content = clean_text(content)
            if content and len(content) > 10:
                posts.append({
                    'type': post_type,
                    'bsky_url': bsky_url,
                    'author_name': author_name,
                    'author_handle': author_handle,
                    'posted_at': posted_at,
                    'content': content,
                    'reposted_by': reposted_by,
                })
    except Exception as e:
        print(f"Error parsing posts: {e}", file=sys.stderr)
        import traceback
        traceback.print_exc()
    return posts


def main():
    output_file = sys.argv[1] if len(sys.argv) > 1 else '/tmp/bsky-posts.json'
    print("Fetching Bluesky timeline (500 posts)...")
    result = run_mcp_command('get-timeline-posts', '{"count": 500, "type": "posts"}')
    if not result:
        print("Failed to fetch timeline", file=sys.stderr)
        sys.exit(1)
    # Extract XML text from MCP response
    xml_text = ''
    if 'content' in result and isinstance(result['content'], list):
        for item in result['content']:
            if item.get('type') == 'text':
                xml_text = item.get('text', '')
                break
    if not xml_text:
        print("No XML content found in response", file=sys.stderr)
        sys.exit(1)
    print("Parsing posts...")
    posts = parse_xml_posts(xml_text)
    if not posts:
        print("No posts extracted", file=sys.stderr)
        sys.exit(1)
    print(f"Successfully parsed {len(posts)} posts")
    # Save to JSON
    Path(output_file).write_text(json.dumps(posts, indent=2))
    print(f"✓ Saved to {output_file}")


if __name__ == '__main__':
    main()
#!/usr/bin/env python3
"""
Generate HTML digest from structured analysis output
Takes Claude's theme/post analysis and generates clean HTML
"""
import json
import sys
from pathlib import Path
from html import escape
from datetime import datetime
def generate_html(analysis, all_posts, output_file):
    """Generate HTML from structured analysis"""
    # Create lookup dict for posts
    posts_by_url = {post['bsky_url']: post for post in all_posts}
    # Start HTML
    html = f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Bluesky Tech Digest - {datetime.now().strftime('%Y-%m-%d')}</title>
<style>
* {{ margin: 0; padding: 0; box-sizing: border-box; }}
body {{
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
line-height: 1.6;
color: #1a1a1a;
background: #f5f7fa;
}}
.container {{
max-width: 1000px;
margin: 0 auto;
padding: 20px;
}}
header {{
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 40px;
border-radius: 12px;
margin-bottom: 30px;
box-shadow: 0 4px 6px rgba(0,0,0,0.1);
}}
h1 {{
font-size: 2.5em;
margin-bottom: 10px;
font-weight: 600;
}}
.subtitle {{
font-size: 1.1em;
opacity: 0.95;
}}
.stats {{
background: white;
padding: 20px;
border-radius: 10px;
margin-bottom: 30px;
box-shadow: 0 2px 4px rgba(0,0,0,0.05);
display: flex;
gap: 30px;
flex-wrap: wrap;
}}
.stat {{
flex: 1;
min-width: 150px;
}}
.stat-number {{
font-size: 2em;
font-weight: bold;
color: #667eea;
}}
.stat-label {{
color: #666;
font-size: 0.9em;
}}
.theme-section {{
background: white;
border-radius: 12px;
padding: 30px;
margin-bottom: 30px;
box-shadow: 0 2px 4px rgba(0,0,0,0.05);
}}
.theme-header {{
display: flex;
align-items: center;
margin-bottom: 20px;
padding-bottom: 15px;
border-bottom: 3px solid #667eea;
}}
.theme-title {{
font-size: 1.8em;
font-weight: 600;
color: #667eea;
flex: 1;
}}
.theme-count {{
background: #667eea;
color: white;
padding: 5px 15px;
border-radius: 20px;
font-weight: 600;
}}
.theme-summary {{
color: #555;
font-size: 1.05em;
margin-bottom: 25px;
line-height: 1.7;
font-style: italic;
}}
.post-card {{
background: #fafbfc;
border-left: 4px solid #667eea;
padding: 20px;
margin-bottom: 20px;
border-radius: 8px;
transition: transform 0.2s, box-shadow 0.2s;
}}
.post-card:hover {{
transform: translateX(5px);
box-shadow: 0 4px 8px rgba(0,0,0,0.1);
}}
.post-author {{
font-weight: 600;
color: #667eea;
margin-bottom: 8px;
display: flex;
align-items: center;
gap: 10px;
}}
.post-handle {{
color: #888;
font-weight: normal;
font-size: 0.9em;
}}
.post-time {{
color: #999;
font-size: 0.85em;
}}
.post-content {{
color: #2d3748;
margin: 15px 0;
line-height: 1.7;
white-space: pre-wrap;
}}
.post-link {{
display: inline-block;
color: #764ba2;
text-decoration: none;
font-weight: 500;
margin-top: 10px;
transition: color 0.2s;
}}
.post-link:hover {{
color: #667eea;
text-decoration: underline;
}}
.post-link::after {{
content: " →";
}}
footer {{
text-align: center;
padding: 30px;
color: #666;
font-size: 0.9em;
}}
@media (max-width: 768px) {{
.container {{ padding: 10px; }}
header {{ padding: 25px; }}
h1 {{ font-size: 1.8em; }}
.stats {{ flex-direction: column; gap: 15px; }}
}}
</style>
</head>
<body>
<div class="container">
<header>
<h1>📡 Bluesky Tech Digest</h1>
<div class="subtitle">Generated {datetime.now().strftime('%A, %B %d, %Y at %I:%M %p')}</div>
</header>
<div class="stats">
<div class="stat">
<div class="stat-number">{analysis.get('total_posts', 0)}</div>
<div class="stat-label">Posts Analyzed</div>
</div>
<div class="stat">
<div class="stat-number">{analysis.get('tech_posts', 0)}</div>
<div class="stat-label">Tech Posts</div>
</div>
<div class="stat">
<div class="stat-number">{len(analysis.get('themes', []))}</div>
<div class="stat-label">Themes</div>
</div>
</div>
"""
    # Generate theme sections
    for theme in analysis.get('themes', []):
        theme_name = escape(theme.get('name', 'Untitled Theme'))
        theme_summary = escape(theme.get('summary', ''))
        post_urls = theme.get('posts', [])
        html += f"""
<div class="theme-section">
  <div class="theme-header">
    <div class="theme-title">{theme_name}</div>
    <div class="theme-count">{len(post_urls)}</div>
  </div>
"""
        if theme_summary:
            html += f"""
  <div class="theme-summary">{theme_summary}</div>
"""
        # Add posts for this theme
        for post_url in post_urls:
            post = posts_by_url.get(post_url)
            if not post:
                continue
            author_name = escape(post.get('author_name', ''))
            author_handle = escape(post.get('author_handle', ''))
            posted_at = post.get('posted_at', '')
            content = escape(post.get('content', ''))
            # Truncate long content
            if len(content) > 600:
                content = content[:600] + '...'
html += f"""
<div class="post-card">
<div class="post-author">
<span>{author_name}</span>
<span class="post-handle">@{author_handle}</span>
<span class="post-time">• {posted_at}</span>
</div>
<div class="post-content">{content}</div>
<a href="{escape(post_url)}" class="post-link" target="_blank">View on Bluesky</a>
</div>
"""
html += " </div>\n"
    # Footer
    html += """
<footer>
  Generated with Claude Code • <a href="https://bsky.app/profile/lizthegrey.com" style="color: #667eea;">@lizthegrey.com</a>
</footer>
</div>
</body>
</html>
"""
    Path(output_file).write_text(html)
    print(f"✓ Generated HTML digest: {output_file}")

def main():
    if len(sys.argv) < 4:
        print("Usage: generate-digest-html.py <analysis.json> <posts.json> <output.html>")
        sys.exit(1)
    analysis_file = sys.argv[1]
    posts_file = sys.argv[2]
    output_file = sys.argv[3]
    # Load analysis from Claude
    analysis = json.loads(Path(analysis_file).read_text())
    # Load all posts
    all_posts = json.loads(Path(posts_file).read_text())
    # Generate HTML
    generate_html(analysis, all_posts, output_file)


if __name__ == '__main__':
    main()