@mrpollo
Last active March 10, 2026 03:56
Proposal: Move /dbinfo to S3 + CloudFront


Problem

The /dbinfo endpoint on logs.px4.io is completely non-functional. It queries SQLite for all ~353k public logs using an N+1 query pattern (one SELECT per log to join generated data), serializes ~205 MB of JSON in-process, and attempts to serve it synchronously through Tornado. Every single request exceeds nginx's proxy timeout and returns a 504.

Over the 14.5 days of available nginx logs (Feb 24 – Mar 10, 2026), there were 1,679 requests to /dbinfo. Of these, 1,618 (96.4%) returned 504 gateway timeout, 41 returned 499 (client closed connection while waiting), and the remainder were 404/500/502 errors. Zero requests returned a successful 200 response. The endpoint is 100% broken.

A single IP address (122.179.88.65, using python-requests/2.32.5) accounts for 89% of all /dbinfo traffic. It polls 24/7 in a retry pattern: 5 requests at ~1 minute intervals, a 10-minute pause, then repeats. This is consistent with a client that sends a request, waits 60 seconds for the 504 timeout, retries up to 5 times, backs off, and starts over. In total, 42 unique IPs attempted to use the endpoint during the log period, including some browser users, all unsuccessfully.

While the bandwidth impact is negligible (0.3 MB total, since all responses are small error pages), the backend cost is significant. Each request occupies a Tornado worker for the full 60-second timeout duration before nginx gives up. Across 1,679 requests, that amounts to roughly 28 hours of cumulative worker time wasted over 14.5 days — on an endpoint that never returns data. With only 2 Tornado workers available, this directly impacts the server's ability to handle legitimate requests like log uploads, plot rendering, and browsing.
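The cumulative worker-time figure can be sanity-checked directly from the numbers above:

```python
# Sanity check of the wasted-worker-time estimate: each of the 1,679
# requests holds a Tornado worker for the full 60 s nginx proxy timeout.
requests = 1679
timeout_s = 60
wasted_hours = requests * timeout_s / 3600
print('%.1f hours' % wasted_hours)  # ~28.0 hours over 14.5 days
```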

Proposed solution

Remove /dbinfo from the live request path entirely and serve the data as a static JSON file from S3 via CloudFront. The application code stays untouched.

Generator script. A standalone Python script (tools/generate_dbinfo_json.py, already written) runs on the EC2 instance as a cron job every 6 hours. It connects directly to the SQLite database, runs the same queries the Tornado handler does, and writes the JSON to a local temp file. It then uploads the result to S3 via aws s3 cp. The script uses atomic file writes (temp file + rename) to avoid partial output. Example cron entry:

0 */6 * * * python3 /home/ubuntu/flight_review/tools/generate_dbinfo_json.py \
    /home/ubuntu/logs.sqlite /tmp/dbinfo.json \
    && aws s3 cp /tmp/dbinfo.json s3://BUCKET/dbinfo.json \
       --content-type application/json

S3 storage. The generated JSON file is uploaded to S3 (either a new bucket or a prefix in the existing logs bucket). Public read access is not required since CloudFront will use an Origin Access Identity to fetch from S3.
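As a sketch of the access setup (bucket name and OAI ID are placeholders), the bucket policy granting the CloudFront Origin Access Identity read access to just this object would look roughly like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity E1EXAMPLE"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::BUCKET/dbinfo.json"
    }
  ]
}
```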

CloudFront distribution. A CloudFront distribution sits in front of the S3 object with a TTL of 6 hours matching the cron interval. Repeated polling hits CloudFront edge caches instead of S3. The cost is minimal: one origin fetch per 6 hours per edge location, and CloudFront egress is cheaper per GB than both direct S3 and EC2 egress.
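To keep edge caching aligned with the cron interval, the upload step could also set Cache-Control explicitly rather than relying on the distribution's default TTL (a sketch; BUCKET is a placeholder, 21600 s = 6 h):

```
aws s3 cp /tmp/dbinfo.json s3://BUCKET/dbinfo.json \
    --content-type application/json \
    --cache-control "public, max-age=21600"
```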

nginx redirect. In the production nginx config (default_ssl.conf), add a location block that returns a 302 redirect to the CloudFront URL:

location = /dbinfo {
    return 302 https://CLOUDFRONT_DOMAIN/dbinfo.json;
}

This is the only production config change required. The Tornado handler never sees /dbinfo requests anymore.

Expected outcome

The EC2 instance stops wasting worker time on requests that always time out. The ~28 hours of wasted worker time per 14.5 days is eliminated. The database is no longer queried by external polling traffic. Consumers of /dbinfo start getting successful responses again (the endpoint has been returning 0% success for at least 2 weeks). Data staleness is at most 6 hours, which is acceptable for a log index. Egress cost shifts from EC2 to CloudFront edge, which is cheaper per GB. The application codebase requires zero changes, and self-hosted deployments are completely unaffected since they don't have the nginx redirect.

Considerations

The 302 redirect changes the response behavior from inline JSON to a redirect. Any consumer that doesn't follow HTTP redirects would need updating. The dominant poller (python-requests) follows redirects by default, so it should work transparently. CORS headers may need to be configured on the CloudFront distribution if any browser-based JavaScript clients call /dbinfo directly. The generator script replicates the same N+1 query pattern as the original handler for output compatibility, but since it runs offline as a cron job, the query time (several seconds) is irrelevant.
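The redirect-following behavior can be demonstrated locally. In this hypothetical demo, one stdlib server plays the role of nginx (returning the 302) and another plays CloudFront (serving the JSON); urllib follows the redirect transparently, just as python-requests does by default:

```python
import http.server
import json
import threading
import urllib.request

# Hypothetical local stand-ins: JsonHandler acts as CloudFront serving
# dbinfo.json, RedirectHandler acts as nginx issuing the 302. Ports are
# chosen by the OS; nothing here touches the real logs.px4.io.

class JsonHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps([{'log_id': 'example'}]).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence request logging
        pass

class RedirectHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(302)
        self.send_header('Location',
                         'http://127.0.0.1:%d/dbinfo.json' % json_port)
        self.end_headers()

    def log_message(self, *args):
        pass

json_srv = http.server.HTTPServer(('127.0.0.1', 0), JsonHandler)
json_port = json_srv.server_address[1]
redirect_srv = http.server.HTTPServer(('127.0.0.1', 0), RedirectHandler)
for srv in (json_srv, redirect_srv):
    threading.Thread(target=srv.serve_forever, daemon=True).start()

url = 'http://127.0.0.1:%d/dbinfo' % redirect_srv.server_address[1]
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)
print(data)  # the 302 is invisible to the consumer; it sees the JSON
```

A client with redirect-following disabled would instead see the 302 and an empty body, which is the case that needs a consumer-side update.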

tools/generate_dbinfo_json.py:

#!/usr/bin/env python3
"""
Standalone script to generate the /dbinfo JSON file from the SQLite database.

Produces output identical to the /dbinfo Tornado endpoint but writes it to a
file instead of serving it over HTTP. Intended to run as a cron job, with the
output uploaded to S3 and served via CloudFront.

Usage:
    python3 generate_dbinfo_json.py /path/to/logs.sqlite /path/to/output.json

    # With S3 upload:
    python3 generate_dbinfo_json.py /path/to/logs.sqlite /tmp/dbinfo.json \
        && aws s3 cp /tmp/dbinfo.json s3://BUCKET/dbinfo.json \
           --content-type application/json
"""
import json
import os
import sqlite3
import sys
import tempfile
import urllib.request
import xml.etree.ElementTree

AIRFRAMES_URL = 'https://px4-travis.s3.amazonaws.com/Firmware/master/_general/airframes.xml'


def download_airframes_xml():
    """Download airframes.xml from PX4 S3 and parse into a dict mapping
    autostart_id -> {name, type}."""
    airframes = {}
    try:
        print('Downloading airframes.xml...')
        with tempfile.NamedTemporaryFile(suffix='.xml', delete=True) as tmp:
            urllib.request.urlretrieve(AIRFRAMES_URL, tmp.name)
            root = xml.etree.ElementTree.parse(tmp.name).getroot()
            for group in root.findall('airframe_group'):
                for airframe in group.findall('airframe'):
                    aid = airframe.get('id')
                    entry = {'name': airframe.get('name', '')}
                    type_elem = airframe.find('type')
                    if type_elem is not None and type_elem.text:
                        entry['type'] = type_elem.text
                    airframes[aid] = entry
        print('Loaded %d airframes' % len(airframes))
    except Exception as e:
        print('Warning: failed to download airframes.xml: %s' % e,
              file=sys.stderr)
        print('Airframe names will be empty')
    return airframes


def generate(db_path, airframes):
    """Query the database and return the JSON-serializable list."""
    con = sqlite3.connect(db_path, detect_types=sqlite3.PARSE_DECLTYPES)
    cur = con.cursor()

    # Vehicle UUID -> name mapping
    cur.execute('SELECT UUID, Name FROM Vehicle')
    vehicle_table = {row[0]: row[1] for row in cur.fetchall()}

    # All public non-CI logs
    cur.execute(
        'SELECT Id, Date, Description, WindSpeed, Rating, VideoUrl, '
        'ErrorLabels, Source, Feedback, Type '
        'FROM Logs WHERE Public = 1 AND NOT Source = "CI"'
    )
    logs = cur.fetchall()

    jsonlist = []
    for row in logs:
        log_id = row[0]

        # Get generated data via a second query (same N+1 pattern as the
        # original handler — a JOIN would be faster but we want identical
        # output for now)
        cur.execute('SELECT * FROM LogsGenerated WHERE Id = ?', [log_id])
        gen = cur.fetchone()
        if gen is None:
            continue

        # Parse fields exactly as the app does
        error_labels = sorted(
            [int(x) for x in row[6].split(',') if len(x) > 0]
        ) if row[6] else []
        flight_modes = {int(x) for x in gen[9].split(',') if len(x) > 0}
        flight_mode_durations = [
            tuple(map(int, x.split(':')))
            for x in gen[12].split(',') if len(x) > 0
        ]
        vehicle_uuid = gen[11] if gen[11] else ''
        sys_autostart_id = int(gen[4]) if gen[4] else 0
        airframe_data = airframes.get(str(sys_autostart_id))
        airframe_name = airframe_data.get('name', '') if airframe_data else ''
        airframe_type = (
            airframe_data.get('type', sys_autostart_id)
            if airframe_data else sys_autostart_id
        )

        entry = {
            # From Logs table
            'log_id': log_id,
            'log_date': row[1].strftime('%Y-%m-%d'),
            'description': row[2] if row[2] else '',
            'feedback': row[8] if row[8] else '',
            'type': row[9] if row[9] else 'personal',
            'wind_speed': row[3] if row[3] is not None else -1,
            'rating': row[4] if row[4] else '',
            'video_url': row[5] if row[5] else '',
            'error_labels': error_labels,
            'source': row[7] if row[7] else '',
            # From LogsGenerated table
            'duration_s': int(gen[1]) if gen[1] else 0,
            'mav_type': gen[2] if gen[2] else '',
            'estimator': gen[3] if gen[3] else '',
            'sys_autostart_id': sys_autostart_id,
            'sys_hw': gen[5] if gen[5] else '',
            'ver_sw': gen[6] if gen[6] else '',
            'ver_sw_release': gen[10] if gen[10] else '',
            'num_logged_errors': gen[7] if gen[7] else 0,
            'num_logged_warnings': gen[8] if gen[8] else 0,
            'flight_modes': list(flight_modes),
            'vehicle_uuid': vehicle_uuid,
            'flight_mode_durations': flight_mode_durations,
            # Derived
            'vehicle_name': vehicle_table.get(vehicle_uuid, ''),
            'airframe_name': airframe_name,
            'airframe_type': airframe_type,
        }
        jsonlist.append(entry)

    cur.close()
    con.close()
    return jsonlist


def main():
    if len(sys.argv) < 3:
        print('Usage: %s <db_path> <output_json_path>' % sys.argv[0],
              file=sys.stderr)
        sys.exit(1)

    db_path = sys.argv[1]
    output_path = sys.argv[2]
    if not os.path.exists(db_path):
        print('Error: database not found: %s' % db_path, file=sys.stderr)
        sys.exit(1)

    airframes = download_airframes_xml()

    print('Querying database: %s' % db_path)
    jsonlist = generate(db_path, airframes)
    print('Generated %d entries' % len(jsonlist))

    json_data = json.dumps(jsonlist)
    print('JSON size: %.1f MB' % (len(json_data) / (1024 * 1024)))

    # Write atomically: temp file then rename
    tmp_path = output_path + '.tmp'
    with open(tmp_path, 'w', encoding='utf-8') as f:
        f.write(json_data)
    os.replace(tmp_path, output_path)
    print('Written to: %s' % output_path)


if __name__ == '__main__':
    main()