Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save jmealo/65385bc229bf1cadbb176f69c7c338d6 to your computer and use it in GitHub Desktop.

Select an option

Save jmealo/65385bc229bf1cadbb176f69c7c338d6 to your computer and use it in GitHub Desktop.
EXPLAIN ANALYZE queries for notification-sender bottleneck — real prod IDs from 2026-03-06

EXPLAIN ANALYZE Queries for Notification-Sender Hot Path

Source: Production logs, 2026-03-06 08:44 UTC Charter org_id: 9de5d801-235a-4451-8c89-d2c3974c71e8

All queries use real IDs extracted from prod notification-sender logs.

WARNING: Run these inside a BEGIN; ... ROLLBACK; transaction on a read replica if possible. EXPLAIN ANALYZE actually executes the query.


Real IDs from Production Logs

intel_request_ids (from get_intel_request timing logs)

intel_request_id Duration Org
c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436 4.801s Charter
b7c4f8c0-6f9a-4044-886a-f35266a0f746 4.410s Charter
336fc99b-b743-4cb3-9732-cdb749ac0d56 2.250s Charter
6a6a858c-69fc-46e2-9b5b-8b1206ff773c 1.958s Charter
00cb72d3-4676-471a-b84c-5e7bb36fff30 1.775s Charter
f5f7eb3a-1b30-43e5-90ed-bc241875a87b 1.487s Charter

incident_ids (from get_incident timing logs)

incident_id Duration Org
b2ee02bd-798a-4961-8505-2df53db42634 4.874s 953fd8c4-1b66-438e-b547-4b1d6734ab94 (non-Charter)
d280fc24-eab8-4cdf-9a2c-0be04c3b9bf9 1.981s Charter
f32ade91-44db-44eb-98da-91b3c0fc2dbd 1.729s Charter

AAA Organization Calls

Every message calls get_organization(organization_id) via the AAA API:

  • Endpoint: GET /organizations/9de5d801-235a-4451-8c89-d2c3974c71e8
  • Duration: 49-55ms consistently (not a bottleneck)
  • Frequency: Called once per message, after generate_context_data() completes
  • Note: Independent of context generation — could be parallelized for ~50ms savings
  • No SQL here — this is an HTTP call to the AAA service, not a DB query

Query 1: Recursive Dependencies CTE (THE #1 BOTTLENECK)

Source: gisual-data-mixins/models.py:219-272 Called by: GET /intel-requests/{id}?get_recursive_dependencies=True&recursive_dependency_type=dependents&recursive_record_types=intel_requests Database: app (schema: dependencies) Observed latency: 1.5-4.8s for the API call (includes HTTP overhead + this query + direct deps query)

Use the slowest intel_request_id first (c4fd4a5d):

-- Query 1a: Recursive dependents for intel_request (SLOWEST ID: 4.801s API time)
-- This is the exact query from models.py:219-272 with direction=dependents
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
WITH RECURSIVE dependents (
    dependent_type, dependent_id, blocker_type, blocker_id,
    created, deleted, dependency_distance, dependency_path
) AS (
    -- Base case: direct dependents of this intel_request
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           1 AS dependency_distance,
           ARRAY[ROW(d.dependent_type, d.dependent_id,
                     d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
     WHERE d.blocker_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'
       AND d.blocker_type = 'intel_requests'
       AND d.deleted IS NULL

    UNION ALL

    -- Recursive case: follow the graph downstream
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           rd.dependency_distance + 1 AS dependency_distance,
           rd.dependency_path || ARRAY[ROW(d.dependent_type, d.dependent_id,
                                           d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
      JOIN dependents AS rd
           ON d.blocker_id = rd.dependent_id
          AND d.blocker_type = rd.dependent_type
     WHERE NOT ROW(d.dependent_type, d.dependent_id,
                   d.blocker_type, d.blocker_id) = ANY(rd.dependency_path)
       AND d.deleted IS NULL
)
SELECT * FROM dependents;
ROLLBACK;
CTE Scan on dependents  (cost=37329.42..37335.36 rows=297 width=148) (actual time=0.015..0.016 rows=0 loops=1)
  Buffers: shared hit=3
  CTE dependents
    ->  Recursive Union  (cost=0.42..37329.42 rows=297 width=112) (actual time=0.015..0.015 rows=0 loops=1)
          Buffers: shared hit=3
          ->  Index Scan using dep_blocker_live_idx on dependencies d  (cost=0.42..22.56 rows=17 width=112) (actual time=0.014..0.014 rows=0 loops=1)
                Index Cond: ((blocker_type = 'intel_requests'::text) AND (blocker_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'::uuid))
                Buffers: shared hit=3
          ->  Nested Loop  (cost=0.42..3730.09 rows=28 width=112) (actual time=0.001..0.001 rows=0 loops=1)
                ->  WorkTable Scan on dependents rd  (cost=0.00..3.40 rows=170 width=84) (actual time=0.000..0.001 rows=0 loops=1)
                ->  Index Scan using dep_blocker_live_idx on dependencies d_1  (cost=0.42..21.78 rows=14 width=76) (never executed)
                      Index Cond: ((blocker_type = rd.dependent_type) AND (blocker_id = rd.dependent_id))
"                      Filter: (ROW(dependent_type, dependent_id, blocker_type, blocker_id) <> ALL (rd.dependency_path))"
Planning:
  Buffers: shared hit=719
Planning Time: 2.036 ms
Execution Time: 0.104 ms
-- Query 1b: Same query with the 2nd-slowest ID (4.410s)
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
WITH RECURSIVE dependents (
    dependent_type, dependent_id, blocker_type, blocker_id,
    created, deleted, dependency_distance, dependency_path
) AS (
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           1 AS dependency_distance,
           ARRAY[ROW(d.dependent_type, d.dependent_id,
                     d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
     WHERE d.blocker_id = 'b7c4f8c0-6f9a-4044-886a-f35266a0f746'
       AND d.blocker_type = 'intel_requests'
       AND d.deleted IS NULL
    UNION ALL
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           rd.dependency_distance + 1 AS dependency_distance,
           rd.dependency_path || ARRAY[ROW(d.dependent_type, d.dependent_id,
                                           d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
      JOIN dependents AS rd
           ON d.blocker_id = rd.dependent_id
          AND d.blocker_type = rd.dependent_type
     WHERE NOT ROW(d.dependent_type, d.dependent_id,
                   d.blocker_type, d.blocker_id) = ANY(rd.dependency_path)
       AND d.deleted IS NULL
)
SELECT * FROM dependents;
ROLLBACK;
CTE Scan on dependents  (cost=37329.42..37335.36 rows=297 width=148) (actual time=0.011..0.012 rows=0 loops=1)
  Buffers: shared hit=3
  CTE dependents
    ->  Recursive Union  (cost=0.42..37329.42 rows=297 width=112) (actual time=0.011..0.011 rows=0 loops=1)
          Buffers: shared hit=3
          ->  Index Scan using dep_blocker_live_idx on dependencies d  (cost=0.42..22.56 rows=17 width=112) (actual time=0.010..0.010 rows=0 loops=1)
                Index Cond: ((blocker_type = 'intel_requests'::text) AND (blocker_id = 'b7c4f8c0-6f9a-4044-886a-f35266a0f746'::uuid))
                Buffers: shared hit=3
          ->  Nested Loop  (cost=0.42..3730.09 rows=28 width=112) (actual time=0.001..0.001 rows=0 loops=1)
                ->  WorkTable Scan on dependents rd  (cost=0.00..3.40 rows=170 width=84) (actual time=0.000..0.001 rows=0 loops=1)
                ->  Index Scan using dep_blocker_live_idx on dependencies d_1  (cost=0.42..21.78 rows=14 width=76) (never executed)
                      Index Cond: ((blocker_type = rd.dependent_type) AND (blocker_id = rd.dependent_id))
"                      Filter: (ROW(dependent_type, dependent_id, blocker_type, blocker_id) <> ALL (rd.dependency_path))"
Planning:
  Buffers: shared hit=2
Planning Time: 0.526 ms
Execution Time: 0.049 ms
-- Query 1c: Same CTE but WITH max_distance=5 (proposed fix)
-- Compare this plan to 1a/1b to see the impact of depth limiting
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
WITH RECURSIVE dependents (
    dependent_type, dependent_id, blocker_type, blocker_id,
    created, deleted, dependency_distance, dependency_path
) AS (
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           1 AS dependency_distance,
           ARRAY[ROW(d.dependent_type, d.dependent_id,
                     d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
     WHERE d.blocker_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'
       AND d.blocker_type = 'intel_requests'
       AND d.deleted IS NULL
    UNION ALL
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           rd.dependency_distance + 1 AS dependency_distance,
           rd.dependency_path || ARRAY[ROW(d.dependent_type, d.dependent_id,
                                           d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
      JOIN dependents AS rd
           ON d.blocker_id = rd.dependent_id
          AND d.blocker_type = rd.dependent_type
     WHERE NOT ROW(d.dependent_type, d.dependent_id,
                   d.blocker_type, d.blocker_id) = ANY(rd.dependency_path)
       AND d.deleted IS NULL
)
SELECT * FROM dependents
 WHERE dependency_distance <= 5;
ROLLBACK;
CTE Scan on dependents  (cost=37329.42..37336.10 rows=99 width=148) (actual time=0.011..0.012 rows=0 loops=1)
  Filter: (dependency_distance <= 5)
  Buffers: shared hit=3
  CTE dependents
    ->  Recursive Union  (cost=0.42..37329.42 rows=297 width=112) (actual time=0.011..0.011 rows=0 loops=1)
          Buffers: shared hit=3
          ->  Index Scan using dep_blocker_live_idx on dependencies d  (cost=0.42..22.56 rows=17 width=112) (actual time=0.010..0.010 rows=0 loops=1)
                Index Cond: ((blocker_type = 'intel_requests'::text) AND (blocker_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'::uuid))
                Buffers: shared hit=3
          ->  Nested Loop  (cost=0.42..3730.09 rows=28 width=112) (actual time=0.001..0.001 rows=0 loops=1)
                ->  WorkTable Scan on dependents rd  (cost=0.00..3.40 rows=170 width=84) (actual time=0.000..0.000 rows=0 loops=1)
                ->  Index Scan using dep_blocker_live_idx on dependencies d_1  (cost=0.42..21.78 rows=14 width=76) (never executed)
                      Index Cond: ((blocker_type = rd.dependent_type) AND (blocker_id = rd.dependent_id))
"                      Filter: (ROW(dependent_type, dependent_id, blocker_type, blocker_id) <> ALL (rd.dependency_path))"
Planning:
  Buffers: shared hit=2
Planning Time: 0.508 ms
Execution Time: 0.047 ms

Query 2: Direct Dependencies Lookup

Source: gisual-data-mixins/models.py:157-171 Called by: incorporate_dependency_data() before the recursive CTE Database: app (schema: dependencies)

-- Query 2: Direct dependencies for an intel_request (both directions)
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
SELECT dependent_type,
       dependent_id,
       blocker_type,
       blocker_id,
       created,
       deleted
  FROM dependencies.dependencies
 WHERE (blocker_type = 'intel_requests'
        AND blocker_id = ANY(ARRAY['c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436']::uuid[])
        AND deleted IS NULL)
    OR (dependent_type = 'intel_requests'
        AND dependent_id = ANY(ARRAY['c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436']::uuid[])
        AND deleted IS NULL)
 ORDER BY created ASC;
ROLLBACK;
Sort  (cost=39.49..39.54 rows=18 width=76) (actual time=0.040..0.041 rows=1 loops=1)
  Sort Key: created
  Sort Method: quicksort  Memory: 25kB
  Buffers: shared hit=10
  ->  Bitmap Heap Scan on dependencies  (cost=5.03..39.12 rows=18 width=76) (actual time=0.022..0.023 rows=1 loops=1)
        Recheck Cond: (((blocker_type = 'intel_requests'::text) AND (blocker_id = ANY ('{c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436}'::uuid[])) AND (deleted IS NULL)) OR (dependent_id = ANY ('{c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436}'::uuid[])))
        Filter: ((deleted IS NULL) AND (((blocker_type = 'intel_requests'::text) AND (blocker_id = ANY ('{c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436}'::uuid[]))) OR ((dependent_type = 'intel_requests'::text) AND (dependent_id = ANY ('{c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436}'::uuid[])))))
        Heap Blocks: exact=1
        Buffers: shared hit=7
        ->  BitmapOr  (cost=5.03..5.03 rows=17 width=0) (actual time=0.015..0.015 rows=0 loops=1)
              Buffers: shared hit=6
              ->  Bitmap Index Scan on dep_blocker_live_idx  (cost=0.00..2.58 rows=16 width=0) (actual time=0.007..0.007 rows=0 loops=1)
                    Index Cond: ((blocker_type = 'intel_requests'::text) AND (blocker_id = ANY ('{c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436}'::uuid[])))
                    Buffers: shared hit=3
              ->  Bitmap Index Scan on dependencies3_dependent_id_idx  (cost=0.00..2.44 rows=1 width=0) (actual time=0.008..0.008 rows=1 loops=1)
                    Index Cond: (dependent_id = ANY ('{c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436}'::uuid[]))
                    Buffers: shared hit=3
Planning:
  Buffers: shared hit=6
Planning Time: 0.262 ms
Execution Time: 0.066 ms

Query 3: Recursive Dependencies for Incident

Source: Same CTE as Query 1, but rooted on an incident Called by: GET /incidents/{id}?get_recursive_dependencies=True&recursive_dependency_type=dependents&recursive_record_types=intel_requests Database: app (schema: dependencies) Observed latency: 1.7-4.9s API time

-- Query 3a: Recursive dependents for incident (SLOWEST: 4.874s)
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
WITH RECURSIVE dependents (
    dependent_type, dependent_id, blocker_type, blocker_id,
    created, deleted, dependency_distance, dependency_path
) AS (
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           1 AS dependency_distance,
           ARRAY[ROW(d.dependent_type, d.dependent_id,
                     d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
     WHERE d.blocker_id = 'b2ee02bd-798a-4961-8505-2df53db42634'
       AND d.blocker_type = 'incidents'
       AND d.deleted IS NULL
    UNION ALL
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           rd.dependency_distance + 1 AS dependency_distance,
           rd.dependency_path || ARRAY[ROW(d.dependent_type, d.dependent_id,
                                           d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
      JOIN dependents AS rd
           ON d.blocker_id = rd.dependent_id
          AND d.blocker_type = rd.dependent_type
     WHERE NOT ROW(d.dependent_type, d.dependent_id,
                   d.blocker_type, d.blocker_id) = ANY(rd.dependency_path)
       AND d.deleted IS NULL
)
SELECT * FROM dependents;
ROLLBACK;
CTE Scan on dependents  (cost=37311.31..37317.25 rows=297 width=148) (actual time=0.017..0.028 rows=1 loops=1)
  Buffers: shared hit=7
  CTE dependents
    ->  Recursive Union  (cost=0.42..37311.31 rows=297 width=112) (actual time=0.016..0.026 rows=1 loops=1)
          Buffers: shared hit=7
          ->  Index Scan using dep_blocker_live_idx on dependencies d  (cost=0.42..4.44 rows=17 width=112) (actual time=0.015..0.016 rows=1 loops=1)
                Index Cond: ((blocker_type = 'incidents'::text) AND (blocker_id = 'b2ee02bd-798a-4961-8505-2df53db42634'::uuid))
                Buffers: shared hit=4
          ->  Nested Loop  (cost=0.42..3730.09 rows=28 width=112) (actual time=0.008..0.009 rows=0 loops=1)
                Buffers: shared hit=3
                ->  WorkTable Scan on dependents rd  (cost=0.00..3.40 rows=170 width=84) (actual time=0.000..0.001 rows=1 loops=1)
                ->  Index Scan using dep_blocker_live_idx on dependencies d_1  (cost=0.42..21.78 rows=14 width=76) (actual time=0.006..0.006 rows=0 loops=1)
                      Index Cond: ((blocker_type = rd.dependent_type) AND (blocker_id = rd.dependent_id))
"                      Filter: (ROW(dependent_type, dependent_id, blocker_type, blocker_id) <> ALL (rd.dependency_path))"
                      Buffers: shared hit=3
Planning:
  Buffers: shared hit=2
Planning Time: 0.518 ms
Execution Time: 0.065 ms
-- Query 3b: Charter incident (1.981s)
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
WITH RECURSIVE dependents (
    dependent_type, dependent_id, blocker_type, blocker_id,
    created, deleted, dependency_distance, dependency_path
) AS (
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           1 AS dependency_distance,
           ARRAY[ROW(d.dependent_type, d.dependent_id,
                     d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
     WHERE d.blocker_id = 'd280fc24-eab8-4cdf-9a2c-0be04c3b9bf9'
       AND d.blocker_type = 'incidents'
       AND d.deleted IS NULL
    UNION ALL
    SELECT d.dependent_type, d.dependent_id, d.blocker_type, d.blocker_id,
           d.created, d.deleted,
           rd.dependency_distance + 1 AS dependency_distance,
           rd.dependency_path || ARRAY[ROW(d.dependent_type, d.dependent_id,
                                           d.blocker_type, d.blocker_id)] AS dependency_path
      FROM dependencies.dependencies AS d
      JOIN dependents AS rd
           ON d.blocker_id = rd.dependent_id
          AND d.blocker_type = rd.dependent_type
     WHERE NOT ROW(d.dependent_type, d.dependent_id,
                   d.blocker_type, d.blocker_id) = ANY(rd.dependency_path)
       AND d.deleted IS NULL
)
SELECT * FROM dependents;
ROLLBACK;
CTE Scan on dependents  (cost=37311.31..37317.25 rows=297 width=148) (actual time=0.017..0.125 rows=13 loops=1)
  Buffers: shared hit=48
  CTE dependents
    ->  Recursive Union  (cost=0.42..37311.31 rows=297 width=112) (actual time=0.016..0.116 rows=13 loops=1)
          Buffers: shared hit=48
          ->  Index Scan using dep_blocker_live_idx on dependencies d  (cost=0.42..4.44 rows=17 width=112) (actual time=0.015..0.016 rows=1 loops=1)
                Index Cond: ((blocker_type = 'incidents'::text) AND (blocker_id = 'd280fc24-eab8-4cdf-9a2c-0be04c3b9bf9'::uuid))
                Buffers: shared hit=4
          ->  Nested Loop  (cost=0.42..3730.09 rows=28 width=112) (actual time=0.038..0.047 rows=6 loops=2)
                Buffers: shared hit=44
                ->  WorkTable Scan on dependents rd  (cost=0.00..3.40 rows=170 width=84) (actual time=0.000..0.001 rows=6 loops=2)
                ->  Index Scan using dep_blocker_live_idx on dependencies d_1  (cost=0.42..21.78 rows=14 width=76) (actual time=0.005..0.006 rows=1 loops=13)
                      Index Cond: ((blocker_type = rd.dependent_type) AND (blocker_id = rd.dependent_id))
"                      Filter: (ROW(dependent_type, dependent_id, blocker_type, blocker_id) <> ALL (rd.dependency_path))"
                      Buffers: shared hit=44
Planning:
  Buffers: shared hit=2
Planning Time: 0.504 ms
Execution Time: 0.161 ms

Query 4: Intel Request Record Fetch (base record, should be fast)

Source: gisual-data-mixins/models.py via get_with_location Called by: GET /intel-requests/{id} (before dependency incorporation) Database: app (schema: intel_requests)

-- Query 4: Base intel_request record with location
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
SELECT ir.*,
       public.ST_AsGeoJSON(l.location) AS geojson,
       l.thoroughfare, l.dependent_thoroughfare, l.premise, l.sub_premise,
       l.dependent_locality, l.locality, l.administrative_area,
       l.administrative_area_2, l.postal_code, l.country
  FROM intel_requests.intel_requests AS ir
  JOIN locations.locations AS l
    ON l.location_id = ir.location_id
   AND l.organization_id = ir.organization_id
   AND l.deleted IS NULL
 WHERE ir.intel_request_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'
   AND ir.deleted IS NULL;
ROLLBACK;
Nested Loop  (cost=0.97..10.27 rows=1 width=923) (actual time=0.044..0.046 rows=1 loops=1)
  Buffers: shared hit=9
  ->  Index Scan using intel_requests_request_id_active_idx on intel_requests ir  (cost=0.41..4.43 rows=1 width=843) (actual time=0.020..0.021 rows=1 loops=1)
        Index Cond: (intel_request_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'::uuid)
        Buffers: shared hit=4
  ->  Index Scan using locations_location_id_archived_key on locations l  (cost=0.56..4.58 rows=1 width=140) (actual time=0.015..0.015 rows=1 loops=1)
        Index Cond: (location_id = ir.location_id)
        Filter: ((deleted IS NULL) AND (ir.organization_id = organization_id))
        Buffers: shared hit=5
Planning:
  Buffers: shared hit=39
Planning Time: 0.379 ms
Execution Time: 0.075 ms

Query 5: Incident Record Fetch (base record)

Source: gisual-data-mixins/models.py via get_with_location Called by: GET /incidents/{id} (before dependency incorporation) Database: app (schema: incidents)

-- Query 5: Base incident record with location
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
SELECT i.*,
       p.incident_priority_name,
       s.incident_status_name,
       t.incident_type_name,
       public.ST_AsGeoJSON(l.location) AS geojson,
       l.thoroughfare, l.dependent_thoroughfare, l.premise, l.sub_premise,
       l.dependent_locality, l.locality, l.administrative_area,
       l.administrative_area_2, l.postal_code, l.country
  FROM incidents.incidents AS i
  JOIN incidents.incident_statuses AS s ON i.incident_status_id = s.incident_status_id AND s.deleted IS NULL
  JOIN incidents.incident_priorities AS p ON i.incident_priority_id = p.incident_priority_id AND p.deleted IS NULL
  JOIN incidents.incident_types AS t ON i.incident_type_id = t.incident_type_id AND t.deleted IS NULL
  JOIN locations.locations AS l ON i.location_id = l.location_id AND i.organization_id = l.organization_id
 WHERE i.incident_id = 'd280fc24-eab8-4cdf-9a2c-0be04c3b9bf9'
   AND i.deleted IS NULL;
ROLLBACK;
Nested Loop  (cost=5.00..13.97 rows=1 width=1048) (actual time=0.067..0.078 rows=1 loops=1)
  Buffers: shared hit=12
  ->  Nested Loop  (cost=4.45..8.14 rows=1 width=968) (actual time=0.050..0.060 rows=1 loops=1)
        Join Filter: (i.incident_priority_id = p.incident_priority_id)
        Rows Removed by Join Filter: 1
        Buffers: shared hit=7
        ->  Nested Loop  (cost=4.45..7.05 rows=1 width=962) (actual time=0.045..0.055 rows=1 loops=1)
              Join Filter: (i.incident_status_id = s.incident_status_id)
              Rows Removed by Join Filter: 2
              Buffers: shared hit=6
              ->  Hash Join  (cost=4.45..5.98 rows=1 width=951) (actual time=0.041..0.051 rows=1 loops=1)
                    Hash Cond: (t.incident_type_id = i.incident_type_id)
                    Buffers: shared hit=5
                    ->  Seq Scan on incident_types t  (cost=0.00..1.41 rows=41 width=30) (actual time=0.004..0.009 rows=41 loops=1)
                          Filter: (deleted IS NULL)
                          Buffers: shared hit=1
                    ->  Hash  (cost=4.44..4.44 rows=1 width=937) (actual time=0.017..0.017 rows=1 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 9kB
                          Buffers: shared hit=4
                          ->  Index Scan using incidents_incident_id_archived_key on incidents i  (cost=0.42..4.44 rows=1 width=937) (actual time=0.012..0.013 rows=1 loops=1)
                                Index Cond: (incident_id = 'd280fc24-eab8-4cdf-9a2c-0be04c3b9bf9'::uuid)
                                Filter: (deleted IS NULL)
                                Buffers: shared hit=4
              ->  Seq Scan on incident_statuses s  (cost=0.00..1.03 rows=3 width=27) (actual time=0.002..0.003 rows=3 loops=1)
                    Filter: (deleted IS NULL)
                    Buffers: shared hit=1
        ->  Seq Scan on incident_priorities p  (cost=0.00..1.04 rows=4 width=22) (actual time=0.004..0.004 rows=2 loops=1)
              Filter: (deleted IS NULL)
              Buffers: shared hit=1
  ->  Index Scan using locations_location_id_archived_key on locations l  (cost=0.56..4.58 rows=1 width=140) (actual time=0.011..0.011 rows=1 loops=1)
        Index Cond: (location_id = i.location_id)
        Filter: (i.organization_id = organization_id)
        Buffers: shared hit=5
Planning:
  Buffers: shared hit=587
Planning Time: 1.901 ms
Execution Time: 0.223 ms

Query 6: Assets Bulk Fetch

Source: assets_api/assets.py:504 via get_with_location Called by: POST /assets/bulk with list of asset_ids Database: app (schema: assets) Note: We don't have asset_ids from the logs directly. Use this query with IDs from Query 2's results (dependent assets).

-- Query 6: First, find asset IDs that are dependents of our intel_request
-- Then use those for the bulk assets query
BEGIN;

-- Step 1: Get dependent asset IDs
SELECT dependent_id
  FROM dependencies.dependencies
 WHERE blocker_type = 'intel_requests'
   AND blocker_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'
   AND dependent_type = 'assets'
   AND deleted IS NULL;

-- Step 2: Use those IDs in the bulk assets query (replace with actual IDs from step 1)
-- EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
-- SELECT a.asset_id, a.asset_name, a.asset_description,
--        t.asset_type_name,
--        a.power_utility_organization_id, a.is_critical_node,
--        a.actively_monitor, a.metadata, a.organization_id,
--        a.created, a.updated,
--        public.ST_AsGeoJSON(l.location) AS geojson,
--        l.thoroughfare, l.dependent_thoroughfare, l.premise, l.sub_premise,
--        l.dependent_locality, l.locality, l.administrative_area,
--        l.administrative_area_2, l.postal_code, l.country
--   FROM assets.assets AS a
--   JOIN assets.asset_types AS t ON t.asset_type_id = a.asset_type_id AND t.deleted IS NULL
--   JOIN locations.locations AS l ON l.location_id = a.location_id AND l.organization_id = a.organization_id AND l.deleted IS NULL
--  WHERE a.asset_id = ANY(ARRAY['<id1>', '<id2>']::uuid[])
--    AND a.deleted IS NULL;

ROLLBACK;

-- This didn't return anything

Query 7: Current Incidents (the optimized LATERAL query from feat/optimize-query)

Source: incidents_api/incidents.py:1061 (feat/optimize-query branch) Called by: GET /incidents/current?organization_id=X Database: app (schemas: incidents, dependencies, assets, locations, internet_outages)

-- Query 7: Current incidents for Charter (the LATERAL-optimized version)
-- Compare with Query 7b (the old flat JOIN version) to validate the 6.5x improvement
BEGIN;
EXPLAIN (ANALYZE, BUFFERS, TIMING, FORMAT TEXT)
SELECT i.incident_id, i.incident_name, i.incident_summary, i.incident_description,
       p.incident_priority_name, s.incident_status_name, t.incident_type_name,
       i.incident_etr, i.incident_origin, i.organization_id,
       i.check_for_updates, i.resolution_hold_until,
       i.metadata, i.impact_polygon, i.created, i.updated, i.deleted, i.last_processed,
       JSON_BUILD_OBJECT(
           'administrative_area', l.administrative_area,
           'geojson', public.ST_AsGeoJSON(l.location),
           'locality', l.locality,
           'postal_code', l.postal_code,
           'thoroughfare', l.thoroughfare
       ) AS location,
       COALESCE(intels_agg.data, '[]'::json) AS intel_requests,
       COALESCE(parent_intels_agg.data, '[]'::json) AS parent_intel_requests
  FROM incidents.incidents AS i
  JOIN incidents.incident_statuses AS s ON i.incident_status_id = s.incident_status_id AND s.deleted IS NULL
  JOIN incidents.incident_priorities AS p ON i.incident_priority_id = p.incident_priority_id AND p.deleted IS NULL
  JOIN incidents.incident_types AS t ON i.incident_type_id = t.incident_type_id AND t.deleted IS NULL
  JOIN locations.locations AS l ON i.location_id = l.location_id AND i.organization_id = l.organization_id
  LEFT JOIN LATERAL (
      SELECT JSON_AGG(JSON_BUILD_OBJECT(
               'created', dir.created, 'deleted', dir.deleted,
               'dependency_distance', 1,
               'record_id', dir.dependent_id, 'record_type', dir.dependent_type
             )) AS data
        FROM dependencies.dependencies AS dir
       WHERE dir.blocker_type = 'incidents'
         AND dir.blocker_id = i.incident_id
         AND dir.dependent_type = 'intel_requests'
         AND dir.deleted IS NULL
  ) intels_agg ON true
  LEFT JOIN LATERAL (
      SELECT JSON_AGG(JSON_BUILD_OBJECT(
               'created', pdir.created, 'deleted', pdir.deleted,
               'dependency_distance', 2,
               'record_id', pdir.dependent_id, 'record_type', pdir.dependent_type
             )) AS data
        FROM dependencies.dependencies AS pdir
        JOIN dependencies.dependencies AS dir
          ON dir.blocker_type = 'incidents'
         AND dir.blocker_id = i.incident_id
         AND dir.dependent_type = 'intel_requests'
         AND dir.deleted IS NULL
       WHERE pdir.blocker_type = 'intel_requests'
         AND pdir.blocker_id = dir.dependent_id
         AND pdir.dependent_type = 'intel_requests'
         AND pdir.deleted IS NULL
  ) parent_intels_agg ON true
 WHERE i.deleted IS NULL
   AND (s.incident_status_name <> 'resolved'
        OR i.updated >= CURRENT_TIMESTAMP - INTERVAL '2 days')
   AND i.organization_id = '9de5d801-235a-4451-8c89-d2c3974c71e8'
 ORDER BY i.updated DESC;
ROLLBACK;
Nested Loop Left Join  (cost=14472.38..844032.83 rows=767 width=960) (actual time=35.850..644.805 rows=3591 loops=1)
  Buffers: shared hit=99691 dirtied=6
  ->  Nested Loop Left Join  (cost=13396.60..17930.15 rows=767 width=980) (actual time=35.803..70.901 rows=3591 loops=1)
        Buffers: shared hit=35412 dirtied=6
        ->  Gather Merge  (cost=13390.83..13480.16 rows=767 width=948) (actual time=35.761..37.467 rows=3591 loops=1)
              Workers Planned: 2
              Workers Launched: 0
              Buffers: shared hit=21001 dirtied=6
              ->  Sort  (cost=12390.81..12391.61 rows=320 width=948) (actual time=34.585..35.633 rows=3591 loops=1)
                    Sort Key: i.updated DESC
                    Sort Method: quicksort  Memory: 4390kB
                    Buffers: shared hit=21001 dirtied=6
                    ->  Hash Join  (cost=163.82..12377.49 rows=320 width=948) (actual time=1.862..28.133 rows=3591 loops=1)
                          Hash Cond: (i.incident_type_id = t.incident_type_id)
                          Buffers: shared hit=21001 dirtied=6
                          ->  Hash Join  (cost=161.90..12374.64 rows=320 width=950) (actual time=1.812..26.846 rows=3591 loops=1)
                                Hash Cond: (i.incident_priority_id = p.incident_priority_id)
                                Buffers: shared hit=21000 dirtied=6
                                ->  Nested Loop  (cost=160.81..12371.82 rows=320 width=960) (actual time=1.773..25.532 rows=3591 loops=1)
                                      Buffers: shared hit=20999 dirtied=6
                                      ->  Hash Join  (cost=160.26..6927.37 rows=1218 width=892) (actual time=1.751..8.184 rows=3591 loops=1)
                                            Hash Cond: (i.incident_status_id = s.incident_status_id)
                                            Join Filter: ((s.incident_status_name <> 'resolved'::text) OR (i.updated >= (CURRENT_TIMESTAMP - '2 days'::interval)))
                                            Rows Removed by Join Filter: 115
                                            Buffers: shared hit=3044
                                            ->  Parallel Bitmap Heap Scan on incidents i  (cost=159.19..6913.13 rows=1754 width=897) (actual time=1.717..6.233 rows=3706 loops=1)
                                                  Recheck Cond: (organization_id = '9de5d801-235a-4451-8c89-d2c3974c71e8'::uuid)
                                                  Filter: (deleted IS NULL)
                                                  Heap Blocks: exact=2891
                                                  Buffers: shared hit=3043
                                                  ->  Bitmap Index Scan on incidents_created_organization_id_incident_id_idx  (cost=0.00..158.14 rows=4229 width=0) (actual time=1.228..1.228 rows=3708 loops=1)
                                                        Index Cond: (organization_id = '9de5d801-235a-4451-8c89-d2c3974c71e8'::uuid)
                                                        Buffers: shared hit=152
                                            ->  Hash  (cost=1.03..1.03 rows=3 width=27) (actual time=0.011..0.012 rows=3 loops=1)
                                                  Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                  Buffers: shared hit=1
                                                  ->  Seq Scan on incident_statuses s  (cost=0.00..1.03 rows=3 width=27) (actual time=0.002..0.003 rows=3 loops=1)
                                                        Filter: (deleted IS NULL)
                                                        Buffers: shared hit=1
                                      ->  Index Scan using locations_pkey on locations l  (cost=0.56..4.47 rows=1 width=116) (actual time=0.004..0.004 rows=1 loops=3591)
                                            Index Cond: (location_id = i.location_id)
                                            Filter: (organization_id = '9de5d801-235a-4451-8c89-d2c3974c71e8'::uuid)
                                            Buffers: shared hit=17955 dirtied=6
                                ->  Hash  (cost=1.04..1.04 rows=4 width=22) (actual time=0.009..0.010 rows=4 loops=1)
                                      Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                      Buffers: shared hit=1
                                      ->  Seq Scan on incident_priorities p  (cost=0.00..1.04 rows=4 width=22) (actual time=0.004..0.005 rows=4 loops=1)
                                            Filter: (deleted IS NULL)
                                            Buffers: shared hit=1
                          ->  Hash  (cost=1.41..1.41 rows=41 width=30) (actual time=0.026..0.027 rows=41 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 11kB
                                Buffers: shared hit=1
                                ->  Seq Scan on incident_types t  (cost=0.00..1.41 rows=41 width=30) (actual time=0.007..0.013 rows=41 loops=1)
                                      Filter: (deleted IS NULL)
                                      Buffers: shared hit=1
        ->  Aggregate  (cost=5.77..5.78 rows=1 width=32) (actual time=0.008..0.008 rows=1 loops=3591)
              Buffers: shared hit=14411
              ->  Index Scan using dep_blocker_live_idx on dependencies dir  (cost=0.42..5.63 rows=28 width=46) (actual time=0.005..0.005 rows=1 loops=3591)
                    Index Cond: ((blocker_type = 'incidents'::text) AND (blocker_id = i.incident_id))
                    Filter: (dependent_type = 'intel_requests'::text)
                    Buffers: shared hit=14411
  ->  Aggregate  (cost=1075.77..1075.78 rows=1 width=32) (actual time=0.156..0.156 rows=1 loops=3591)
        Buffers: shared hit=64279
        ->  Nested Loop  (cost=0.85..1075.65 rows=24 width=46) (actual time=0.009..0.032 rows=42 loops=3591)
              Buffers: shared hit=64279
              ->  Index Scan using dep_blocker_live_idx on dependencies dir_1  (cost=0.42..5.63 rows=28 width=16) (actual time=0.003..0.003 rows=1 loops=3591)
                    Index Cond: ((blocker_type = 'incidents'::text) AND (blocker_id = i.incident_id))
                    Filter: (dependent_type = 'intel_requests'::text)
                    Buffers: shared hit=14411
              ->  Index Scan using dep_blocker_live_idx on dependencies pdir  (cost=0.42..37.96 rows=26 width=62) (actual time=0.006..0.022 rows=41 loops=3670)
                    Index Cond: ((blocker_type = 'intel_requests'::text) AND (blocker_id = dir_1.dependent_id))
                    Filter: (dependent_type = 'intel_requests'::text)
                    Buffers: shared hit=49868
Planning:
  Buffers: shared hit=70
Planning Time: 8.052 ms
Execution Time: 645.797 ms

Query 8: Dependency Table Statistics

Run these to understand the data distribution:

-- How many total dependencies?
SELECT count(*) FROM dependencies.dependencies WHERE deleted IS NULL;

-- Distribution by type pair
SELECT blocker_type, dependent_type, count(*)
  FROM dependencies.dependencies
 WHERE deleted IS NULL
 GROUP BY blocker_type, dependent_type
 ORDER BY count(*) DESC;

-- How many dependencies does Charter's slowest intel_request have?
SELECT count(*)
  FROM dependencies.dependencies
 WHERE blocker_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'
   AND blocker_type = 'intel_requests'
   AND deleted IS NULL;

-- Max recursion depth from that intel_request
WITH RECURSIVE dependents AS (
    SELECT d.dependent_id, d.dependent_type, 1 AS depth
      FROM dependencies.dependencies d
     WHERE d.blocker_id = 'c4fd4a5d-e5ec-4c02-908f-0a88c4b4f436'
       AND d.blocker_type = 'intel_requests'
       AND d.deleted IS NULL
    UNION ALL
    SELECT d.dependent_id, d.dependent_type, r.depth + 1
      FROM dependencies.dependencies d
      JOIN dependents r ON d.blocker_id = r.dependent_id AND d.blocker_type = r.dependent_type
     WHERE d.deleted IS NULL AND r.depth < 20
)
SELECT max(depth), count(*) FROM dependents;

-- Table and index sizes
SELECT pg_size_pretty(pg_total_relation_size('dependencies.dependencies')) AS total_size,
       pg_size_pretty(pg_relation_size('dependencies.dependencies')) AS table_size,
       pg_size_pretty(pg_indexes_size('dependencies.dependencies')) AS indexes_size;

-- Index usage stats
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
  FROM pg_stat_user_indexes
 WHERE schemaname = 'dependencies'
 ORDER BY idx_scan DESC;

Execution Order

Run in this order for maximum insight:

  1. Query 8 (statistics) — understand the data shape first
  2. Query 1a (recursive CTE, slowest ID) — this is the #1 bottleneck
  3. Query 1c (same with max_distance=5) — compare to see depth-limit impact
  4. Query 2 (direct deps) — should be fast, confirms index usage
  5. Query 3a (incident recursive) — compare incident vs intel_request recursion depth
  6. Query 4 (base record) — confirm PK lookup is instant
  7. Query 7 (current incidents LATERAL) — validate the optimization branch

Anomaly: get_intel_requests (bulk, 0 IDs) taking 1-5 seconds

Multiple log entries show:

Timing: get_intel_requests (bulk, 0 IDs) took 1.227s-5.090s

With 0 IDs, the bulk endpoint should return instantly. This is likely:

  • HTTP connection overhead under load (connection pool exhaustion / queuing)
  • Or the timing includes the incorporate_dependency_data() call on the result of the PREVIOUS get_intel_request() call (the structured logging may be attributing parent timing context)

To verify, check if the intel-requests-api itself logs slow queries for empty bulk calls.


AAA Organization Calls Summary

Call Endpoint Duration Frequency
get_organization GET /organizations/9de5d801-235a-4451-8c89-d2c3974c71e8 49-55ms 1x per message
  • Consistently fast (~50ms) — not a bottleneck
  • Called in app.py:375 AFTER generate_context_data() returns
  • Could be parallelized with generate_context_data() using asyncio.gather() for ~50ms savings
  • No direct SQL — this is an HTTP call to the AAA service which queries its own database
  • Charter org metadata includes: integrations config, notification subscriptions, use_current_incidents_cache flag
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment