8 Rules to Improve Query Performance in PostgreSQL

1. Indexing

  • Create indexes on frequently filtered columns: Ensure you have indexes on columns used in WHERE, JOIN, and ORDER BY clauses (e.g., created_at, deleted_at).
  • Use partial indexes: If you’re often filtering by a fixed condition (e.g., deleted_at IS NULL), create a partial index that covers only the matching rows (see the sketch after this list).
  • Use CONCURRENTLY when creating indexes in production: This avoids the write-blocking lock that a plain CREATE INDEX holds while the index is built.
  • Remove duplicate indexes: Multiple identical or overlapping indexes slow down writes without providing extra benefits.
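
A minimal sketch of both ideas, assuming an illustrative orders table with created_at and deleted_at columns:

```sql
-- Partial index covering only live rows; CONCURRENTLY builds it without
-- blocking writes (note: it cannot run inside a transaction block).
-- Table and column names are illustrative.
CREATE INDEX CONCURRENTLY idx_orders_live_created_at
    ON orders (created_at)
    WHERE deleted_at IS NULL;
```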

2. Parallelism and PostgreSQL Settings

  • Enable parallel execution: Make sure PostgreSQL is configured to utilize parallelism for large queries by adjusting parameters like max_parallel_workers_per_gather and max_parallel_workers.
  • Use enable_parallel_append for UNION queries: When a query appends results from multiple tables (UNION ALL, partitioned tables), parallel append lets workers process the branches concurrently (sample settings follow this list).
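
A sketch of the relevant settings, shown at session level for experimentation; the values are assumptions to be chosen from your core count (and set in postgresql.conf for production):

```sql
SET max_parallel_workers_per_gather = 4;  -- workers per Gather node
SET max_parallel_workers = 8;             -- cluster-wide cap on parallel workers
SET enable_parallel_append = on;          -- on by default since PostgreSQL 11
```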

3. Optimizing Query Structure

  • Avoid UNION unless necessary: If deduplication isn’t required, use UNION ALL instead of UNION, which adds overhead by sorting or hashing the combined result to remove duplicates (see the example after this list).
  • Use CASE expressions sparingly: Simplify the logic inside CASE expressions where possible, especially when they appear in JOIN conditions or run over large result sets.
  • Postpone complex formatting: Apply formatting functions such as pg_size_pretty after fetching the raw data so they don’t slow down the main query.
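
A sketch of the UNION ALL substitution, with illustrative table names:

```sql
-- UNION sorts or hashes the combined result to remove duplicates;
-- UNION ALL simply concatenates the branches.
SELECT id, created_at FROM orders_2023
UNION ALL
SELECT id, created_at FROM orders_2024;
```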

4. Reduce Data Scanned

  • Limit data with targeted WHERE clauses: Filter data earlier in the query rather than scanning entire tables.
  • Optimize date filters: When filtering by date (e.g., created_at), write the condition as a plain range on the column so an index can be used; wrapping the column in a function defeats the index (see the example after this list).
  • Reduce JSON filtering: JSON or text-based filters (e.g., LIKE '%value%') are costly because they usually force sequential scans. Try to move this filtering into application logic or preprocess the data.
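
A sketch of a sargable date filter, reusing the illustrative orders table:

```sql
-- A plain range on created_at can use an index on that column;
-- a filter like date(created_at) = '2024-01-15' would force a scan.
SELECT count(*)
FROM orders
WHERE created_at >= DATE '2024-01-01'
  AND created_at <  DATE '2024-02-01'
  AND deleted_at IS NULL;
```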

5. Pre-aggregation and Caching

  • Use materialized views for heavy aggregation: If the result set is relatively static, a materialized view can cache the results and be refreshed periodically (see the sketch after this list).
  • Cache results for frequently executed heavy queries: For queries that aggregate over large datasets (e.g., counting rows in huge tables), store the results in a summary table and refresh it on a schedule.
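
A minimal sketch of a materialized view over a heavy aggregation, with illustrative names:

```sql
-- Caches the aggregation; the unique index lets REFRESH ... CONCURRENTLY
-- rebuild it without blocking readers.
CREATE MATERIALIZED VIEW daily_order_counts AS
SELECT created_at::date AS day, count(*) AS orders
FROM orders
GROUP BY 1;

CREATE UNIQUE INDEX ON daily_order_counts (day);

-- Run on a schedule (cron, pg_cron, etc.):
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_order_counts;
```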

6. Analyze and Tune Slow Queries

  • Use EXPLAIN (ANALYZE) to understand query execution: This will help you identify which part of the query is slow, whether it’s related to disk I/O, sequential scans, or sorting.
  • Look at buffer usage: Adding BUFFERS to EXPLAIN shows how much of the query reads from disk versus shared memory (see the example after this list).
  • Batch processing or chunking: For large tables, consider breaking the query into smaller batches or using parallel processing if the data can be divided.
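
A minimal example combining both options:

```sql
-- ANALYZE executes the query and reports actual timings and row counts;
-- BUFFERS breaks I/O into shared-buffer hits (memory) versus reads (disk).
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM orders
WHERE created_at >= DATE '2024-01-01';
```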

7. Optimize Joins

  • Ensure appropriate indexes for JOINs: Index the columns used in JOIN conditions to prevent full table scans; PostgreSQL indexes primary keys automatically, but not the foreign-key columns that reference them (see the sketch after this list).
  • Simplify JOINs when possible: Reduce unnecessary complexity by selecting only the columns you need and applying filters as early as possible.
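
A sketch, assuming an illustrative order_items table whose order_id references orders:

```sql
-- The primary key of orders is indexed automatically; the referencing
-- column order_items.order_id is not, so index it for the join.
CREATE INDEX CONCURRENTLY idx_order_items_order_id
    ON order_items (order_id);

SELECT o.id, oi.sku
FROM orders o
JOIN order_items oi ON oi.order_id = o.id
WHERE o.created_at >= DATE '2024-01-01';
```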

8. General PostgreSQL Tuning

  • Tune PostgreSQL configuration: Adjust settings like work_mem, maintenance_work_mem, and shared_buffers based on the size of your data and queries (illustrative values follow this list).
  • Monitor disk I/O: Large scans on unindexed tables can cause disk bottlenecks. Optimize by limiting the size of scanned data or using indexes.
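
Illustrative starting values, not recommendations; the right numbers depend on your RAM and workload:

```sql
-- ALTER SYSTEM writes to postgresql.auto.conf; work_mem and
-- maintenance_work_mem take effect after a reload, shared_buffers
-- requires a server restart.
ALTER SYSTEM SET shared_buffers = '4GB';          -- often ~25% of RAM
ALTER SYSTEM SET work_mem = '64MB';               -- per sort/hash operation
ALTER SYSTEM SET maintenance_work_mem = '512MB';  -- VACUUM, CREATE INDEX
SELECT pg_reload_conf();
```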