Picture this: a query runs great for the first few weeks because the table has 10,000 rows. Then the end of the month arrives, data grows, and execution time jumps from 200 ms to 8 seconds. Users call in, your manager asks questions, you dig through logs.
This post is a hands-on guide to diagnosing and fixing slow queries in PostgreSQL — no theory for its own sake. You’ll see the tools, concrete anti-patterns, and one large real-world refactoring example with a measurable result.
EXPLAIN ANALYZE — Start Here, Every Time#
Before changing anything, you need to know why the query is slow. That’s what EXPLAIN ANALYZE is for.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM applications WHERE status = 'approved';Sample output:
Seq Scan on applications (cost=0.00..46323.45 rows=97234 width=64)
(actual time=0.23..245.12 rows=97234 loops=1)
Filter: (status = 'approved')
Rows Removed by Filter: 1902766
Execution Time: 248.56 msKey fields to read:
| Field | Meaning |
|---|---|
cost=0.00..46323.45 | Planner’s estimated cost (abstract units, not ms) |
actual time=0.23..245.12 | Real time: first row → last row in ms |
rows=97234 | Rows actually returned by this node |
Rows Removed by Filter: 1902766 | Red flag — 95% of scanned data discarded |
Seq Scan | Full table scan — often a signal of a missing or unused index |
Key rule: compare rows in the plan with actual rows. A big gap means stale statistics — fix with ANALYZE applications; or wait for autovacuum.
Anti-Patterns That Kill Performance#
1. Correlated Subqueries#
The most frequent and most expensive mistake — especially for developers used to thinking in ORM terms.
-- BAD: for every user row, PostgreSQL runs 2 separate queries
SELECT
u.full_name,
(SELECT COUNT(*) FROM applications a WHERE a.user_id = u.id) AS app_count,
(SELECT SUM(p.amount) FROM payments p
JOIN applications a ON p.application_id = a.id
WHERE a.user_id = u.id AND p.status = 'paid') AS total_paid
FROM users u
WHERE u.region = 'mazowieckie';If the query returns 10,000 users, PostgreSQL executes 20,000 additional queries. EXPLAIN shows loops=10000 next to each SubPlan.
-- GOOD: one scan, one pass over the data
SELECT
u.full_name,
s.app_count,
s.total_paid
FROM users u
JOIN (
SELECT
a.user_id,
COUNT(*) AS app_count,
COALESCE(SUM(p.amount) FILTER (WHERE p.status = 'paid'), 0) AS total_paid
FROM applications a
LEFT JOIN payments p ON p.application_id = a.id
GROUP BY a.user_id
) s ON u.id = s.user_id
WHERE u.region = 'mazowieckie';2. Functions on Indexed Columns#
A classic mistake that completely disables the index:
-- BAD — index on submitted_at is ignored; PostgreSQL must compute DATE() for every row
WHERE DATE(submitted_at) = '2024-01-15'
WHERE EXTRACT(YEAR FROM submitted_at) = 2024
WHERE LOWER(email) = 'jan@example.com'
-- GOOD — index is used
WHERE submitted_at >= '2024-01-15' AND submitted_at < '2024-01-16'
WHERE submitted_at >= '2024-01-01' AND submitted_at < '2025-01-01'
-- For LOWER: create a functional index once
CREATE INDEX idx_users_email_lower ON users (LOWER(email));PostgreSQL cannot use a B-tree index when the column value is transformed on the fly — the planner must evaluate the function for every row in the table.
3. SELECT * — Harmless-Looking, Actually Costly#
-- BAD
SELECT * FROM applications WHERE user_id = 42;
-- GOOD
SELECT id, status, submitted_at FROM applications WHERE user_id = 42;Three reasons to never write SELECT * in production code:
- Unnecessary I/O — you’re fetching columns you’ll never use
- Covering indexes stop working — a covering index can serve a query without accessing the heap (table).
SELECT *always requires a heap fetch - Future trap — when someone adds a large JSONB column, every
SELECT *query instantly gets slower
4. IN with Subquery vs EXISTS vs JOIN#
-- SLOW for large sets — PostgreSQL materializes the entire subquery result
WHERE user_id IN (SELECT id FROM users WHERE region = 'mazowieckie')
-- FASTER — PostgreSQL can stop scanning at the first match
WHERE EXISTS (
SELECT 1 FROM users u
WHERE u.id = a.user_id AND u.region = 'mazowieckie'
)
-- USUALLY FASTEST — the planner has the most freedom to choose a strategy
JOIN users u ON u.id = a.user_id AND u.region = 'mazowieckie'IN works well for small, static lists (e.g., status IN ('approved', 'pending')). For subqueries — prefer EXISTS or JOIN.
5. DISTINCT as a Band-Aid#
DISTINCT is almost always a signal that your JOIN is producing duplicates and the real problem is deeper.
-- BAD — DISTINCT masks the problem and is expensive on top of that
SELECT DISTINCT u.full_name
FROM users u
JOIN applications a ON a.user_id = u.id
WHERE a.status = 'approved';
-- GOOD — EXISTS checks the condition without generating duplicates
SELECT u.full_name
FROM users u
WHERE EXISTS (
SELECT 1 FROM applications a
WHERE a.user_id = u.id AND a.status = 'approved'
);Indexes — When They Help, When They Don’t#
Composite Indexes — Column Order Is Not Arbitrary#
-- This index supports:
-- WHERE status = 'approved' ✓
-- WHERE status = 'approved' AND submitted_at > '2024-01' ✓
-- WHERE submitted_at > '2024-01' ✗ (index not used!)
CREATE INDEX idx_apps_status_date ON applications (status, submitted_at);Rule: columns with equality conditions (=) come first, then range columns (>, <, BETWEEN). PostgreSQL can only jump to a specific position in an index if all left-side columns are exact matches.
Partial Indexes — Powerful and Underused#
If your queries always filter on the same constant condition, build an index only on that subset:
-- Instead of indexing all 2M rows...
CREATE INDEX idx_apps_status ON applications (status);
-- ...index only the rows you actually query:
CREATE INDEX idx_apps_approved ON applications (submitted_at, user_id)
WHERE status = 'approved';Benefits: a fraction of the full index size, faster INSERT/UPDATE, faster queries on that subset.
When an Index Is NOT Used#
- Function on the column —
WHERE DATE(submitted_at) = ...(covered above) - Small table — a full scan can be faster than an index + heap fetch
- Low selectivity — if 80% of rows have
status = 'active', an index won’t help - Stale statistics — run
ANALYZE table_name;before diagnosing - Insufficient
work_mem— sort/hash join falls to disk instead of using an index
Unused indexes can be found in pg_stat_user_indexes. Unused indexes cost every INSERT and UPDATE — check and remove them regularly.
The Big Example: Reporting Query From 8 Seconds to 120 ms#
Context#
A benefits platform. We need a monthly report: users who submitted approved applications this month, with their total payment amounts.
-- Schema (simplified)
-- users: ~500,000 rows
-- applications: ~2,000,000 rows (status: 'pending' | 'approved' | 'rejected')
-- payments: ~1,500,000 rows (status: 'pending' | 'paid' | 'failed')The Query — “Works, Somehow” Version#
SELECT DISTINCT
u.full_name,
u.region,
(SELECT COUNT(*)
FROM applications a2
WHERE a2.user_id = u.id) AS total_applications,
(SELECT SUM(p.amount)
FROM payments p
JOIN applications a3 ON p.application_id = a3.id
WHERE a3.user_id = u.id
AND p.status = 'paid') AS total_paid
FROM users u
WHERE u.id IN (
SELECT user_id
FROM applications
WHERE status = 'approved'
AND submitted_at::date >= DATE_TRUNC('month', NOW())::date
)
ORDER BY total_paid DESC NULLS LAST;Execution time in production: ~8,200 ms
EXPLAIN ANALYZE — Diagnosis#
Sort (cost=289451.23..289651.23 rows=80000 width=128)
(actual time=8092.34..8099.78 rows=18432 loops=1)
Sort Method: external merge Disk: 2840kB ← ① sort spills to disk
-> HashAggregate [DISTINCT]
-> Hash Semi Join
-> Seq Scan on users u ← ② full scan of 500k rows
(actual time=0.12..48.34 rows=500000 loops=1)
-> Hash
-> Seq Scan on applications ← ③ full scan of 2M rows
Filter: ((submitted_at)::date >= ...)
Rows Removed by Filter: 1902766 ← ④ 95% discarded
(actual time=0.23..245.12 rows=97234 loops=1)
SubPlan 1 ← ⑤ executes 18,432 times
-> Aggregate on applications a2
(actual time=0.289 rows=1 loops=18432)
SubPlan 2 ← ⑥ executes 18,432 times
-> Hash Join on payments + applications
(actual time=0.412 rows=1 loops=18432)
Planning Time: 3.456 ms
Execution Time: 8243.678 msProblem Analysis#
| # | Problem | Impact |
|---|---|---|
| ① | Sort spills to disk (external merge) | Disk I/O instead of RAM |
| ② | Seq Scan on users — 500k rows | No pre-filtering before the join |
| ③ | Seq Scan on applications — 2M rows | submitted_at::date blocks index usage |
| ④ | 1,902,766 rows filtered out | 95% of work done for nothing |
| ⑤⑥ | SubPlans execute ×18,432 | Correlated subqueries = N+1 in SQL |
Refactoring#
Step 1 — fix the date condition (unlocks the index):
-- Before
submitted_at::date >= DATE_TRUNC('month', NOW())::date
-- After
submitted_at >= DATE_TRUNC('month', NOW())
AND submitted_at < DATE_TRUNC('month', NOW()) + INTERVAL '1 month'Step 2 — add the right indexes:
-- Partial index — indexes only approved applications (fraction of the full table)
CREATE INDEX CONCURRENTLY idx_apps_approved_submitted
ON applications (submitted_at, user_id)
WHERE status = 'approved';
-- Covering index for payments — serves the JOIN without heap access
CREATE INDEX CONCURRENTLY idx_payments_app_status_amount
ON payments (application_id, status, amount);Step 3 — rewrite the query:
WITH monthly_approved AS (
-- Fast scan through partial index instead of filtering 2M rows
SELECT DISTINCT user_id
FROM applications
WHERE status = 'approved'
AND submitted_at >= DATE_TRUNC('month', NOW())
AND submitted_at < DATE_TRUNC('month', NOW()) + INTERVAL '1 month'
),
user_stats AS (
-- Single pass over the data instead of N correlated subqueries
SELECT
a.user_id,
COUNT(*) AS total_applications,
COALESCE(SUM(p.amount) FILTER (WHERE p.status = 'paid'), 0) AS total_paid
FROM applications a
LEFT JOIN payments p ON p.application_id = a.id
WHERE a.user_id IN (SELECT user_id FROM monthly_approved)
GROUP BY a.user_id
)
SELECT
u.full_name,
u.region,
us.total_applications,
us.total_paid
FROM users u
JOIN user_stats us ON u.id = us.user_id
ORDER BY us.total_paid DESC NULLS LAST;EXPLAIN After Optimization#
Sort (cost=12834.56..12884.56 rows=20000 width=128)
(actual time=118.34..119.56 rows=18432 loops=1)
Sort Method: quicksort Memory: 2340kB ← sort stays in RAM
-> Hash Join
-> Seq Scan on users u
(actual time=0.12..34.56 rows=500000 loops=1)
-> HashAggregate [user_stats]
-> Hash Join
-> Index Scan using idx_apps_approved_submitted ← partial index
Index Cond: (submitted_at >= ... AND ...)
(actual time=0.23..8.45 rows=18432 loops=1)
-> Index Scan using idx_payments_app_status_amount
(actual time=0.12..18.34 rows=73456 loops=1)
Planning Time: 2.134 ms
Execution Time: 121.234 ms8,243 ms → 121 ms — ×68 improvement
Checklist#
- Never wrap indexed columns in functions — use range conditions instead
- Replace correlated subqueries with a JOIN + GROUP BY or a CTE
- Run
EXPLAIN (ANALYZE, BUFFERS)before changing anything -
SELECT *only in a REPL — never in production code - Partial indexes for constant, frequently-used filter conditions
- In composite indexes: equality columns first, range columns last
-
FILTER (WHERE ...)instead of CASE in aggregations — cleaner and faster -
CREATE INDEX CONCURRENTLY— no table lock in production - Monitor
pg_stat_user_indexes— unused indexes cost every write operation - Run
ANALYZEafter bulk inserts if you can’t wait for autovacuum
Good optimization isn’t about hacking queries for a benchmark. It’s about understanding how the PostgreSQL planner thinks — and writing code that helps it do its job.