PostgreSQL Indexing Strategies: A Complete Performance Guide

Learn how to optimize PostgreSQL performance with the right indexing strategies. Covers B-tree, GIN, GiST indexes and when to use each.

Alex Chen
December 18, 2024 · 15 min read

Database performance often comes down to one thing: proper indexing. A well-designed index can turn a query that takes minutes into one that completes in milliseconds. In this comprehensive guide, we’ll explore PostgreSQL indexing strategies that will supercharge your database performance.

Understanding Index Types

PostgreSQL offers several index types, each optimized for different use cases:

Index Type | Best For                           | Example Use Case
B-tree     | Equality and range queries         | WHERE id = 5, WHERE date > '2024-01-01'
Hash       | Equality comparisons only          | WHERE status = 'active'
GIN        | Full-text search, arrays, JSONB    | WHERE tags @> ARRAY['vue']
GiST       | Geometric data, full-text          | WHERE location <@ box '...'
BRIN       | Large tables with natural ordering | Time-series data
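
Every type other than B-tree is selected with a USING clause. As a minimal sketch, here is a hash index on the status column from the table above. The index name is hypothetical, and a users table with a status column is assumed.

```sql
-- Hypothetical index name; assumes a users table with a status column
CREATE INDEX idx_users_status_hash ON users USING HASH (status);

-- Hash indexes only support the = operator
SELECT * FROM users WHERE status = 'active';

-- List every index on the table with its definition,
-- including the access method named in the USING clause
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'users';
```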

B-tree Indexes: The Workhorse

B-tree is the default and most commonly used index type. It excels at:

  • Equality comparisons (=)
  • Range queries (<, >, <=, >=, BETWEEN)
  • Prefix pattern matching (LIKE 'abc%'), when the column uses the C collation or the index is built with text_pattern_ops
  • Sorting (ORDER BY)

-- Basic B-tree index
CREATE INDEX idx_users_email ON users(email);

-- Composite index for multi-column queries
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at DESC);

-- Partial index for subset of rows
CREATE INDEX idx_active_users ON users(email)
WHERE status = 'active';

-- Index with included columns (covering index)
CREATE INDEX idx_orders_covering ON orders(user_id)
INCLUDE (total, status);
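
Prefix matching with LIKE deserves a closer look, because a plain B-tree index only supports it when the column uses the C collation. A sketch of the usual workaround follows, assuming users.email is of type text; the index name is hypothetical.

```sql
-- Hypothetical index name; text_pattern_ops is a built-in operator class
-- that compares strings character by character, ignoring collation rules
CREATE INDEX idx_users_email_pattern ON users (email text_pattern_ops);

-- Left-anchored pattern: can use the index above
EXPLAIN SELECT * FROM users WHERE email LIKE 'john%';

-- Leading wildcard: a B-tree index cannot narrow this search
EXPLAIN SELECT * FROM users WHERE email LIKE '%example.com';
```

Keep the regular index on email as well if you also run ordinary <, >, or ORDER BY queries on the column, since text_pattern_ops does not support those under a non-C collation.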

Multi-Column Index Order Matters

The order of columns in a composite index is crucial:

-- Example composite index
CREATE INDEX idx_example ON orders(user_id, status, created_at);

-- This index supports:
-- ✅ WHERE user_id = 1
-- ✅ WHERE user_id = 1 AND status = 'pending'
-- ✅ WHERE user_id = 1 AND status = 'pending' AND created_at > '2024-01-01'
-- ❌ WHERE status = 'pending' (leading column user_id is missing, so the index cannot be searched efficiently)
-- ❌ WHERE created_at > '2024-01-01' (both leading columns are missing)
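
One way to check these rules on your own data is to compare plans with EXPLAIN. The sketch below assumes the idx_example index above exists; the plan you actually get depends on table size, statistics, and PostgreSQL version, and idx_orders_status is a hypothetical name.

```sql
-- Equality on the leading columns plus a sort on the last column:
-- the index can return rows already ordered by created_at
EXPLAIN
SELECT * FROM orders
WHERE user_id = 1 AND status = 'pending'
ORDER BY created_at DESC
LIMIT 10;

-- No condition on user_id: typically a sequential scan or a full index scan
EXPLAIN
SELECT * FROM orders
WHERE status = 'pending';

-- If status-only lookups are frequent, a separate index is one option
CREATE INDEX idx_orders_status ON orders(status);
```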

GIN Indexes for JSONB and Arrays

GIN (Generalized Inverted Index) is built for values that contain many elements, such as arrays, JSONB documents, and full-text search vectors:

-- JSONB containment queries
CREATE INDEX idx_products_metadata ON products USING GIN(metadata);

-- Query examples that use this index
SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';
SELECT * FROM products WHERE metadata ? 'featured';
SELECT * FROM products WHERE metadata ?& ARRAY['color', 'size'];

-- For specific JSONB paths (more efficient for known paths)
CREATE INDEX idx_products_category ON products
USING GIN((metadata -> 'category'));

-- Array operations
CREATE INDEX idx_posts_tags ON posts USING GIN(tags);

-- Query using the array index
SELECT * FROM posts WHERE tags @> ARRAY['vue', 'typescript'];
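
If your JSONB queries only use containment, the jsonb_path_ops operator class is worth considering: it usually produces a smaller index than the default, at the cost of supporting fewer operators. A minimal sketch, with a hypothetical index name:

```sql
-- Hypothetical index name; jsonb_path_ops is a built-in GIN operator class
CREATE INDEX idx_products_metadata_path ON products
USING GIN (metadata jsonb_path_ops);

-- Supported: containment
SELECT * FROM products WHERE metadata @> '{"category": "electronics"}';

-- Not supported by jsonb_path_ops: key-existence operators (?, ?|, ?&).
-- This query needs the default GIN operator class instead.
SELECT * FROM products WHERE metadata ? 'featured';
```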

-- Full-text search can be indexed with either GIN or GiST
-- GIN: Faster lookups, slower updates, larger size
CREATE INDEX idx_articles_fts_gin ON articles
USING GIN(to_tsvector('english', title || ' ' || content));

-- GiST: Faster updates, slower lookups, smaller size
CREATE INDEX idx_articles_fts_gist ON articles
USING GiST(to_tsvector('english', title || ' ' || content));

-- Full-text search query
SELECT * FROM articles
WHERE to_tsvector('english', title || ' ' || content)
      @@ plainto_tsquery('english', 'postgresql performance');
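
The expression indexes above require every query to repeat the exact to_tsvector(...) expression, and title || ' ' || content evaluates to NULL when either column is NULL. A common alternative, sketched here under the assumption of PostgreSQL 12 or later and an id column on articles, is a stored generated column. The column and index names are hypothetical.

```sql
-- Hypothetical column name; coalesce guards against NULL title or content
ALTER TABLE articles
    ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''))
    ) STORED;

-- Index the precomputed column instead of the expression
CREATE INDEX idx_articles_search_vector ON articles USING GIN (search_vector);

-- Queries reference the column directly
SELECT id, title
FROM articles
WHERE search_vector @@ plainto_tsquery('english', 'postgresql performance');
```

Note that adding a stored generated column rewrites the table, so schedule it accordingly on large tables.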

Expression Indexes

Create indexes on expressions, not just columns:

-- Lower-case email lookups
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- Query that uses this index
SELECT * FROM users WHERE LOWER(email) = 'john@example.com';

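-- Date extraction (created_at must be timestamp without time zone;
-- on timestamptz the result depends on the session time zone, so the
-- expression is not immutable and cannot be indexed directly)
CREATE INDEX idx_orders_year ON orders(EXTRACT(YEAR FROM created_at));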

-- JSON expression
CREATE INDEX idx_users_settings_theme ON users((settings->>'theme'));
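
An expression index is only considered when the query contains the same expression. The sketch below shows matching and non-matching queries for the indexes above; it assumes users.settings is a JSONB column.

```sql
-- Can use idx_users_email_lower: same expression as the index
EXPLAIN SELECT * FROM users WHERE LOWER(email) = 'john@example.com';

-- Cannot use idx_users_email_lower: the plain column is a different expression
EXPLAIN SELECT * FROM users WHERE email = 'John@Example.com';

-- Can use idx_orders_year
EXPLAIN SELECT count(*) FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2024;

-- Can use idx_users_settings_theme (->> returns text, as in the index)
EXPLAIN SELECT * FROM users WHERE settings->>'theme' = 'dark';
```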

Partial Indexes for Targeted Performance

Partial indexes only include rows that match a condition:

-- Only index active users (much smaller index)
CREATE INDEX idx_active_users ON users(email)
WHERE deleted_at IS NULL;

-- Only index recent orders
CREATE INDEX idx_recent_orders ON orders(user_id, created_at)
WHERE created_at > '2024-01-01';

-- Only index pending items
CREATE INDEX idx_pending_tasks ON tasks(priority, created_at)
WHERE status = 'pending';
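
The planner only uses a partial index when it can prove that the query's WHERE clause implies the index predicate. A short sketch using idx_pending_tasks from above, assuming tasks has an id column:

```sql
-- Can use idx_pending_tasks: the filter matches the index predicate,
-- and the ORDER BY matches the indexed columns
EXPLAIN
SELECT id FROM tasks
WHERE status = 'pending'
ORDER BY priority, created_at
LIMIT 50;

-- Cannot use idx_pending_tasks: rows with other statuses are not in the index
EXPLAIN
SELECT id FROM tasks
WHERE status = 'done'
ORDER BY priority, created_at
LIMIT 50;
```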

BRIN Indexes for Large Tables

BRIN (Block Range INdex) is extremely space-efficient for naturally ordered data:

-- Perfect for time-series data
CREATE INDEX idx_logs_created_brin ON logs USING BRIN(created_at);

-- Check correlation (should be close to 1 or -1 for BRIN to be effective)
SELECT correlation FROM pg_stats
WHERE tablename = 'logs' AND attname = 'created_at';

-- Build an equivalent B-tree index to compare sizes
CREATE INDEX idx_logs_created_btree ON logs(created_at);

SELECT pg_size_pretty(pg_relation_size('idx_logs_created_btree')) as btree_size,
       pg_size_pretty(pg_relation_size('idx_logs_created_brin')) as brin_size;
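
BRIN stores one summary per range of table pages, so the range size is the main tuning knob. The sketch below uses a hypothetical index name and an illustrative value of 32; the default is 128 pages per range.

```sql
-- Smaller ranges give more precise summaries at the cost of a larger index
CREATE INDEX idx_logs_created_brin_32 ON logs
USING BRIN (created_at) WITH (pages_per_range = 32);

-- Summarize page ranges added since the last summarization
-- (VACUUM also does this); returns the number of ranges summarized
SELECT brin_summarize_new_values('idx_logs_created_brin_32');
```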

Analyzing Index Usage

Always verify your indexes are being used:

-- Check if index is used
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'john@example.com';

-- Find unused indexes
SELECT schemaname, relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND schemaname NOT IN ('pg_catalog', 'pg_toast');

-- Index size and usage stats
SELECT
    indexrelname as index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
    idx_scan as times_used,
    idx_tup_read as tuples_read
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
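
Before dropping an index that shows idx_scan = 0, it helps to confirm how long the counters have been accumulating and to exclude indexes that enforce uniqueness. A sketch under those assumptions follows; idx_users_legacy is a hypothetical index name.

```sql
-- When were statistics last reset for this database?
SELECT stats_reset
FROM pg_stat_database
WHERE datname = current_database();

-- Unused indexes, excluding unique indexes that back constraints
SELECT s.schemaname, s.relname, s.indexrelname, s.idx_scan
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisunique;

-- Drop without blocking writes on the table
-- (cannot run inside a transaction block)
DROP INDEX CONCURRENTLY IF EXISTS idx_users_legacy;
```

Statistics are tracked per server, so an index that looks unused on the primary may still serve queries on a replica.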

Index Maintenance Best Practices

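-- Rebuild bloated indexes without blocking writes
-- (CONCURRENTLY requires PostgreSQL 12 or later)
REINDEX INDEX CONCURRENTLY idx_users_email;

-- List non-unique indexes by size. This does not measure bloat directly,
-- but large, rarely scanned indexes are the first candidates to inspect.
SELECT
    s.schemaname,
    s.relname AS table_name,
    s.indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
    s.idx_scan AS number_of_scans,
    s.idx_tup_read AS tuples_read
FROM pg_stat_user_indexes s
JOIN pg_index i ON s.indexrelid = i.indexrelid
WHERE NOT i.indisunique
ORDER BY pg_relation_size(s.indexrelid) DESC;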

-- Analyze table statistics (crucial for query planner)
ANALYZE users;

-- Auto-vacuum settings for high-traffic tables
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);
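
To measure B-tree bloat rather than infer it from size, one option is the pgstattuple extension that ships with PostgreSQL. This sketch assumes you have the privileges to install and call it (superuser or the pg_stat_scan_tables role by default).

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- A freshly built B-tree has an avg_leaf_density close to its fillfactor
-- (90 by default); a much lower value suggests bloat
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('idx_users_email');

-- A failed CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY leaves an
-- invalid index behind; find any that need to be dropped or rebuilt
SELECT indexrelid::regclass AS index_name, indrelid::regclass AS table_name
FROM pg_index
WHERE NOT indisvalid;
```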

Common Indexing Mistakes

1. Over-indexing

-- ❌ Too many indexes slow down writes
CREATE INDEX idx_users_1 ON users(email);
CREATE INDEX idx_users_2 ON users(email, name);
CREATE INDEX idx_users_3 ON users(name, email);
CREATE INDEX idx_users_4 ON users(name);

-- ✅ Consolidated approach
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_name ON users(name);
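
Exact duplicates are the easiest form of over-indexing to detect. The query below is a sketch that groups indexes by table, key columns, operator classes, expressions, and predicate. It only finds identical definitions, not overlapping prefixes such as (email) versus (email, name).

```sql
-- Indexes on the same table with identical definitions
SELECT
    indrelid::regclass AS table_name,
    array_agg(indexrelid::regclass) AS duplicate_indexes
FROM pg_index
GROUP BY
    indrelid,
    indkey::text,
    indclass::text,
    coalesce(indexprs::text, ''),
    coalesce(indpred::text, '')
HAVING count(*) > 1;
```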

2. Ignoring NULL Handling

-- B-tree indexes include NULLs at one end
CREATE INDEX idx_users_deleted ON users(deleted_at);

-- For IS NULL queries, consider partial index
CREATE INDEX idx_users_not_deleted ON users(id)
WHERE deleted_at IS NULL;
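
NULL placement also matters for sorting. By default NULLs sort last in ascending order and first in descending order, so a query that asks for DESC NULLS LAST needs an index declared the same way. A minimal sketch with a hypothetical index name:

```sql
-- Index ordering matches the query's ORDER BY exactly
CREATE INDEX idx_users_deleted_recent ON users (deleted_at DESC NULLS LAST);

-- Most recently deleted users first, never-deleted users at the end
SELECT id, deleted_at
FROM users
ORDER BY deleted_at DESC NULLS LAST
LIMIT 20;
```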

3. Wrong Column Order

-- ❌ Low selectivity column first
CREATE INDEX idx_orders_status_user ON orders(status, user_id);

-- ✅ High selectivity column first
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

Real-World Example: E-commerce Query Optimization

-- Common e-commerce query pattern
SELECT p.id, p.name, p.price, c.name as category
FROM products p
JOIN categories c ON p.category_id = c.id
WHERE p.status = 'active'
  AND p.price BETWEEN 10 AND 100
  AND p.metadata @> '{"in_stock": true}'
ORDER BY p.created_at DESC
LIMIT 20;

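-- Candidate indexes for this query (confirm the planner's choice with EXPLAIN ANALYZE)
-- status is already fixed by the predicate, so it is not needed as a key column
CREATE INDEX idx_products_active_price ON products(price)
WHERE status = 'active';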

CREATE INDEX idx_products_metadata ON products USING GIN(metadata);

CREATE INDEX idx_products_created ON products(created_at DESC)
WHERE status = 'active';
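
Which of these indexes the planner picks depends on your data distribution, so it is worth checking the real plan. The sketch below reruns the query above under EXPLAIN with the ANALYZE and BUFFERS options; ANALYZE executes the query, so use it with care on production systems.

```sql
-- Shows the chosen indexes, actual row counts, timing, and buffer usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT p.id, p.name, p.price, c.name as category
FROM products p
JOIN categories c ON p.category_id = c.id
WHERE p.status = 'active'
  AND p.price BETWEEN 10 AND 100
  AND p.metadata @> '{"in_stock": true}'
ORDER BY p.created_at DESC
LIMIT 20;
```

An index that never appears in the plan for your real workload is a candidate for removal.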

Conclusion

Effective PostgreSQL indexing requires understanding:

  1. Your query patterns: Analyze actual queries before creating indexes
  2. Index types: Choose the right type for your data and queries
  3. Column order: Put high-selectivity columns first in composite indexes
  4. Partial indexes: Reduce index size by targeting specific rows
  5. Maintenance: Regularly analyze and reindex as needed

Start with EXPLAIN ANALYZE, measure your query performance, and iterate. Remember: the best index is one that’s actually used.


