
Stop Blaming the App: Unleash Your Database's Potential with Optimization Hacks

Posted by We Are Monad AI blog bot

Stop blaming the app: where slowness actually comes from

When an application feels heavy or unresponsive, it is natural to look at the code first. We blame the framework, the language, or the complexity of the logic. But more often than not, the application layer is simply waiting on the database to return data it should have found milliseconds ago.

Slowness is rarely a mystery; it is usually a math problem involving disk input/output (I/O), memory, and network latency. Improving performance isn't about buying more expensive servers or rewriting your entire codebase in a faster language. It is about understanding how your data lives on disk and how you ask for it. When you stop blaming the app and start respecting the physics of your database, you find the clarity needed to make things fast.

Indexes: make them work for you (not against you)

Good indexes create fast reads, smaller scans, and efficient plans. Bad indexes create wasted disk space, slower writes, and surprise bloat. Understanding the difference is the first step to a healthy database.

What actually helps

The B-tree index is your bread and butter. Use it on primary keys and on columns used for equality or range predicates. It behaves predictably and handles the vast majority of standard queries. [PostgreSQL Docs - Indexes]
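A minimal sketch, assuming a hypothetical orders table with customer_id and created_at columns:

  -- Plain B-tree index for equality lookups on customer_id
  CREATE INDEX idx_orders_customer_id ON orders (customer_id);

  -- B-tree also handles range predicates, e.g.
  --   SELECT * FROM orders WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';
  CREATE INDEX idx_orders_created_at ON orders (created_at);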

For hot read paths, look at covering indexes. By including non-key columns (using INCLUDE), you allow Postgres to satisfy queries from the index alone, known as an index-only scan. This removes the need to visit the main table heap entirely for that query. [PostgreSQL Docs - CREATE INDEX]
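A sketch on the same hypothetical orders table (INCLUDE requires PostgreSQL 11 or later):

  -- Covering index: status and total ride along in the index leaf pages
  CREATE INDEX idx_orders_customer_created
      ON orders (customer_id, created_at)
      INCLUDE (status, total);

  -- Can be answered by an index-only scan when the visibility map is up to date:
  --   SELECT status, total FROM orders WHERE customer_id = 42 AND created_at > '2024-01-01';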

If you have specific query patterns, partial indexes are a powerful tool. They are tiny and fast because they only index rows that match a specific predicate, such as WHERE deleted = false. Use them to avoid indexing millions of archived or irrelevant rows that your active queries never touch. [PostgreSQL Docs - Partial Indexes]
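For example, continuing with the hypothetical orders table:

  -- Only live rows are indexed, so the index stays small and writes to
  -- soft-deleted rows never touch it
  CREATE INDEX idx_orders_active_customer
      ON orders (customer_id)
      WHERE deleted = false;

  -- The planner uses it only when the query implies the same predicate:
  --   SELECT * FROM orders WHERE customer_id = 42 AND deleted = false;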

For specialised data, choose the right tool. Use GiST or GIN indexes for full-text search, arrays, and jsonb. For huge, append-only tables with natural ordering, like time-series data, BRIN indexes are incredibly lightweight and effective. [PostgreSQL Docs - GIN Indexes], [PostgreSQL Docs - BRIN Indexes]
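Two illustrative examples, assuming a products table with a jsonb attributes column and an append-only events table:

  -- GIN index for jsonb containment queries such as attributes @> '{"color": "red"}'
  CREATE INDEX idx_products_attributes ON products USING gin (attributes);

  -- BRIN index for a large, naturally time-ordered table; tiny on disk
  CREATE INDEX idx_events_created_brin ON events USING brin (created_at);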

What hurts (and why)

Indexes are not free. Indexes on very low-cardinality columns, such as booleans or status flags with only a handful of distinct values, rarely help and simply add overhead. Every time you write to the table, the database must update every relevant index. [Percona - How many indexes should I have?]

Duplicate or redundant indexes are another common trap. Having the same columns indexed in a different order, or an index that is fully covered by a wider existing index, wastes space and slows down writes without improving read speeds. Over-indexing increases write-ahead log (WAL) volume, CPU usage, and storage costs. Remember that every insert, update, or delete carries a write penalty proportional to your index count. [Percona - How many indexes should I have?]

How to spot unused indexes and bloat

You can query the database statistics to find dead weight. pg_stat_user_indexes tracks how often each index is scanned. If idx_scan is zero, it is a strong signal that the index might be safe to drop, though you should always double-check your reporting queries first. [PostgreSQL Docs - pg_stat_user_indexes]

To prioritise which indexes to clean up, look at their size versus their usefulness. Combining pg_relation_size with the scan counts will show you exactly how much space is being wasted by indexes that are never read. [PostgreSQL Docs - Object Size Functions]
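A minimal report along those lines, using only the built-in statistics views (the names in the output will be your own):

  -- Never-scanned indexes, largest first; do not drop indexes that back
  -- primary key or unique constraints
  SELECT schemaname,
         relname      AS table_name,
         indexrelname AS index_name,
         idx_scan,
         pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
  FROM pg_stat_user_indexes
  WHERE idx_scan = 0
  ORDER BY pg_relation_size(indexrelid) DESC;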

Eventually, indexes get bloated. pgstattuple gives a live estimate of dead tuples and fragmentation. Proven community scripts and queries can help you compare live tuple counts against on-disk sizes to quantify bloat before you decide to intervene. [PostgreSQL Docs - pgstattuple], [EnterpriseDB - Resolving Bloat]
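A quick look with the extension installed (object names are illustrative):

  -- One-time setup
  CREATE EXTENSION IF NOT EXISTS pgstattuple;

  -- Dead tuple percentage and free space for a table
  SELECT * FROM pgstattuple('orders');

  -- Leaf density and fragmentation for a B-tree index
  SELECT * FROM pgstatindex('idx_orders_customer_id');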

Safe steps to reclaim space

When you find problems, fix them carefully. REINDEX is useful for corrupted or fragmented indexes. In production, use REINDEX CONCURRENTLY to rebuild the index without taking exclusive locks that stop your application. [PostgreSQL Docs - REINDEX]
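For example (REINDEX ... CONCURRENTLY is available from PostgreSQL 12; names are illustrative):

  -- Rebuild a single index without blocking reads or writes
  REINDEX INDEX CONCURRENTLY idx_orders_customer_id;

  -- Or rebuild every index on a table during a quiet window
  REINDEX TABLE CONCURRENTLY orders;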

VACUUM FULL will shrink tables aggressively, but it locks them completely. Avoid it during peak hours. For busy databases that cannot afford downtime, pg_repack is an excellent tool for online table and index repacking. [PostgreSQL Docs - VACUUM], [pg_repack Extension]
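A sketch of the two approaches, with illustrative names:

  -- Routine maintenance: no exclusive lock, safe to run regularly
  VACUUM (ANALYZE, VERBOSE) orders;

  -- Full rewrite: reclaims the most space but takes an ACCESS EXCLUSIVE lock,
  -- so schedule it for a maintenance window
  VACUUM (FULL, VERBOSE) orders;

  -- pg_repack runs from the shell instead (e.g. pg_repack --table=orders mydb)
  -- and rebuilds the table online without the long lock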

For MySQL (InnoDB) environments, OPTIMIZE TABLE or rebuilding indexes via ALTER TABLE ... FORCE behaves differently, so always consult the engine-specific documentation. [MySQL Docs - OPTIMIZE TABLE]

A practical maintenance checklist

  1. Monthly: Run a pg_stat_user_indexes report. Sort by index size and scan count. Drop obvious unused indexes, but verify against your query plans first. [pganalyze - Unused Indexes]
  2. Quarterly: Run pgstattuple or a bloat script on your heaviest tables. Decide if you need to run REINDEX, VACUUM FULL, or pg_repack during a low-traffic window.
  3. On Creation: Whenever you add a new index, ask yourself if it reduces row fetches or avoids a table scan. If not, consider if a partial or covering index would be more efficient.
  4. On Performance Drops: If writes slow down after adding indexes, measure WAL growth and CPU usage. Consider dropping non-critical indexes, as every extra index adds a cost to writing.

If you are planning structural work, it helps to start with a solid base. We discuss this in our guide to building a simple yet strong data foundation.

Queries: read the plan, rewrite the plan

Find the slow ones first

Stop guessing. Let the database tell you what is slow. Enable the slow query log in MySQL or use pg_stat_statements in Postgres to get a ranked list of heavy queries. [MySQL Docs - Slow Query Log], [PostgreSQL Docs - pg_stat_statements]
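A starting point in Postgres, assuming pg_stat_statements is preloaded and the extension has been created (the column names below are the PostgreSQL 13+ ones; older versions use total_time and mean_time):

  -- Top ten queries by total execution time
  SELECT calls,
         round(total_exec_time::numeric, 1) AS total_ms,
         round(mean_exec_time::numeric, 2)  AS mean_ms,
         left(query, 80)                    AS query_preview
  FROM pg_stat_statements
  ORDER BY total_exec_time DESC
  LIMIT 10;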

It is also wise to check your application logs and APM traces. Problems like the N+1 pattern usually show up here as a cascade of similar queries triggered by a single request.

Run the plan (and actually execute it)

Use EXPLAIN to see what the planner intends to do, and EXPLAIN ANALYZE to see what actually happened. Postgres will run the query and report real timings. Look for discrepancies between estimated and actual rows, long sequential scans, or nested-loop joins that are blowing up. In MySQL, watch out for operations like "Using temporary" or "Using filesort". [PostgreSQL Docs - EXPLAIN], [MySQL Docs - EXPLAIN ANALYZE]
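A minimal Postgres example with a hypothetical orders table:

  -- Planner's estimate only; the query is not executed
  EXPLAIN
  SELECT * FROM orders WHERE customer_id = 42;

  -- Actually runs the query and reports real row counts, timings, and buffer hits
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT * FROM orders WHERE customer_id = 42;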

When reading a plan, check for sequential scans on large tables, which suggest a missing index. If a nested loop is multiplying rows into huge workloads, you might have a bad join order. If estimated rows are way off from actual rows, your statistics are likely stale.

Common anti-patterns and tiny rewrites

Stop using SELECT *. Select only the columns you need. This reduces I/O transfer and enables index-only scans if your index covers those specific columns. [Use The Index, Luke!]

Avoid wrapping indexed columns in functions within your WHERE clause. WHERE LOWER(name) = 'alice' prevents the database from using a standard index on name. Instead, create a functional index on lower(name) or store a normalised column. [PostgreSQL Docs - Indexes on Expressions]
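A sketch with a hypothetical users table:

  -- This predicate cannot use a plain index on name:
  --   SELECT * FROM users WHERE lower(name) = 'alice';

  -- Index the expression itself instead
  CREATE INDEX idx_users_lower_name ON users (lower(name));

  -- The query above can now use idx_users_lower_name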

Be careful with DISTINCT. It often hides join problems. Instead of filtering duplicates at the end, remove the cause—usually an unnecessary join—or use EXISTS. DISTINCT often forces expensive sorting operations.

Use UNION ALL instead of UNION unless you specifically need to remove duplicates. UNION performs a sort and unique step that UNION ALL skips entirely.

Small changes, big wins

Replace IN with EXISTS for correlated subqueries. EXISTS can often short-circuit and avoid materialising large lists, which makes a significant difference depending on your engine. [Use The Index, Luke! - IN vs EXISTS]
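A before-and-after sketch, assuming hypothetical orders and customers tables:

  -- IN over a subquery result
  SELECT o.id
  FROM orders o
  WHERE o.customer_id IN (SELECT c.id FROM customers c WHERE c.region = 'EU');

  -- Equivalent EXISTS form, which can stop at the first matching row per order
  SELECT o.id
  FROM orders o
  WHERE EXISTS (
      SELECT 1
      FROM customers c
      WHERE c.id = o.customer_id
        AND c.region = 'EU'
  );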

Switch to keyset pagination instead of using OFFSET. OFFSET forces the database to scan and discard rows, becoming slower the deeper you page. Using a WHERE clause on an indexed column (like created_at) is consistent and fast regardless of depth.
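For example, with a hypothetical events table ordered by created_at:

  -- OFFSET pagination: the database still reads and throws away 100,000 rows
  SELECT id, created_at
  FROM events
  ORDER BY created_at DESC
  LIMIT 50 OFFSET 100000;

  -- Keyset pagination: pass the last created_at seen on the previous page
  SELECT id, created_at
  FROM events
  WHERE created_at < '2024-06-01 12:34:56+00'
  ORDER BY created_at DESC
  LIMIT 50;
  -- If created_at is not unique, key on (created_at, id) to keep the ordering stable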

If you find yourself stuck, remember that small rewrites combined with one targeted index can often halve your query time.

Schema design that scales (normalize smart, denormalize smarter)

Keep it simple. Normalize to avoid data mess, and denormalize to speed up the reads that matter.

Quick rules of thumb

Favor normalization for correctness. It provides a single source of truth and makes updates easy. Denormalization should be reserved for proven performance problems in contested read paths, not just because you think "it might be faster". Always measure before you optimize. [Martin Fowler - Denormalization]

Data modeling essentials

Start with the domain, not the user interface. Model your entities—users, orders, products—and their relationships first. Map your queries on top of this foundation. This makes later denormalization a targeted exercise rather than a messy default.

Choosing the right data types matters. Use exact numeric types for money, not floating-point numbers. Pick the smallest integer type that fits your needs to reduce index size. Use TIMESTAMPTZ for timestamps to avoid timezone headaches. [PostgreSQL Docs - Data Types]
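As a sketch, a hypothetical orders table with deliberate type choices (GENERATED ... AS IDENTITY requires PostgreSQL 10+):

  CREATE TABLE orders (
      id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
      customer_id integer NOT NULL,
      total       numeric(12, 2) NOT NULL,           -- exact type for money, never float
      status      text NOT NULL DEFAULT 'pending',
      created_at  timestamptz NOT NULL DEFAULT now() -- timezone-aware timestamp
  );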

JSONB is excellent for semi-structured data, but do not use it to avoid designing a schema where relational fields would be faster and safer. If you do use JSONB, index it with GIN for lookup speed. [PostgreSQL Docs - JSON Types]
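For example, assuming a products table that keeps genuinely variable attributes in jsonb:

  ALTER TABLE products ADD COLUMN attributes jsonb;

  -- jsonb_path_ops is smaller and faster than the default GIN opclass,
  -- but only supports containment (@>) lookups
  CREATE INDEX idx_products_attributes
      ON products USING gin (attributes jsonb_path_ops);

  -- SELECT * FROM products WHERE attributes @> '{"color": "red"}';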

Explicit keys keep relations clear. Prefer surrogate integer keys for primary keys; they are smaller and faster to join. Always index foreign key columns if you join or filter on them. Postgres does not auto-index foreign keys, and missing them is a common cause of slow joins and deletes. [Use The Index, Luke - Primary Keys], [PostgreSQL Docs - Constraints]
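A sketch of the pairing, with illustrative names:

  -- The constraint keeps the relation correct...
  ALTER TABLE orders
      ADD CONSTRAINT fk_orders_customer
      FOREIGN KEY (customer_id) REFERENCES customers (id);

  -- ...but Postgres will not create this index for you
  CREATE INDEX idx_orders_customer_id ON orders (customer_id);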

When and how to denormalize

Normalize first. Do this when you have many updates across the same logical item or need strong consistency.

Denormalize intentionally. This is for read-heavy queries that require expensive joins on critical paths, or when you need precomputed aggregates like counts and sums to avoid heavy scans.

Use patterns that actually scale. Store aggregates, such as a comments count, and update them via transactions or async workers. Use materialized views for complex reporting queries that can tolerate some staleness. For extremely hot reads, an external cache layer like Redis is often the best solution. [PostgreSQL Docs - Materialized Views], [Redis Docs - Caching]
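A sketch of the materialized-view pattern, reusing the hypothetical orders table (REFRESH ... CONCURRENTLY needs a unique index on the view):

  CREATE MATERIALIZED VIEW daily_order_totals AS
  SELECT date_trunc('day', created_at) AS day,
         count(*)                      AS order_count,
         sum(total)                    AS revenue
  FROM orders
  GROUP BY 1;

  CREATE UNIQUE INDEX idx_daily_order_totals_day ON daily_order_totals (day);

  -- Refresh on a schedule; readers keep seeing the old data while it rebuilds
  REFRESH MATERIALIZED VIEW CONCURRENTLY daily_order_totals;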

Consistency is the challenge with denormalization. Try to update the base row and the denormalized fields in the same transaction. If that is too slow, push updates to a background worker and accept eventually consistent views, but always add monitoring to detect drift.

For a deeper look at handling data in legacy systems, you might find our thoughts on bridging the gap useful.

Partitioning, sharding, and other ways to stop a single node from dying alone

Partitioning slices data inside a single database instance to keep one node happy. Sharding spreads data across multiple instances so no single machine gets all the blame.

When to use what

Use partitioning when tables are huge but you want to stay on a single RDBMS instance. It helps query performance by pruning scans and makes maintenance tasks like backups or bulk deletes significantly easier. It works exceptionally well for time-series data. [PostgreSQL Docs - Partitioning]
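A minimal declarative-partitioning sketch for a time-series table (illustrative names, PostgreSQL 10+):

  CREATE TABLE events (
      created_at timestamptz NOT NULL,
      payload    jsonb
  ) PARTITION BY RANGE (created_at);

  CREATE TABLE events_2024_06 PARTITION OF events
      FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');

  -- Queries filtering on created_at are pruned to matching partitions, and
  -- retiring old data becomes a cheap DROP instead of a bulk DELETE:
  --   DROP TABLE events_2024_05;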

Pick sharding (horizontal scaling) when you reach the CPU, I/O, or RAM limits of a single machine, or when you need geographic isolation. Sharding adds operational complexity but offers near-linear scale. Systems like MongoDB or Spanner are built around this. [MongoDB Docs - Sharding], [Google Cloud Spanner - Partitions]

Common pitfalls

The biggest risk is choosing a bad key. Monotonic keys, like sequential IDs, concentrate writes on one shard, creating a "hotspot" that kills performance. Aim for a key with high cardinality and even distribution. [Azure Cosmos DB - Partitioning]

Cross-shard joins and transactions are slow and painful. Design your data model so that your most common queries can be satisfied by a single shard.

Planning for growth

Start by measuring your current limits. Track your growth rates and know when a node will hit its threshold. Prefer partitioning first; it reduces table bloat and lets you postpone the complexity of sharding.

If you must shard, define your key based on real query patterns. Automate your rebalancing and observability early. And consider managed services—Cloud Spanner or Cosmos DB handle many of these trade-offs for you, allowing you to trade complexity for cost.

If you are unsure, start with partitioning, measure the results, and then plan for sharding with a tested strategy. For more on the operational side of scaling, read our checklist for cloud migration without downtime.

Speed ops: caching, connection pooling, cloud features, observability & quick wins (with mini case studies)

Often, slowness isn't about the code—it is about the plumbing. Caching, connection pooling, and blind spots in monitoring are often the real culprits.

Caching and connection pooling

Use caching where reads dominate. Cache database query results or API responses in Redis to avoid repeated heavy reads. A simple cache on your top 10 most-requested queries can drastically cut database load. [Redis - Caching Solutions]

Stop thrashing your database with new connections. Use a pooler like PgBouncer for Postgres or a cloud proxy like AWS RDS Proxy. These tools multiplex many application threads into a manageable number of database connections, preventing exhaustion during traffic spikes. [PgBouncer], [AWS RDS Proxy]

Cloud features and observability

Leverage read replicas to offload heavy SELECT traffic, keeping your primary node free for writes. Just be mindful of connection limits. [Amazon Aurora - Read Replicas]

You cannot fix what you cannot see. Measure p95 and p99 latency, error rates, and queue lengths. Use tracing (like OpenTelemetry) to link frontend latency to specific database queries. Adopt an SLO mindset: focus on improving the tail latency that affects your unhappiest users. [OpenTelemetry], [Google SRE Book - Handling Overload]

Quick wins you can do today

  1. Compression: Turn on gzip or brotli and use a CDN. This cuts payload size globally. [Google Web Developers - Text Compression]
  2. Browser Caching: Set Cache-Control headers for static assets.
  3. Session Cache: Move sessions to a small Redis layer.
  4. Logging: Enable slow-query logging and run EXPLAIN ANALYZE on the top offenders.
  5. Pooling: Enable your cloud DB proxy or install PgBouncer.
  6. N+1: Kill these queries with eager loading. [thoughtbot - Preventing N+1 Queries]

Mini case studies

In one scenario, an API endpoint was returning heavy catalog data. By adding a Redis cache with a 120-second TTL, the cache-hit ratio exceeded 85%, and database read operations dropped by nearly 70%.

In another case, connection pooling saved the day during a sudden traffic spike that was saturating the database. Introducing transaction pooling via PgBouncer eliminated connection errors and stabilized p99 latency immediately.

If you pick one thing to start with today: enable your slow-query logs and add a tiny Redis cache. You will usually see measurable wins inside a day.

We Are Monad is a purpose-led digital agency and community that turns complexity into clarity and helps teams build with intention. We design and deliver modern, scalable software and thoughtful automations across web, mobile, and AI so your product moves faster and your operations feel lighter. Ready to build with less noise and more momentum? Contact us to start the conversation, ask for a project quote if you’ve got a scope, or book a call and we’ll map your next step together. Your first call is on us.
