Why Your Sites Are Slow: TTFB, Database Bottlenecks, and Cache Diagnostics

System Admin · February 16, 2022 · 6 min read

Speed Issues Are Rarely Where You Think They Are

When a website feels slow, the instinct is to blame images or JavaScript. And sometimes that instinct is correct. But a surprising number of performance problems live deeper in the stack — in the server response time, in database queries, in misconfigured caching, or in upstream services that add latency invisibly. Fixing the front end while the back end bleeds time is like polishing a car with a blown engine.

This guide provides a diagnostic workflow for identifying the real root cause of slow websites. Start from the server response, work through the database, check caching, and trace upstream dependencies. By the end, you will know where your time is being lost and what to do about it.

Start With TTFB: Time to First Byte

TTFB measures how long the browser waits from the start of the request until it receives the first byte of the response. It includes DNS resolution, TCP connection setup, TLS negotiation, and server processing time. A healthy TTFB for a dynamic page is under 200 milliseconds. Over 600 milliseconds, you have a server-side problem that no amount of front-end optimization can fix.

How to Measure TTFB

Use your browser's developer tools (Network tab), curl with timing output (curl -s -o /dev/null -w "TTFB: %{time_starttransfer}s\n" https://example.com), or synthetic monitoring tools like WebPageTest. Measure from multiple locations — if TTFB is high from everywhere, the problem is server-side. If it is high only from distant locations, the problem may be geographic (no CDN) or DNS-related.

Breaking Down TTFB

TTFB is a composite metric. To diagnose it, decompose it into its parts:

  • DNS lookup: Typically 20-50ms for cached resolvers, potentially longer for fresh lookups or slow DNS providers.
  • TCP connection: Proportional to the physical distance between client and server. A CDN reduces this for geographically distant visitors.
  • TLS handshake: Adds 50-100ms depending on protocol (TLS 1.2 vs 1.3) and whether session resumption is available.
  • Server processing: The time your application takes to generate the response. This is where most TTFB problems live.

Diagnosing Server Processing Time

If TTFB is high and the network components (DNS, TCP, TLS) are normal, the bottleneck is server processing. The three most common causes:

Database Queries

Slow database queries are the number one cause of slow server responses. A single unindexed query on a large table can take seconds. Identify slow queries using:

  • PostgreSQL's pg_stat_statements extension (shows query execution statistics)
  • MySQL's slow query log (logs queries exceeding a configurable duration)
  • Application-level query logging or ORM profiling tools

For each slow query, run EXPLAIN ANALYZE to see the execution plan. Look for sequential scans on large tables (add an index), nested loops with high row counts (restructure the query or add composite indexes), and sorts on unindexed columns.
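When deciding which queries to EXPLAIN first, rank them by total time consumed rather than per-call time: a 48 ms query running thousands of times often costs far more than one 2-second report. A hedged sketch, assuming rows shaped like pg_stat_statements output (query text, call count, mean execution time in ms — adapt the field names to your actual source):

```python
def rank_by_total_time(rows):
    """Rank query stats by total time consumed (calls * mean ms).

    `rows` are (query, calls, mean_exec_ms) tuples, roughly the shape
    you would pull from pg_stat_statements or a slow query log summary.
    """
    return sorted(
        ((query, calls * mean_ms) for query, calls, mean_ms in rows),
        key=lambda item: item[1],
        reverse=True,
    )

stats = [
    ("SELECT * FROM orders WHERE customer_id = $1", 12000, 48.0),  # frequent
    ("SELECT report_data FROM reports WHERE id = $1", 3, 2100.0),  # rare but slow
]
worst = rank_by_total_time(stats)[0]
# The frequent 48 ms query dominates: 12000 * 48 ms versus 3 * 2100 ms.
```

Fixing the top few entries of this ranking usually recovers more TTFB than chasing the single slowest query.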

Missing Application Cache

If your application recomputes expensive results on every request — fetching sidebar widgets, building navigation menus, aggregating statistics — server processing time adds up. Cache these computed results in Redis or Memcached with appropriate TTLs. The first request computes the result; subsequent requests serve it from cache in under a millisecond.
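The pattern is simple enough to sketch without a cache server: compute once, store with an expiry, serve from memory until the TTL lapses. In production you would point this at Redis or Memcached; the sketch below uses an in-process dict purely to illustrate the logic (the function names are illustrative, not from any particular framework):

```python
import time

_cache: dict = {}  # key -> (expires_at, value)

def cached(key, ttl_seconds, compute):
    """Return the cached value for `key`, recomputing when the TTL expires.

    Mirrors the Redis pattern: GET -> on miss, compute -> SETEX.
    """
    entry = _cache.get(key)
    now = time.monotonic()
    if entry is not None and entry[0] > now:
        return entry[1]                       # hit: served in microseconds
    value = compute()                         # miss: do the expensive work once
    _cache[key] = (now + ttl_seconds, value)
    return value

calls = 0
def build_sidebar():
    """Stands in for an expensive widget/menu/statistics computation."""
    global calls
    calls += 1
    return "<aside>…</aside>"

first = cached("sidebar", 60, build_sidebar)   # computes
second = cached("sidebar", 60, build_sidebar)  # served from cache
```

With Redis the shape is the same, but the cache survives process restarts and is shared across application workers.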

External Service Latency

If your page generation involves calls to external APIs — payment gateways, CRM lookups, social media APIs — those calls add their latency directly to your TTFB. Measure external call times independently. If an external service adds 500ms, your TTFB cannot be under 500ms regardless of how fast everything else is.

Solutions include caching external API responses (when the data allows it), making external calls asynchronous (render the page and load the data client-side), or setting aggressive timeouts with fallback content so a slow external service does not block your entire page.
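The timeout-with-fallback approach can be sketched with the standard library: run the external call on a worker thread, wait a bounded time, and fall back to placeholder content instead of blocking the page. The names and the 200 ms budget below are illustrative:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

executor = ThreadPoolExecutor(max_workers=4)

def fetch_with_fallback(call, timeout_s, fallback):
    """Run `call` with a hard time budget; return `fallback` if it overruns."""
    future = executor.submit(call)
    try:
        return future.result(timeout=timeout_s)
    except TimeoutError:
        return fallback  # the slow third party no longer blocks the page

def slow_crm_lookup():
    """Stands in for an external API that takes ~500 ms."""
    time.sleep(0.5)
    return {"tier": "gold"}

result = fetch_with_fallback(slow_crm_lookup, timeout_s=0.2,
                             fallback={"tier": "unknown"})
# The page renders with fallback content after ~200 ms instead of ~500 ms.
```

The same idea applies at the architecture level: render the page without the external data, then load it client-side once the page is interactive.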

Cache Diagnostics: Is Your Cache Actually Working?

You set up caching, but is it doing its job? Here is how to verify each layer:

Browser Cache

Check response headers for Cache-Control and ETag. If static assets are being re-downloaded on every page load (visible in the Network tab as 200 responses instead of 304 or "(disk cache)"), your cache headers are missing or misconfigured.

CDN Cache

Most CDNs add an X-Cache or CF-Cache-Status header to responses. A value of HIT means the CDN served a cached copy. MISS means the request went to your origin. A consistently low cache hit ratio (below 70-80% for static assets) indicates cache configuration problems — missing Cache-Control headers, Set-Cookie headers preventing caching, or cache-busting query strings on every request.
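Verifying the hit ratio is a matter of sampling response headers and counting. A minimal sketch, assuming you have collected X-Cache or CF-Cache-Status values from an access-log sample:

```python
def cache_hit_ratio(statuses):
    """Fraction of sampled responses served from the CDN edge.

    `statuses` are X-Cache / CF-Cache-Status header values; anything
    other than HIT went back to the origin.
    """
    if not statuses:
        return 0.0
    hits = sum(1 for s in statuses if s.upper() == "HIT")
    return hits / len(statuses)

sample = ["HIT", "HIT", "MISS", "HIT", "EXPIRED", "HIT", "HIT", "MISS"]
ratio = cache_hit_ratio(sample)
# 5/8 = 0.625 — below the 70-80% target for static assets, so the
# cache configuration needs attention.
```

Run this over a few thousand log lines rather than a handful of spot checks; a single browsing session is not a representative sample.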

Reverse Proxy Cache

If you use Nginx or Varnish for full-page caching, check the cache status headers. Verify that cache bypass rules are correct — dynamic pages and authenticated content should bypass the cache, but public content should not. A misconfigured proxy_cache_bypass rule that bypasses on every request turns your reverse proxy cache into an expensive no-op.

Application Cache (Redis/Memcached)

Monitor cache hit and miss rates using Redis INFO or Memcached stats commands. A healthy application cache has a hit rate above 90%. A low hit rate suggests your application is not caching the right data, TTLs are too short, or cache invalidation is too aggressive.
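Redis exposes cumulative keyspace_hits and keyspace_misses counters in the stats section of INFO; the hit rate is hits / (hits + misses). A sketch that computes it from that output (the sample counters below are illustrative):

```python
def redis_hit_rate(info_text: str) -> float:
    """Compute the cache hit rate from `redis-cli INFO stats` output."""
    stats = {}
    for line in info_text.splitlines():
        if ":" in line and not line.startswith("#"):
            key, _, value = line.partition(":")
            stats[key.strip()] = value.strip()
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0

info = """# Stats
keyspace_hits:184520
keyspace_misses:61230
"""
rate = redis_hit_rate(info)
# ~0.75 — well below the 90% target, so review what is being cached,
# the TTLs, and the invalidation strategy.
```

Note that these counters are cumulative since the server started; for a current picture, sample them twice and compute the rate from the deltas.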

Resource Contention: When the Server Is Overwhelmed

Sometimes the individual queries and cache layers are fine, but the server is simply overloaded. Check for:

  • CPU saturation: If CPU usage consistently exceeds 80%, requests queue up and response times increase. Identify the top CPU consumers — is it the web server, the database, or a background process?
  • Memory pressure: When RAM is exhausted, the system swaps to disk. Disk I/O is orders of magnitude slower than RAM. If swap usage is high, you are running out of memory and need to either optimize memory usage or upgrade the server.
  • Disk I/O: Slow disk operations affect database performance, session storage, and log writing. Monitor I/O wait times. If they are consistently high, consider SSD storage (if you are on spinning disks) or optimizing write-heavy operations.
  • Connection limits: Web servers and databases have connection limits. If all connections are consumed, new requests wait. Monitor active connections and raise limits if needed, or use connection pooling to multiplex connections more efficiently.
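The connection pooling mentioned above keeps a fixed set of connections open and hands them out per request, so concurrency is bounded by the pool size rather than the database's connection limit. A minimal, generic sketch using a thread-safe queue (real deployments would use their driver's built-in pool or a dedicated pooler such as PgBouncer):

```python
import queue
from contextlib import contextmanager

class ConnectionPool:
    """Bound concurrent connections to `size`; callers block until one is free."""

    def __init__(self, connect, size):
        self._pool = queue.Queue()
        for _ in range(size):          # open the fixed set up front
            self._pool.put(connect())

    @contextmanager
    def connection(self):
        conn = self._pool.get()        # blocks while all connections are in use
        try:
            yield conn
        finally:
            self._pool.put(conn)       # return it for the next request

opened = 0
def fake_connect():
    """Stands in for an expensive database connection handshake."""
    global opened
    opened += 1
    return object()

pool = ConnectionPool(fake_connect, size=5)
for _ in range(100):                   # 100 requests, still only 5 connections
    with pool.connection() as conn:
        pass
```

This is why pooling beats simply raising connection limits: each database connection carries memory and scheduling overhead, and multiplexing many requests over few connections keeps that overhead constant under load.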

A Diagnostic Checklist

When a site is slow, work through this checklist in order:

  1. Measure TTFB from multiple locations. Is the problem global or location-specific?
  2. If TTFB is high: check server CPU, memory, and disk I/O. Is the server overloaded?
  3. Profile database queries. Are there slow queries that need indexes or restructuring?
  4. Check application cache hit rates. Is Redis/Memcached being used effectively?
  5. Verify CDN cache hit rates. Are static assets being served from the edge?
  6. Check for external service latency. Are third-party API calls adding wait time?
  7. Review browser cache headers. Are assets being re-downloaded unnecessarily?
  8. If TTFB is fine but the page still feels slow: the problem is on the front end — large images, unoptimized JavaScript, render-blocking resources.

Fixing the Root Cause, Not the Symptom

The most common mistake in performance optimization is treating symptoms instead of causes. Adding a CDN will not fix a slow database query — it will mask it for cached pages and leave dynamic pages just as slow. Adding more server resources will not fix a query that scans a million rows when an index would make it instant.

Always trace the problem to its root cause. Measure, diagnose, fix, and verify. A methodical approach to performance beats random optimization every time.

MySQL · Linux · Backup · WordPress · DevOps