A 49MB web page hit Hacker News this week with 629 points and 288 comments. It struck a nerve because everyone recognizes the pattern: sites that bloat invisibly over years of feature additions until they're unusable on anything but fiber. This is the practical guide I wish existed — real commands, real configs, and the math behind every recommendation.
Let's ground this in real numbers before touching any code.
Google's Core Web Vitals became a confirmed ranking signal in 2021. Three metrics actually matter: Largest Contentful Paint (LCP — loading, aim under 2.5s), Interaction to Next Paint (INP — responsiveness, under 200ms), and Cumulative Layout Shift (CLS — visual stability, under 0.1).
The 49MB page story isn't an edge case — it's a snapshot of how web bloat accumulates. Each library added "just in case," each unoptimized image uploaded by a non-technical editor, each analytics script that loads five more scripts. No single decision is obviously wrong; the aggregate is a disaster. Bloat is the #1 silent killer of web performance because it never announces itself.
Before you optimize anything, you need to know what's actually slow. Guessing is expensive. Here's how to get real data.
Lighthouse in the browser DevTools is useful, but inconsistent — your local machine has extensions, warm caches, and a fast CPU. The CLI lets you run headless, simulate real devices, and get reproducible results:
# Install once
npm install -g lighthouse
# Run against your URL, simulate mobile on 4G
lighthouse https://yoursite.com \
  --output=html \
  --output-path=./report.html \
  --preset=perf \
  --form-factor=mobile \
  --throttling-method=simulate
# Open the report
open report.html
The output tells you your LCP, CLS, INP scores with specific elements to blame. More importantly, it tells you why each score is what it is — render-blocking resources, image size, unused JavaScript.
For CI pipelines, use --output=json and parse the categories.performance.score field. Fail the build if it drops below 0.8.
WebPageTest (webpagetest.org) runs from actual locations on real devices. The waterfall view is what you actually want here. Look for long first-byte delays, chains of render-blocking CSS and JavaScript, and large assets that start downloading late.
Open DevTools → Network → check "Disable cache" → hard reload. Sort by Size descending. The top entries are your targets — usually a handful of oversized images and scripts account for most of the total page weight.
This one is underused and brutally revealing. Open DevTools → Command+Shift+P (or Ctrl+Shift+P) → type "Show Coverage" → click the reload button in the Coverage panel. You'll see exactly what percentage of each JavaScript and CSS file was never executed on page load.
On most sites, images account for 60-80% of total page weight. This is where optimizing first almost always pays off the most.
The before/after math is real:
# A typical JPEG hero image
hero.jpg: 450KB
# Convert to WebP
cwebp -q 82 hero.jpg -o hero.webp
# hero.webp: 180KB (60% smaller, visually identical)
# Convert to AVIF (even better compression, slower encode)
avifenc --min 30 --max 63 hero.jpg hero.avif
# hero.avif: 95KB (79% smaller)
AVIF gets you smaller files but takes longer to encode and has slightly less browser support (still 93%+ as of 2026). WebP is the safe default — virtually universal support, massive savings.
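The percentages above are just 1 − after/before; a quick sanity check of the numbers quoted:

```javascript
// Savings math for the conversions above: 1 - (new size / old size)
const savings = (before, after) => Math.round((1 - after / before) * 100);

console.log(savings(450, 180)); // WebP:  60% smaller
console.log(savings(450, 95));  // AVIF:  79% smaller
```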
Use the <picture> element to serve the right format without losing fallback support:
<picture>
  <source srcset="hero.avif" type="image/avif">
  <source srcset="hero.webp" type="image/webp">
  <img src="hero.jpg" alt="Hero image" width="1200" height="630"
       fetchpriority="high" decoding="async">
</picture>
Serving a 2400px image to a 375px mobile screen is a 6x waste. srcset fixes this:
<img
  src="hero-800.webp"
  srcset="
    hero-400.webp 400w,
    hero-800.webp 800w,
    hero-1200.webp 1200w,
    hero-2400.webp 2400w
  "
  sizes="
    (max-width: 600px) 100vw,
    (max-width: 1200px) 80vw,
    1200px
  "
  alt="Hero"
  width="1200"
  height="630"
  fetchpriority="high"
>
The sizes attribute tells the browser how wide the image will actually render, so it can pick the right source before downloading anything. Without sizes, it defaults to 100vw — almost always wrong.
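To make the selection concrete, here's a simplified sketch of the browser's effective logic (real browsers may also factor in cache contents and network hints): the slot width from sizes is multiplied by devicePixelRatio, then the smallest candidate at least that wide wins.

```javascript
// Simplified srcset selection: smallest candidate >= slot width × DPR
function pickSource(candidates, slotCssPx, dpr) {
  const needed = slotCssPx * dpr;
  const sorted = [...candidates].sort((a, b) => a.w - b.w);
  // Fall back to the largest candidate if none is big enough
  return sorted.find(c => c.w >= needed) || sorted[sorted.length - 1];
}

const candidates = [
  { url: 'hero-400.webp', w: 400 },
  { url: 'hero-800.webp', w: 800 },
  { url: 'hero-1200.webp', w: 1200 },
  { url: 'hero-2400.webp', w: 2400 },
];

// A 375px viewport at 100vw on a 2x screen needs 750px → hero-800 wins
console.log(pickSource(candidates, 375, 2).url);
```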
This one is easy to miss. If you don't specify width and height on an image, the browser doesn't know how much space to reserve until the image loads. Everything below it shifts down. That's your CLS score tanking.
Always include the intrinsic dimensions. CSS can still control the displayed size — the HTML attributes just give the browser the aspect ratio it needs to reserve space:
<!-- Bad: browser doesn't know how tall this will be -->
<img src="photo.webp" alt="Photo" style="width: 100%">

<!-- Good: browser reserves correct space, no layout shift -->
<img src="photo.webp" alt="Photo" width="800" height="600"
     style="width: 100%; height: auto">
Native lazy loading is supported everywhere and requires one attribute:
<img src="below-fold.webp" loading="lazy" width="800" height="400" alt="...">
The exception: your hero image, your logo, anything above the fold. Those should have loading="eager" (the default) or be explicitly preloaded. Lazy-loading the LCP element is a common mistake that tanks your score.
<!-- Preload the hero image so it starts loading immediately -->
<link rel="preload" as="image" href="hero.webp"
imagesrcset="hero-400.webp 400w, hero-800.webp 800w"
imagesizes="100vw">
JavaScript is more expensive than any other asset type. A 1MB image and a 1MB JavaScript file are not equivalent — the JS has to be parsed, compiled, and executed. On a mid-range mobile device (Moto G4, the standard benchmark), that 1MB of JS costs roughly 3–4 seconds of CPU time. The image costs a network transfer and a texture decode.
If you're using webpack, Vite, or Rollup, route-based code splitting is the first thing to enable:
// vite.config.js — automatic chunk splitting by route
export default {
  build: {
    rollupOptions: {
      output: {
        manualChunks: {
          // Vendor chunk: stuff that changes rarely
          vendor: ['react', 'react-dom'],
          // Separate heavy libraries
          charts: ['recharts'],
          editor: ['@codemirror/state', '@codemirror/view'],
        }
      }
    }
  }
}
With React Router or Next.js, use dynamic imports for route components so users only download the code for pages they visit:
// Instead of:
import ProductPage from './ProductPage'

// Use:
const ProductPage = React.lazy(() => import('./ProductPage'))

// Wrap with Suspense
<Suspense fallback={<Loading />}>
  <ProductPage />
</Suspense>
Tree shaking removes code that's imported but never actually called. It requires ES modules (not CommonJS) to work properly. The trap: many libraries still ship CommonJS, which defeats tree shaking entirely.
# Check what's actually in your bundle
npx bundle-buddy ./dist/assets/*.js
# Or use source-map-explorer
npm install -g source-map-explorer
source-map-explorer dist/main.js
The visual treemap shows you where your bundle bytes are coming from. Lodash imported as import _ from 'lodash' drags in all 70KB even if you only use _.debounce. Fix: import debounce from 'lodash/debounce' or switch to lodash-es.
How you load scripts matters enormously for Time to Interactive:
<!-- Blocks HTML parsing. Never do this for non-critical scripts. -->
<script src="app.js"></script>
<!-- Downloads in parallel, executes when ready (order not guaranteed) -->
<script async src="analytics.js"></script>
<!-- Downloads in parallel, executes after HTML is parsed (order preserved) -->
<script defer src="app.js"></script>
<!-- ES modules: implicitly deferred, always strict mode -->
<script type="module" src="app.mjs"></script>
Use defer for your application code. Use async for analytics and ads (they don't need the DOM and don't affect each other). Never use bare <script> tags in the <head> for external files.
# depcheck scans your code and tells you what's imported vs what's in package.json
npx depcheck
# Sample output:
# Unused dependencies
# * moment (you're using date-fns now, remove this)
# * lodash (only used in one file, inline it)
# * react-tooltip (removed from UI, forgot to uninstall)
Every unused dependency is dead weight in your node_modules and potentially in your bundle. Run depcheck before any major performance audit.
CSS blocks rendering. The browser can't paint anything until all CSS is loaded. For the above-the-fold content, you want the critical CSS inlined in <head> so the browser can render without waiting for a stylesheet download.
# Generate critical CSS for your main page
npx critical https://yoursite.com \
  --width 1300 \
  --height 900 \
  --inline \
  --base dist/ \
  dist/index.html
The output inlines the minimum CSS needed for above-the-fold rendering, then loads the full stylesheet non-blocking. This is one of the highest-leverage LCP improvements you can make.
For the rest of your stylesheet, load it non-blocking:
<!-- Non-blocking stylesheet load -->
<link rel="preload" href="/styles.css" as="style"
onload="this.onload=null;this.rel='stylesheet'">
<noscript><link rel="stylesheet" href="/styles.css"></noscript>
Most projects have significant CSS dead weight — rules for components that were deleted, utility classes that are no longer used, entire vendor themes that ship thousands of selectors. PurgeCSS analyzes your HTML/JS and removes any CSS selector that doesn't appear:
# Install
npm install -D purgecss
# Run against your built files
npx purgecss \
  --css dist/styles.css \
  --content "dist/**/*.html" "dist/**/*.js" \
  --output dist/styles.purged.css
If you're using Tailwind CSS, PurgeCSS is built in — Tailwind's JIT mode only generates the classes your templates actually use. A fresh Tailwind project generates under 10KB of CSS. A misconfigured one can ship all 3MB of utility classes.
This is subtle but impactful. Animating properties that trigger layout (width, height, margin, padding, top, left) forces the browser to recalculate the entire document layout on every frame. This runs on the main thread and competes with your JavaScript.
Animating transform and opacity runs on the compositor thread — completely separate from the main thread, hardware-accelerated, silky smooth even under load:
/* Bad: animating `left` triggers layout recalc every frame */
.slide-in-bad {
  animation: slide-bad 0.3s ease;
}
@keyframes slide-bad {
  from { left: -100px; }
  to { left: 0; }
}

/* Good: compositor-only, no layout impact */
.slide-in {
  animation: slide 0.3s ease;
}
@keyframes slide {
  from { transform: translateX(-100px); }
  to { transform: translateX(0); }
}
Add will-change: transform to elements that will animate, but use it sparingly — it forces the browser to create a separate compositor layer for the element, which costs GPU memory.
Without font-display, most browsers show invisible text while waiting for the font to load (FOIT — Flash of Invisible Text). With font-display: swap, the browser shows the fallback font immediately and swaps in the web font when it arrives:
@font-face {
  font-family: 'Inter';
  src: url('/fonts/inter-regular.woff2') format('woff2');
  font-weight: 400;
  font-style: normal;
  font-display: swap; /* Show fallback immediately, swap when ready */
}
If you're using Google Fonts or another external font service, preconnect eliminates the DNS lookup + TLS handshake delay before the browser can even start downloading the font:
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600;700&display=swap"
rel="stylesheet">
Note the crossorigin attribute on the gstatic preconnect — required because fonts are loaded with CORS.
Self-hosting fonts eliminates the external DNS lookup entirely and gives you control over caching headers. The tradeoff: you lose the (now mostly mythical) shared cache benefit — browsers stopped sharing cross-origin caches in 2020 for privacy reasons.
Self-hosting is generally the right call for performance. Use the google-webfonts-helper tool (now at gwfh.mranftl.com after the Heroku instance shut down) to download the woff2 files directly.
A full Inter font file is 300KB. If your site is English-only, you only need the Latin character set — around 30KB. Subset with pyftsubset:
# Install fonttools
pip install fonttools brotli
# Subset to Latin characters only
pyftsubset inter-regular.ttf \
  --output-file=inter-latin.woff2 \
  --flavor=woff2 \
  --layout-features='' \
  --unicodes="U+0020-007E,U+00A0-00FF"
Note that --layout-features='' also strips kerning and ligatures — drop that flag if you want to keep them. For Google Fonts, you can pass &text= to get a subset URL, but the proper approach is to download and self-host your subset.
The right caching strategy depends on whether your asset filenames are hashed or stable:
# Hashed assets (app.a3f9b2c1.js, styles.8d4e1f20.css)
# These can be cached forever — the filename changes when content changes
Cache-Control: public, max-age=31536000, immutable
# HTML files — cache, but always revalidate with the server before use
Cache-Control: no-cache
# APIs — usually no caching or short TTLs
Cache-Control: no-store
# Versioned but stable path assets (images referenced by URL)
# Cache for a week, allow revalidation
Cache-Control: public, max-age=604800, stale-while-revalidate=86400
The immutable directive tells the browser "this file will never change at this URL, don't even bother checking." It eliminates the conditional GET request on repeat visits. Only use it for content-hashed assets.
# /etc/nginx/sites-available/yoursite
server {
    listen 443 ssl http2;
    server_name yoursite.com;
    root /var/www/yoursite;

    # HTML: always revalidate
    location ~* \.html$ {
        add_header Cache-Control "no-cache";
    }

    # Hashed filenames (common webpack/vite output): cache forever.
    # Note: nginx location blocks never see the query string, so ?v=
    # cache-busting can't be matched here — hash the filename instead.
    location ~* \.[0-9a-f]{8,}\.(js|css|woff2)$ {
        add_header Cache-Control "public, max-age=31536000, immutable";
    }

    # Images: cache for a week
    location ~* \.(webp|avif|jpg|jpeg|png|gif|svg|ico)$ {
        add_header Cache-Control "public, max-age=604800, stale-while-revalidate=86400";
    }
}
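It's worth sanity-checking that your build output actually matches the hashed-filename pattern; here's the same regex translated to JavaScript so you can test filenames directly:

```javascript
// Same pattern as the nginx hashed-asset location block, as a JS regex
const hashedAsset = /\.[0-9a-f]{8,}\.(js|css|woff2)$/;

console.log(hashedAsset.test('app.a3f9b2c1.js'));     // true  — gets immutable caching
console.log(hashedAsset.test('styles.8d4e1f20.css')); // true
console.log(hashedAsset.test('app.js'));              // false — falls through to other rules
```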
You can verify the headers are actually being sent with curl — request any asset URL and check the cache-control line:
curl -sI https://yoursite.com/app.a3f9b2c1.js | grep -i cache-control
For progressive web apps or any site that benefits from offline access, a service worker gives you a programmable cache. The Cache API lets you cache responses on first visit and serve them instantly on repeat visits — even offline:
// sw.js — minimal service worker for static assets
const CACHE_NAME = 'v1';
const STATIC_ASSETS = ['/', '/styles.css', '/app.js'];

self.addEventListener('install', event => {
  event.waitUntil(
    caches.open(CACHE_NAME).then(cache => cache.addAll(STATIC_ASSETS))
  );
});

// Drop caches left over from previous versions when a new one activates
self.addEventListener('activate', event => {
  event.waitUntil(
    caches.keys().then(keys =>
      Promise.all(keys.filter(k => k !== CACHE_NAME).map(k => caches.delete(k)))
    )
  );
});

self.addEventListener('fetch', event => {
  if (event.request.method !== 'GET') return; // only cache GET requests
  event.respondWith(
    caches.match(event.request).then(cached => {
      // Serve from cache, fetch in background to update
      const networkFetch = fetch(event.request).then(response => {
        const clone = response.clone();
        caches.open(CACHE_NAME).then(cache => cache.put(event.request, clone));
        return response;
      });
      return cached || networkFetch;
    })
  );
});
HTTP/2 multiplexes requests over a single connection — eliminating the "6 parallel requests per domain" limit of HTTP/1.1. HTTP/3 uses QUIC instead of TCP, which dramatically reduces latency on lossy connections (mobile networks). Both are enabled at the nginx/load-balancer level:
# nginx.conf — enable HTTP/2 and HTTP/3 (HTTP/3 requires nginx 1.25+)
server {
    listen 443 ssl http2;        # HTTP/2
    listen 443 quic reuseport;   # HTTP/3 (QUIC)
    http3 on;

    # Advertise HTTP/3 support
    add_header Alt-Svc 'h3=":443"; ma=86400';

    ssl_certificate /etc/letsencrypt/live/yoursite.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yoursite.com/privkey.pem;
}
Brotli consistently compresses 15-25% better than gzip for text assets (HTML, CSS, JS). It's supported by all modern browsers. On nginx:
# Install the brotli module (Debian/Ubuntu package names; they vary by distro)
apt install libnginx-mod-http-brotli-filter libnginx-mod-http-brotli-static

# nginx.conf
brotli on;
brotli_comp_level 6;  # 1-11, 6 is the sweet spot for speed vs ratio
# text/html is compressed by default and shouldn't be listed again
brotli_types
    text/css
    text/javascript
    application/javascript
    application/json
    application/xml
    image/svg+xml
    font/woff2;

# Keep gzip as fallback for older browsers
gzip on;
gzip_vary on;
gzip_comp_level 6;
gzip_types text/plain text/css application/javascript application/json;
Check whether your server is actually compressing with:
curl -H "Accept-Encoding: br" -I https://yoursite.com/styles.css | grep content-encoding
# Should return: content-encoding: br
Time to First Byte is a server-side metric. If your TTFB is above 600ms, look at slow database queries, missing server-side caching, application cold starts, and plain geographic distance between your users and the origin.
A CDN (Content Delivery Network) caches your static assets at edge locations around the world. A user in Tokyo gets your CSS from a Tokyo edge node, not from your Oregon origin server. For purely static sites, a CDN with full-site caching can get TTFB under 50ms globally.
Cloudflare's free tier is a reasonable starting point. For more control, consider Bunny CDN or Fastly. The main configuration task: make sure your Cache-Control headers are set correctly so the CDN actually caches your responses.
Run through these in order — the first items are highest impact for the least effort: convert and resize images (WebP/AVIF, srcset, explicit dimensions), lazy-load below-the-fold media, code-split and defer JavaScript, drop unused dependencies, inline critical CSS and purge the rest, subset fonts with font-display: swap, then set long-lived caching headers and enable Brotli and HTTP/2 or HTTP/3.
Automate performance testing so regressions get caught before they ship:
# .github/workflows/lighthouse.yml
name: Lighthouse CI
on: [pull_request]
jobs:
  lighthouse:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Build
        run: npm run build
      - name: Start server
        # Assumes `npm start` serves the build on port 3000
        run: npm start & npx wait-on http://localhost:3000
      - name: Run Lighthouse CI
        uses: treosh/lighthouse-ci-action@v11
        with:
          urls: |
            http://localhost:3000/
            http://localhost:3000/product/
          budgetPath: ./lighthouse-budget.json
          uploadArtifacts: true
// lighthouse-budget.json — fail the build if these thresholds aren't met
[
  {
    "path": "/*",
    "timings": [
      { "metric": "largest-contentful-paint", "budget": 2500 },
      { "metric": "cumulative-layout-shift", "budget": 0.1 },
      { "metric": "interactive", "budget": 3500 }
    ],
    "resourceSizes": [
      { "resourceType": "total", "budget": 1000 },
      { "resourceType": "script", "budget": 300 },
      { "resourceType": "image", "budget": 500 }
    ]
  }
]
Before adding any npm package, paste it into bundlephobia.com. It shows the minified + gzipped bundle size, the download time on 3G, and whether the package supports tree shaking. Make this a mandatory step in your dependency review process.
WebPageTest has an API you can integrate into CI for real-browser testing from real locations:
curl "https://www.webpagetest.org/runtest.php\
?url=https://yoursite.com\
&k=YOUR_API_KEY\
&f=json\
&location=Dulles:Chrome\
&runs=3\
&video=1" | jq '.data.testId'
Lighthouse measures lab conditions. Real User Monitoring (RUM) captures what actual visitors experience — their device, their connection, their browser extensions:
import { onCLS, onINP, onLCP } from 'web-vitals';

function sendToAnalytics(metric) {
  // Send to your analytics endpoint
  navigator.sendBeacon('/analytics', JSON.stringify({
    name: metric.name,
    value: metric.value,
    rating: metric.rating, // 'good', 'needs-improvement', or 'poor'
    id: metric.id,
    url: location.href,
  }));
}

onCLS(sendToAnalytics);
onINP(sendToAnalytics);
onLCP(sendToAnalytics);
This gives you a distribution of real-world CWV scores across your actual users. A p75 LCP of 4.2s is much more actionable than a single Lighthouse score. You can also use GoatCounter's event API if you want a lightweight solution without a separate analytics service.
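On the collection side, p75 is the number to track — it's the percentile Google's CrUX dataset uses. A small illustrative sketch (nearest-rank method; the sample values are made up) for computing it from beacons you've stored:

```javascript
// Percentile over collected metric values, nearest-rank method
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// e.g. LCP samples in milliseconds collected at your /analytics endpoint
const lcpSamples = [1800, 2100, 2300, 2600, 3100, 4200, 1500, 2000];
console.log(`p75 LCP: ${percentile(lcpSamples, 75)}ms`); // → 2600ms
```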