Short videos on websites are not difficult to get right, but most implementations get them wrong in the same few ways. They load everything at once. They fight over bandwidth. They stall the one video the user is actually watching. This post covers the complete approach — compression, encoding, and scroll behaviour — and it is simpler than most developers expect.

Step 1: Compress First

Before any encoding decisions, compress the video as aggressively as the quality allows. Tools like HandBrake or dedicated compression utilities can reduce file sizes dramatically without visible quality loss, depending on the source. A 50 MB recording often becomes a 5–10 MB file that looks identical in a browser at typical viewing sizes.

This step matters more than anything else in this post. A well-compressed 5 MB file with no other optimisations will outperform a poorly-compressed 50 MB file with every optimisation applied.
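A back-of-envelope calculation shows why file size dominates. The sketch below is a rough model only: it ignores latency, slow start, and protocol overhead, and the 10 Mbps link speed is a hypothetical figure chosen for illustration.

```javascript
// Rough transfer time in seconds for a file of `megabytes` over a link of `mbps`.
// Ignores latency, TCP slow start, and protocol overhead.
function transferSeconds(megabytes, mbps) {
  const megabits = megabytes * 8; // 1 byte = 8 bits
  return megabits / mbps;
}

const linkMbps = 10; // hypothetical connection speed
console.log(transferSeconds(50, linkMbps)); // 40
console.log(transferSeconds(5, linkMbps));  // 4
```

On that assumed link, the uncompressed recording needs ten times as long to arrive in full, and no attribute or header changes that ratio.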

Step 2: Remux with Faststart

Once compressed, run the video through FFmpeg to move the metadata to the front of the file. This is called faststart, and it is the difference between a video that can start playing almost immediately and one that has to fetch metadata from the end of the file before showing a single frame.

ffmpeg -i compressed.mp4 -codec:v copy -codec:a copy -movflags +faststart output.mp4

The -codec:v copy and -codec:a copy flags copy the video and audio streams unchanged, so only the container is rewritten and nothing is re-encoded. No quality is lost. The only change is the position of the moov atom, a small metadata block that tells the browser where each frame lives in the file and how to decode it. By default, encoders write this block at the end because it can only be finalised after encoding completes. Faststart moves it to the front in a second pass.

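Without faststart, the browser cannot start decoding until it has located the moov atom at the end of the file. On servers that support range requests this costs extra round trips; on servers that do not, it means downloading the entire file first. With faststart, playback can begin as soon as the first chunk arrives over the network.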
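To confirm the remux worked, you can inspect the order of the top-level boxes in the file. The following is a minimal sketch, not a full MP4 parser: the function names listTopLevelBoxes and isFaststart are made up for this post, and it only reads box headers (a 4-byte big-endian size followed by a 4-byte type).

```javascript
// Walk the top-level boxes of an MP4 and return them in file order.
// Each box starts with a 4-byte big-endian size followed by a 4-byte ASCII type.
function listTopLevelBoxes(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    let size = view.getUint32(offset);
    const type = String.fromCharCode(
      bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]
    );
    let headerSize = 8;
    if (size === 1) {
      // A size of 1 means a 64-bit size follows the type field.
      if (offset + 16 > bytes.byteLength) break;
      size = Number(view.getBigUint64(offset + 8));
      headerSize = 16;
    } else if (size === 0) {
      // A size of 0 means the box runs to the end of the file.
      size = bytes.byteLength - offset;
    }
    if (size < headerSize) break; // malformed box: stop rather than loop forever
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}

// True when a moov box exists and appears before any mdat box.
function isFaststart(bytes) {
  const types = listTopLevelBoxes(bytes).map((box) => box.type);
  const moov = types.indexOf('moov');
  const mdat = types.indexOf('mdat');
  return moov !== -1 && (mdat === -1 || moov < mdat);
}
```

In Node, passing the result of fs.readFileSync('output.mp4') should work, since a Buffer is a Uint8Array. A file that reports false still has its moov atom behind the media data and needs the FFmpeg step above.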

Step 3: Write the HTML Correctly

Two attributes do most of the work natively, without any JavaScript:

<video
  src="output.mp4"
  loading="lazy"
  preload="none"
  muted
  playsinline
  loop
  width="640"
  height="360"
></video>

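loading="lazy" asks the browser not to start downloading the video until it is near the viewport. The attribute is standard on img and iframe elements; on video, support varies by browser, and browsers that do not recognise it ignore it. preload="none" is the widely supported part: it hints that nothing should be buffered until playback is requested. Together they keep videos below the fold from using bandwidth until the user scrolls toward them.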

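Set width and height as well. They let the browser reserve the element's box before any video data arrives, which prevents layout shift and gives both lazy loading and the IntersectionObserver in Step 4 a stable position to measure against. Without dimensions, a video with preload="none" typically renders at the default 300×150 size until its metadata loads.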
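If you want to know whether a given browser acts on loading="lazy" for video, a feature check is one option. This is a sketch under an assumption: that a browser implementing the attribute exposes it as a loading property on the video element's prototype chain, as browsers do for img. The function name supportsVideoLazyLoading is hypothetical.

```javascript
// Returns true when the given constructor's prototype chain exposes a `loading` property.
// `VideoCtor` defaults to the browser's HTMLVideoElement, which is undefined outside a browser.
function supportsVideoLazyLoading(VideoCtor = globalThis.HTMLVideoElement) {
  return typeof VideoCtor === 'function' && 'loading' in VideoCtor.prototype;
}

console.log(supportsVideoLazyLoading());
```

Where this returns false, the attribute is inert, and deferral rests on preload="none" plus the scroll handling in the next step.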

Step 4: Wire Up Scroll Behaviour

Lazy loading handles the initial deferral. For autoplay and pause on scroll, add an IntersectionObserver:

const observer = new IntersectionObserver((entries) => {
  entries.forEach(entry => {
    if (entry.isIntersecting) {
      entry.target.play().catch(() => {});
    } else {
      entry.target.pause();
    }
  });
}, { threshold: 0.5 });

document.querySelectorAll('video[loading="lazy"]').forEach(v => observer.observe(v));

When a video enters the viewport it plays. When it leaves, it pauses. The download continues in the background — the browser will buffer ahead naturally, and when the user scrolls back the video resumes from where it stopped without re-downloading anything.

What About Stopping the Download Entirely?

You might wonder whether pausing the download as well as the playback would save bandwidth. The short answer is: it is more complicated than it is worth.

The only way to stop a download mid-flight on a native video element is to clear the source entirely:

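// Remove the attribute rather than assigning '', which the browser treats as an invalid source.
video.removeAttribute('src');
video.load();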

This aborts the request, confirmed by a net::ERR_ABORTED in the network tab. However it also destroys the element's state. Whether the partially-downloaded bytes are preserved in cache depends entirely on the server's Cache-Control headers. With correct headers, Chrome caches the partial download and resumes from the right byte offset on the next request. Without them, it re-downloads from byte zero every time — which is strictly worse than just pausing.
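To give a sense of the extra bookkeeping involved, here is a minimal sketch of unloading and later restoring a video. The helper names unloadVideo and restoreVideo are hypothetical, and the code assumes the source is set through the src attribute rather than child source elements.

```javascript
// Remembers the source and playback position of each unloaded video.
const saved = new WeakMap();

// Abort the download by removing the source. `video` is expected to behave
// like an HTMLVideoElement.
function unloadVideo(video) {
  saved.set(video, { src: video.getAttribute('src'), time: video.currentTime });
  video.pause();
  video.removeAttribute('src');
  video.load(); // resets the element and cancels the in-flight request
}

// Put the source back and ask the browser to resume from the saved position.
// Returns false when there is nothing to restore.
function restoreVideo(video) {
  const state = saved.get(video);
  if (!state || !state.src) return false;
  video.setAttribute('src', state.src); // setting src starts a fresh load
  video.currentTime = state.time;       // applied once metadata is available
  saved.delete(video);
  return true;
}
```

Even with this in place, whether the restored video re-downloads bytes it already had still comes down to the cache behaviour described above.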

For most use cases, pause() on scroll-out is sufficient. The browser throttles background downloads naturally once it has buffered a few seconds ahead, so the bandwidth impact of an off-screen paused video is minimal in practice.

What About HLS?

HLS is the right choice for live streams, adaptive bitrate, and DRM. For pre-compressed short videos served from a CDN, it adds complexity without benefit.

Before a single frame plays, HLS requires fetching and parsing a .m3u8 manifest, then fetching the first segment: at least two serial network round trips, and three when a master playlist points to per-bitrate playlists. A faststart MP4 needs one: the browser reads the metadata from the front of the file and starts decoding immediately. On a slow connection that gap can be 600ms or more. On a fast CDN it is closer to 50–200ms, but it is paid by every video on the page.
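The arithmetic behind that gap is simple. As a rough model, and only that, startup delay is one round trip per serial request; transfer time, TLS setup, and parsing are ignored, and the 600ms round-trip time below is an assumed value for a slow connection.

```javascript
// Rough time to first frame: one round trip per serial request.
function startupDelayMs(rttMs, serialRequests) {
  return rttMs * serialRequests;
}

const rttMs = 600; // assumed round-trip time on a slow connection
const hlsDelay = startupDelayMs(rttMs, 2); // manifest, then first segment
const mp4Delay = startupDelayMs(rttMs, 1); // faststart MP4
console.log(hlsDelay - mp4Delay); // 600
```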

Cache Headers Matter

One important deployment detail: make sure your server or CDN sends correct cache headers for MP4 files. Without them, every page load re-downloads every video from scratch, even if the user has visited before.

On Cloudflare Pages, static assets are cached automatically. On other platforms, set at minimum:

Cache-Control: public, max-age=31536000, immutable

If your video URLs include a content hash or version in the filename (e.g. output.abc123.mp4), immutable tells the browser not to revalidate while the response is fresh, because the content at that URL will never change. If not, use no-cache instead, which allows caching but requires revalidation on each request.
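That rule can be expressed as a small helper for whatever server or build step sets your headers. This is a sketch under assumptions: the function name cacheControlFor is hypothetical, and the pattern treats a filename as fingerprinted when a hex-like segment of six or more characters sits before the extension. Adjust it to match how your build tool names files.

```javascript
// Matches names such as output.abc123.mp4. The pattern is an assumption about
// how fingerprinted files are named, not a standard.
const FINGERPRINT = /\.[0-9a-f]{6,}\.[a-z0-9]+$/i;

// Pick a Cache-Control value based on whether the path looks fingerprinted.
function cacheControlFor(path) {
  return FINGERPRINT.test(path)
    ? 'public, max-age=31536000, immutable'
    : 'no-cache';
}

console.log(cacheControlFor('/videos/output.abc123.mp4')); // public, max-age=31536000, immutable
console.log(cacheControlFor('/videos/output.mp4'));        // no-cache
```

A version carried in the query string would not match this pattern and would need its own check.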

One more note: during local development, avoid combining a no-store or no-cache server with DevTools network throttling. You will see artificially slow video loads: with no-store the browser cannot cache anything and must re-download the full file on every request, and with no-cache every request waits on a throttled revalidation round trip. This does not reflect production behaviour.

The Complete Picture

A video that is ready for the web:

  • Compressed to the smallest acceptable file size before remuxing
  • Remuxed with -movflags +faststart so playback starts on the first chunk
  • Served with loading="lazy" and preload="none" so off-screen videos use no bandwidth
  • Played and paused by an IntersectionObserver tied to viewport visibility
  • Cached by the CDN so repeat visits are instant

That is the whole stack. No libraries, no adaptive streaming, no complex download management. The browser handles the rest.

Install This As A Skill

If you want this exact workflow available inside your coding agent, install the companion skill bundle for Claude Code or Codex.

Claude Code

  1. Download CLAUDE.md and setup-video-streaming.md.
  2. Copy CLAUDE.md into your project root, or append it to your existing CLAUDE.md.
  3. Place setup-video-streaming.md in .claude/commands/.
  4. Invoke it with /setup-video-streaming.

your-project/
├── CLAUDE.md
└── .claude/
    └── commands/
        └── setup-video-streaming.md

Codex

  1. Download SKILL.md and openai.yaml.
  2. Create ~/.codex/skills/setup-video-streaming/.
  3. Place SKILL.md in that folder and openai.yaml in agents/.
  4. Invoke it with $setup-video-streaming.

~/.codex/skills/setup-video-streaming/
├── SKILL.md
└── agents/
    └── openai.yaml