Short videos on websites are not difficult to get right, but most implementations get them wrong in the same few ways. They load everything at once. They fight over bandwidth. They stall the one video the user is actually watching. This post covers the complete approach — compression, encoding, and scroll behaviour — and it is simpler than most developers expect.
Step 1: Compress First
Before any encoding decisions, compress the video as aggressively as the quality allows. Tools like HandBrake or dedicated compression utilities can reduce file sizes dramatically without visible quality loss, depending on the source. A 50 MB recording often becomes a 5–10 MB file that looks identical in a browser at typical viewing sizes.
This step matters more than anything else in this post. A well-compressed 5 MB file with no other optimisations will outperform a poorly-compressed 50 MB file with every optimisation applied.
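To put rough numbers on that claim, here is a back-of-the-envelope sketch of transfer time alone. The 10 Mbit/s link speed is an assumed figure for illustration and the function name is hypothetical. Real load times also depend on latency and on how much the browser chooses to buffer.

```javascript
// Rough transfer time for a file of a given size on a given link.
// Ignores latency, connection ramp-up, and the fact that playback
// can begin before the download finishes.
function transferSeconds(sizeMegabytes, linkMegabitsPerSecond) {
  const megabits = sizeMegabytes * 8;
  return megabits / linkMegabitsPerSecond;
}

const linkMbps = 10; // assumed connection speed
console.log(transferSeconds(50, linkMbps)); // 40
console.log(transferSeconds(5, linkMbps));  // 4
```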
Step 2: Remux with Faststart
Once compressed, run the video through FFmpeg to move the metadata to the front of the file. This is called faststart, and it is the difference between a video that can start playing almost immediately and one that cannot show a single frame until the browser has fetched metadata from the very end of the file.
ffmpeg -i compressed.mp4 -codec:v copy -codec:a copy -movflags +faststart output.mp4
The -codec:v copy and -codec:a copy flags remux the container only. They do not
re-encode the video or the audio, so no quality is lost. The only change is the
position of the moov atom, a small metadata block that tells the browser how to
decode the file. By default, encoders write this block at the end because it can only
be finalised after encoding completes. Faststart moves it to the front in a second pass.
Without faststart, the browser cannot start decoding until it has the moov atom, which means either downloading the whole file or issuing an extra range request for the end of it. With faststart, playback can begin as soon as the first chunk arrives over the network.
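One way to confirm the remux worked is to check the order of the top-level boxes in the file. The following is a minimal sketch, not part of the original workflow: the function name isFaststart is hypothetical, and it assumes a plain (non-fragmented) MP4 whose top-level box headers are present in the bytes you pass in.

```javascript
// Walk the top-level boxes of an MP4 and report whether moov precedes mdat.
// `bytes` is a Uint8Array holding the file (or at least its leading boxes).
// Returns true, false, or null if neither box was found in the bytes given.
function isFaststart(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    let size = view.getUint32(offset); // box sizes are big-endian
    const type = String.fromCharCode(
      bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]
    );
    if (type === 'moov') return true;  // metadata comes first
    if (type === 'mdat') return false; // media data comes first
    if (size === 1) {
      // A size of 1 means the real size is a 64-bit value after the type
      if (offset + 16 > bytes.byteLength) break;
      size = Number(view.getBigUint64(offset + 8));
    }
    if (size < 8) break; // 0 means "runs to end of file"; other values are malformed
    offset += size;
  }
  return null;
}
```

In Node, passing the result of fs.readFileSync('output.mp4') should work, since a Buffer is a Uint8Array.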
Step 3: Write the HTML Correctly
Two attributes do most of the work natively, without any JavaScript:
<video
  src="output.mp4"
  loading="lazy"
  preload="none"
  muted
  playsinline
  loop
  width="640"
  height="360"
></video>
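loading="lazy" asks the browser not to start downloading the video until
it is near the viewport. preload="none" reinforces this: do not buffer
anything until playback is requested. Support for loading="lazy" on video elements is
newer and less uniform than on images and iframes, and a browser that does not
recognise the attribute simply ignores it, so preload="none" is what guarantees that
videos below the fold use no bandwidth until they are played.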
Set width and height explicitly. They let the browser reserve the element's box before any video data arrives, which keeps its position in the layout stable for lazy loading and viewport checks, and prevents the page from shifting when the video loads.
Step 4: Wire Up Scroll Behaviour
Lazy loading handles the initial deferral. For autoplay and pause on scroll, add an IntersectionObserver:
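const observer = new IntersectionObserver((entries) => {
  entries.forEach(entry => {
    if (entry.isIntersecting) {
      // play() returns a promise that rejects if autoplay is blocked; ignore that case
      entry.target.play().catch(() => {});
    } else {
      entry.target.pause();
    }
  });
}, { threshold: 0.5 }); // react when visibility crosses 50%
// Observe every lazily loaded video on the page
document.querySelectorAll('video[loading="lazy"]').forEach(v => observer.observe(v));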
When a video enters the viewport it plays. When it leaves, it pauses. The download continues in the background — the browser will buffer ahead naturally, and when the user scrolls back the video resumes from where it stopped without re-downloading anything.
What About Stopping the Download Entirely?
You might wonder whether pausing the download as well as the playback would save bandwidth. The short answer is: it is more complicated than it is worth.
The only way to stop a download mid-flight on a native video element is to clear the source entirely:
video.src = '';
video.load();
This aborts the request, confirmed by a net::ERR_ABORTED in the network
tab. However it also destroys the element's state. Whether the partially-downloaded
bytes are preserved in cache depends entirely on the server's Cache-Control
headers. With correct headers, Chrome caches the partial download and resumes from the
right byte offset on the next request. Without them, it re-downloads from byte zero
every time — which is strictly worse than just pausing.
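To show how much bookkeeping this involves, here is a minimal sketch of suspending and later resuming a download while keeping the playback position. The function names and the WeakMap are hypothetical, and the sketch uses removeAttribute('src') followed by load(), a common variant of the assignment above, rather than the exact snippet shown.

```javascript
// Remember where each suspended video was so it can be restored later.
const suspended = new WeakMap();

// Abort the in-flight download. This resets the element, so state is saved first.
function suspendDownload(video) {
  if (suspended.has(video)) return;
  suspended.set(video, {
    src: video.currentSrc || video.src,
    time: video.currentTime,
  });
  video.pause();
  video.removeAttribute('src'); // variant of `video.src = ''`
  video.load();                 // aborts the pending request and resets the element
}

// Restore the source and seek back once metadata is available again.
function resumeDownload(video) {
  const state = suspended.get(video);
  if (!state) return;
  suspended.delete(video);
  video.addEventListener('loadedmetadata', () => {
    video.currentTime = state.time;
  }, { once: true });
  video.src = state.src; // assigning src starts a fresh load
}
```

Even in this small form, the code has to save and restore state that pause() preserves for free, and whether the resumed request reuses cached bytes is still up to the server and browser.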
For most use cases, pause() on scroll-out is sufficient. The browser
throttles background downloads naturally once it has buffered a few seconds ahead, so
the bandwidth impact of an off-screen paused video is minimal in practice.
What About HLS?
HLS is the right choice for live streams, adaptive bitrate, and DRM. For pre-compressed short videos served from a CDN, it adds complexity without benefit.
Before a single frame plays, HLS requires fetching and parsing a .m3u8
manifest, then fetching the first segment — two serial network round trips. A faststart
MP4 needs one: the browser reads the metadata from the front of the file and starts
decoding immediately. On a slow connection that gap is 600ms or more. On a fast CDN
it is 50–200ms, which compounds across every video on the page.
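The arithmetic behind those figures can be sketched as follows. This is a simplification that counts only serial round trips and ignores transfer time, parsing, and connection setup. The round-trip times are assumed values chosen to match the ranges above, and the function name is hypothetical.

```javascript
// Time before the first media bytes can arrive, counting only serial round trips.
function startupDelayMs(serialRoundTrips, roundTripMs) {
  return serialRoundTrips * roundTripMs;
}

const HLS_ROUND_TRIPS = 2; // manifest, then first segment
const MP4_ROUND_TRIPS = 1; // faststart file with metadata at the front

for (const rtt of [50, 200, 600]) { // assumed round-trip times in ms
  const gap = startupDelayMs(HLS_ROUND_TRIPS, rtt) - startupDelayMs(MP4_ROUND_TRIPS, rtt);
  console.log(`RTT ${rtt} ms: HLS starts about ${gap} ms later`);
}
```

Under this model the gap is exactly one extra round trip per video, which is why it grows with connection latency.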
Cache Headers Matter
One important deployment detail: make sure your server or CDN sends correct cache headers for MP4 files. Without them, every page load re-downloads every video from scratch, even if the user has visited before.
On Cloudflare Pages, static assets are cached automatically. On other platforms, set at minimum:
Cache-Control: public, max-age=31536000, immutable
If your video URLs include a hash or version parameter (e.g. output.abc123.mp4),
immutable tells the browser never to revalidate — the content will never
change at that URL. If not, use no-cache instead, which allows caching
but requires revalidation on each request.
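As a sketch of that rule, the helper below picks a header value from the file name. The naming convention it checks for (a dot, six or more hex characters, then .mp4) is an assumption for illustration, and cacheControlFor is a hypothetical name. Adapt the pattern to however your build tool fingerprints files.

```javascript
// Assumed convention: a fingerprinted file looks like name.<6+ hex chars>.mp4
const HASHED_NAME = /\.[0-9a-f]{6,}\.mp4$/i;

// Choose a Cache-Control value: long-lived for hashed URLs, revalidate otherwise.
function cacheControlFor(path) {
  return HASHED_NAME.test(path)
    ? 'public, max-age=31536000, immutable'
    : 'no-cache';
}

console.log(cacheControlFor('/media/output.abc123.mp4')); // public, max-age=31536000, immutable
console.log(cacheControlFor('/media/output.mp4'));        // no-cache
```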
One more note: during local development, avoid combining a no-store or
no-cache server with DevTools network throttling. You will see artificially
slow video loads: with no-store the browser cannot cache anything, and with no-cache it
must go back to the server before reusing what it has, so every request pays the
throttled network cost. This does not reflect production behaviour.
The Complete Picture
A video that is ready for the web:
- Compressed to the smallest acceptable file size before encoding
- Remuxed with -movflags +faststart so playback starts on the first chunk
- Served with loading="lazy" and preload="none" so off-screen videos use no bandwidth
- Played and paused by an IntersectionObserver tied to viewport visibility
- Cached by the CDN so repeat visits are instant
That is the whole stack. No libraries, no adaptive streaming, no complex download management. The browser handles the rest.
Install This As A Skill
If you want this exact workflow available inside your coding agent, install the companion skill bundle for Claude Code or Codex.
Claude Code
- Download CLAUDE.md and setup-video-streaming.md.
- Copy CLAUDE.md into your project root, or append it to your existing CLAUDE.md.
- Place setup-video-streaming.md in .claude/commands/.
- Invoke it with /setup-video-streaming.
your-project/
├── CLAUDE.md
└── .claude/
    └── commands/
        └── setup-video-streaming.md
Codex
- Download SKILL.md and openai.yaml.
- Create ~/.codex/skills/setup-video-streaming/.
- Place SKILL.md in that folder and openai.yaml in agents/.
- Invoke it with $setup-video-streaming.
~/.codex/skills/setup-video-streaming/
├── SKILL.md
└── agents/
    └── openai.yaml