100% free & open source: no account, no subscription.

AI semantic video editor · open source · 100% Free

Edit by listening,
not by scrubbing.

Cadence Lab drops your OBS recording into an AI pipeline: it transcribes the audio, asks Claude to judge every pause and filler word in context, and renders a tight, YouTube-ready cut. Then you just ask it to refine the edit.

  • 100% free: no subscription, ever
  • Open source: MIT licensed on GitHub
  • Local-first: your footage stays put

Free & open source · run it from source on macOS & Linux · you bring your own API keys (~$0.60–$2 per 30-minute video)

Powered by: Whisper large-v3 / Claude Opus / FFmpeg / CLIP / DeepFilterNet / Tauri

Why it's different

Most “auto-edit” tools are just a regex over the waveform.

They cut anything below −30 dB and call it done — butchering intentional pauses, flattening breaths into something robotic, and keeping both takes when you flub a line. Cadence Lab reads the transcript, not just the volume, so it edits like a human who was actually paying attention.

Features

Editing decisions made by something that understands the words.

01

Pauses, classified seven ways

Every gap is labeled filler, hesitation, breath, emphasis, pre-laughter, transition or listening — each with its own behavior. Breaths get trimmed to 150 ms, not deleted. That's the line between natural and robotic.
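To make "each with its own behavior" concrete, here is a minimal sketch of a per-label policy. Only the seven labels and the 150 ms breath rule come from the text above; the `PauseKind` and `cut_for_pause` names, the choice to keep the first 150 ms of a breath, and the behavior of the other six labels are assumptions for illustration, not Cadence Lab's actual code.

```python
from enum import Enum


class PauseKind(Enum):
    FILLER = "filler"
    HESITATION = "hesitation"
    BREATH = "breath"
    EMPHASIS = "emphasis"
    PRE_LAUGHTER = "pre-laughter"
    TRANSITION = "transition"
    LISTENING = "listening"


BREATH_TARGET_MS = 150  # from the text: breaths are trimmed to 150 ms, not deleted


def cut_for_pause(kind, start_ms, end_ms):
    """Return the (start_ms, end_ms) span to remove, or None to keep the pause whole.

    Hypothetical policy: only the breath rule is taken from the description.
    """
    if kind is PauseKind.BREATH:
        if end_ms - start_ms <= BREATH_TARGET_MS:
            return None  # already short enough
        # Assumption: keep the first 150 ms of the breath and remove the rest.
        return (start_ms + BREATH_TARGET_MS, end_ms)
    if kind in (PauseKind.FILLER, PauseKind.HESITATION):
        return (start_ms, end_ms)  # assumed: removed entirely
    return None  # assumed: emphasis, pre-laughter, transition, listening are kept


print(cut_for_pause(PauseKind.BREATH, 10_000, 10_500))
print(cut_for_pause(PauseKind.EMPHASIS, 0, 900))
```

Working in integer milliseconds keeps the trim arithmetic exact, which matters once the spans are handed to a planner that merges them.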

02

Context-aware filler removal

“Like” used as filler gets cut. “Nothing else like it” gets kept. The classifier reads the surrounding words — no blunt keyword deletion.

03

Retake detection

Flub a sentence and start over? Cadence spots the second attempt — or a “let me try that again” — and flags the worse take, the way an editor would.

04

Ask Cadence

“Cut every sniffle.” “Find when the walnut table is on screen.” “Pull a 60-second highlight.” Claude executes against your video and proposes edits — you accept each with one click.

05

Semantic visual search

CLIP frame embeddings indexed at 1 fps. Ask “find the part where the dog appears” and get ranked timestamps — without scrubbing the whole tape.
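The ranking step behind a search like this can be sketched as a cosine-similarity lookup over one embedding per sampled frame. The toy 3-dimensional vectors below stand in for real CLIP embeddings, and `rank_frames` is a hypothetical name; how Cadence Lab stores or searches its index is not stated here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_frames(query_vec, frame_vecs, fps=1.0, top_k=3):
    """Return the top_k (timestamp_seconds, score) pairs, best match first.

    frame_vecs[i] is assumed to be the embedding of the frame sampled at i / fps.
    """
    scored = [(cosine(query_vec, vec), i / fps) for i, vec in enumerate(frame_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(t, score) for score, t in scored[:top_k]]


# Toy data: four frames sampled at 1 fps, and a query vector for the text prompt.
frames = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
query = [0.0, 1.0, 0.0]
hits = rank_frames(query, frames, top_k=2)
print(hits)
```

At 1 fps a 30-minute recording is only 1,800 vectors, so a brute-force scan like this is plausible at that scale without a dedicated vector index.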

06

Sound-event cleanup

An AudioSet-trained model spots sniffles, coughs, throat-clears and sneezes. Pair it with “remove all sniffles” and Cadence proposes a precise cut for each.

07

Neural denoise

DeepFilterNet — trained on ~100k hours of speech — clears fan hum, keyboard clicks and room noise far better than a classical filter, near real-time on CPU.

08

Splicing timeline

Extract highlight clips, rearrange them, drop black between, and render an assembled cut — a dedicated path separate from the main pacing edit.

09

Hardware-fast render

Hardware H.264 encoding by default: 5–15× faster than a CPU encode, with a result YouTube can't tell apart from one. An opt-in archival libx264 mode is there when you want the master.
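As a rough illustration of the two render modes, the sketch below builds an FFmpeg argument list for each. `h264_videotoolbox` is FFmpeg's macOS hardware encoder and `libx264` its CPU encoder; the bitrate, CRF, preset and audio settings are assumed values, and the function names are hypothetical rather than taken from the Cadence Lab source.

```python
def video_encoder_args(archival=False):
    """Pick FFmpeg video-encoder flags for the fast or the archival path."""
    if archival:
        # CPU encode. CRF 18 and the "slow" preset are illustrative choices.
        return ["-c:v", "libx264", "-preset", "slow", "-crf", "18"]
    # Hardware encode on macOS. The 8M target bitrate is an assumption.
    return ["-c:v", "h264_videotoolbox", "-b:v", "8M"]


def render_command(src, dst, archival=False):
    """Build the full argv; pass it to subprocess.run(cmd, check=True) to execute."""
    return ["ffmpeg", "-y", "-i", src, *video_encoder_args(archival), "-c:a", "aac", dst]


print(" ".join(render_command("talk.mkv", "talk.mp4")))
print(" ".join(render_command("talk.mkv", "master.mp4", archival=True)))
```

On Linux a different hardware encoder would be needed, for example `h264_nvenc` or `h264_vaapi`, depending on the GPU.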

How it works

A typed pipeline, from raw recording to finished cut.

Each stage writes a structured file the next one reads. Stop anywhere, tweak, and resume.
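One simple way to get this stop-and-resume behavior is to have every stage cache its result as a JSON file and skip the work when the file already exists. The sketch below assumes that layout; `run_stage`, the file naming and the `force` flag are hypothetical, not the project's real interface.

```python
import json
from pathlib import Path


def run_stage(workdir, name, compute, force=False):
    """Run compute() once and cache its JSON-serializable result in <workdir>/<name>.json.

    On a later run the cached file is read instead, so a pipeline can stop after
    any stage, have its file edited by hand, and resume from there.
    """
    path = Path(workdir) / f"{name}.json"
    if path.exists() and not force:
        return json.loads(path.read_text())
    result = compute()
    path.write_text(json.dumps(result, indent=2))
    return result
```

A caller would chain stages, for example `transcript = run_stage(d, "transcript", transcribe)` followed by a classification stage that reads `transcript`. Passing `force=True` re-runs a stage after its inputs change.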

1

Ingest & transcribe

Cadence extracts your mic track alone, then transcribes with Whisper large-v3 (~30× realtime via Groq, or fully local). Word-level timestamps; desktop audio never masks the speech.

2

Classify in context

A single Claude Opus call reads the whole transcript and judges every pause and filler candidate with schema-constrained output — plus detected retakes. No regex, no fragile parsing.
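Even with schema-constrained output, a small defensive check on the parsed response is cheap. The sketch below validates a structured response against the seven pause labels. The JSON shape (`pauses`, `start`, `end`, `label`) is a hypothetical stand-in, since the real schema is not shown here.

```python
import json

PAUSE_LABELS = {
    "filler", "hesitation", "breath", "emphasis",
    "pre-laughter", "transition", "listening",
}


def parse_decisions(raw_json):
    """Parse a structured response into (start, end, label) tuples.

    Assumed shape: {"pauses": [{"start": float, "end": float, "label": str}, ...]}
    """
    data = json.loads(raw_json)
    decisions = []
    for item in data["pauses"]:
        label = item["label"]
        if label not in PAUSE_LABELS:
            raise ValueError(f"unknown pause label: {label!r}")
        start, end = float(item["start"]), float(item["end"])
        if end <= start:
            raise ValueError(f"empty or reversed pause: {start}..{end}")
        decisions.append((start, end, label))
    return decisions


raw = '{"pauses": [{"start": 1.0, "end": 1.4, "label": "breath"}]}'
print(parse_decisions(raw))
```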

3

Plan the cuts

Pure interval algebra turns those decisions — plus any cuts you or Ask Cadence add — into clean keep-segments. Fully local, deterministic, with an audit log of the original intent.
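The interval algebra here amounts to merging overlapping cuts and taking their complement over the length of the video. The following is a minimal sketch of that idea under stated assumptions (times in seconds, cuts as `(start, end)` tuples, hypothetical function names); it leaves out the audit log and is not the project's planner.

```python
def merge(intervals):
    """Merge overlapping or touching (start, end) intervals; drop empty ones."""
    merged = []
    for start, end in sorted(i for i in intervals if i[1] > i[0]):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def keep_segments(duration, cuts):
    """Return the parts of [0, duration] that no cut covers, in order."""
    keeps, cursor = [], 0.0
    for start, end in merge(cuts):
        # Clamp each cut to the video so out-of-range cuts cannot create bogus keeps.
        start = min(max(0.0, start), duration)
        end = min(max(0.0, end), duration)
        if start > cursor:
            keeps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        keeps.append((cursor, duration))
    return keeps


# Two overlapping cuts plus one that runs past the end of a 10-second video.
print(keep_segments(10.0, [(2.0, 3.0), (2.5, 4.0), (8.0, 12.0)]))
# -> [(0.0, 2.0), (4.0, 8.0)]
```

Because the function is pure, the same cuts always produce the same keep-segments, which is what makes re-planning after a manual override safe to repeat.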

4

Review & render

Listen to a 3-second clip around each cut, override with a click, re-plan instantly. Then FFmpeg renders a frame-aligned, loudness-normalized MP4 — ready to upload.
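Two small calculations sit behind this step: snapping a cut point to a frame boundary, and picking the 3-second preview window. The sketch below shows one way to do each. Centering the preview on the middle of the cut is an assumption, and both function names are hypothetical.

```python
from fractions import Fraction


def snap_to_frame(t_seconds, fps):
    """Round a timestamp to the nearest frame boundary.

    fps may be an int or a Fraction such as Fraction(30000, 1001) for 29.97 fps.
    """
    fps = Fraction(fps)
    frame_index = round(Fraction(t_seconds) * fps)
    return float(frame_index / fps)


def preview_window(cut_start, cut_end, duration, length=3.0):
    """Return a (start, end) audition window around a cut, clamped to the video."""
    center = (cut_start + cut_end) / 2
    start = max(0.0, center - length / 2)
    end = min(duration, start + length)
    return (start, end)


print(snap_to_frame(1.26, 30))           # nearest frame is number 38
print(preview_window(10.0, 11.0, 60.0))  # -> (9.0, 12.0)
```

Using `Fraction` for the frame rate avoids drift with non-integer rates, where repeated float arithmetic can land a cut one frame off.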

Ask Cadence

Talk to your timeline.

A dedicated Claude agent sits on top of your edit with read tools and action tools. Ask in plain English; it queries your transcript, audio events and visual index, then proposes typed edits you apply one at a time. Kick off a long scan and it picks the conversation back up the moment it finishes.

  • remove the um at 1:23
  • cut every sniffle in the intro
  • find when the walnut table is on screen
  • pull a 60-second highlight from the demo

Every action is a proposal. Nothing touches your video until you click Apply.
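The proposal-then-apply rule can be modeled as plain data plus a single explicit apply step. In the sketch below, the `Proposal` and `Timeline` classes and their fields are hypothetical illustrations of that pattern, not the types Ask Cadence actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    kind: str      # hypothetical labels, e.g. "cut" or "clip"
    start: float   # seconds
    end: float     # seconds
    reason: str
    applied: bool = False


@dataclass
class Timeline:
    cuts: list = field(default_factory=list)

    def apply(self, proposal):
        """Commit one proposal. Nothing changes until this is called."""
        if proposal.applied:
            return  # applying twice is a no-op
        if proposal.kind == "cut":
            self.cuts.append((proposal.start, proposal.end))
        proposal.applied = True


timeline = Timeline()
proposals = [
    Proposal("cut", 12.4, 12.9, "sniffle"),
    Proposal("cut", 47.0, 47.6, "sniffle"),
]
timeline.apply(proposals[0])  # the user accepted only the first one
print(timeline.cuts)          # -> [(12.4, 12.9)]
```

Keeping proposals as inert data means the agent can generate as many as it likes while the timeline only ever changes through `apply`.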

You: cut every sniffle in the intro
Ask Cadence: Scanning for non-speech events… found 4 sniffles between 0:00–2:10. Proposing a custom cut for each.
You: also find where I show the table
Ask Cadence: Indexing frames… best match for “table on screen” is 14:02–14:48. Want a highlight clip from that range? [Create clip] [Not now]

Open source

Yours to run, read, fork and ship.

MIT licensed. Your media and API keys stay on your machine — there's no Cadence Lab cloud in the loop.

Local-first & private

The app and your media live on your computer. Bring your own Groq and Anthropic keys, or transcribe fully offline with local Whisper. Your footage never leaves the device unless you pick a cloud transcription backend; the classification step sends the transcript text to Claude.

Honest pricing

The app is free. You pay your own model costs — about $0.60–$2.00 for a 30-minute video, mostly transcription and one big classification call. No subscription, no markup.

A real reference build

Structured outputs, prompt caching, agentic tool use, a typed multi-stage data contract, a Tauri + React shell over a Python FastAPI sidecar. Read the code — it's a working example of all of it.

Get the source

Run it from source today.

Cadence Lab is in early open-source development. Clone the repo and run it from the command line — one-click installers are on the roadmap.

macOS · quick start
# 1 — prerequisites
brew install ffmpeg uv

# 2 — clone & run
git clone https://github.com/JosephLeon/Cadence-Lab
cd Cadence-Lab
uv sync && cp .env.example .env
uv run cadence-lab server

Pre-release. macOS & Windows installers are on the roadmap (donations welcome). Until then, developers can clone the repo and run from source; non-developers, hang tight.

Full setup — API keys, the desktop app, all of it — is in the install guide. Requires ffmpeg and your own API keys.