AI semantic video editor · open source · 100% Free
Cadence Lab drops your OBS recording into an AI pipeline: it transcribes the audio, asks Claude to judge every pause and filler word in context, and renders a tight, YouTube-ready cut. Then you just ask it to refine the edit.
Why it's different
Most silence-removal tools cut anything below −30 dB and call it done, butchering intentional pauses, flattening breaths into something robotic, and keeping both takes when you flub a line. Cadence Lab reads the transcript, not just the volume, so it edits like a human who was actually paying attention.
Features
Every gap is labeled filler, hesitation, breath, emphasis, pre-laughter, transition or listening — each with its own behavior. Breaths get trimmed to 150 ms, not deleted. That's the line between natural and robotic.
“Like” used as filler gets cut. “Nothing else like it” gets kept. The classifier reads the surrounding words — no blunt keyword deletion.
Flub a sentence and start over? Cadence spots the second attempt — or a “let me try that again” — and flags the worse take, the way an editor would.
“Cut every sniffle.” “Find when the walnut table is on screen.” “Pull a 60-second highlight.” Claude executes against your video and proposes edits — you accept each with one click.
CLIP frame embeddings indexed at 1 fps. Ask “find the part where the dog appears” and get ranked timestamps — without scrubbing the whole tape.
An AudioSet-trained model spots sniffles, coughs, throat-clears and sneezes. Pair it with “remove all sniffles” and Cadence proposes a precise cut for each.
DeepFilterNet — trained on ~100k hours of speech — clears fan hum, keyboard clicks and room noise far better than a classical filter, near real-time on CPU.
Extract highlight clips, rearrange them, drop black between, and render an assembled cut — a dedicated path separate from the main pacing edit.
Hardware H.264 by default: 5–15× faster than a CPU encode, and YouTube can't tell the two apart. An opt-in archival libx264 mode is there when you want the master.
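To make the per-label pause behavior from the first feature concrete, here is a minimal sketch, not Cadence Lab's actual code. Only the seven labels and the 150 ms breath trim come from the text above. The names `PauseKind`, `Pause`, `ASSUMED_TARGETS_MS` and `cut_for`, the targets for the other six labels, and the choice to keep the start of a gap are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class PauseKind(Enum):
    FILLER = "filler"
    HESITATION = "hesitation"
    BREATH = "breath"
    EMPHASIS = "emphasis"
    PRE_LAUGHTER = "pre-laughter"
    TRANSITION = "transition"
    LISTENING = "listening"


# Only the breath rule (trim to 150 ms) is stated on this page.
# Every other target is a placeholder: 0 means "remove the whole gap",
# None means "leave the gap untouched".
ASSUMED_TARGETS_MS = {
    PauseKind.FILLER: 0,
    PauseKind.HESITATION: 0,
    PauseKind.BREATH: 150,
    PauseKind.EMPHASIS: None,
    PauseKind.PRE_LAUGHTER: None,
    PauseKind.TRANSITION: None,
    PauseKind.LISTENING: None,
}


@dataclass(frozen=True)
class Pause:
    start_ms: int
    end_ms: int
    kind: PauseKind


def cut_for(pause):
    """Return the (start_ms, end_ms) span to remove, or None if nothing is cut."""
    target = ASSUMED_TARGETS_MS[pause.kind]
    length = pause.end_ms - pause.start_ms
    if target is None or length <= target:
        return None
    # Assumption: keep `target` ms at the start of the gap and cut the rest.
    return (pause.start_ms + target, pause.end_ms)
```

Under these assumptions a 600 ms breath loses its last 450 ms, a breath already shorter than 150 ms is left alone, and an emphasis pause is never cut.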
How it works
Each stage writes a structured file the next one reads. Stop anywhere, tweak, and resume.
Cadence extracts your mic track alone, then transcribes with Whisper large-v3 (~30× realtime via Groq, or fully local). Word-level timestamps; desktop audio never masks the speech.
A single Claude Opus call reads the whole transcript and judges every pause and filler candidate with schema-constrained output — plus detected retakes. No regex, no fragile parsing.
Pure interval algebra turns those decisions — plus any cuts you or Ask Cadence add — into clean keep-segments. Fully local, deterministic, with an audit log of the original intent.
Listen to a 3-second clip around each cut, override with a click, re-plan instantly. Then FFmpeg renders a frame-aligned, loudness-normalized MP4 — ready to upload.
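The planning stage above turns cut decisions into keep-segments with interval algebra. A minimal sketch of that idea follows. The function names `merge` and `keep_segments` and the millisecond tuples are illustrative assumptions, not the project's real data contract.

```python
def merge(spans):
    """Sort spans and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Extend the previous span instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def keep_segments(duration_ms, cuts):
    """Return the parts of [0, duration_ms] not covered by any cut."""
    # Clamp cuts to the timeline and drop any that become empty.
    clipped = [(max(s, 0), min(e, duration_ms)) for s, e in cuts]
    clipped = [(s, e) for s, e in clipped if s < e]

    keeps = []
    cursor = 0
    for start, end in merge(clipped):
        if start > cursor:
            keeps.append((cursor, start))
        cursor = end
    if cursor < duration_ms:
        keeps.append((cursor, duration_ms))
    return keeps
```

Because the cuts are sorted and merged before subtraction, the result does not depend on the order in which cuts were added, which is what makes a planner like this deterministic and cheap to re-run after an override.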
Ask Cadence
A dedicated Claude agent sits on top of your edit with read tools and action tools. Ask in plain English; it queries your transcript, audio events and visual index, then proposes typed edits you apply one at a time. Kick off a long scan and it picks the conversation back up the moment it finishes.
Every action is a proposal. Nothing touches your video until you click Apply.
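The propose-then-apply rule can be sketched as a small state holder in which suggesting an edit and committing it are separate calls. The `Proposal` and `EditSession` classes and their fields are hypothetical names for illustration. The real agent's typed edits are not described on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    id: int
    start_ms: int
    end_ms: int
    reason: str
    applied: bool = False


@dataclass
class EditSession:
    cuts: list = field(default_factory=list)
    proposals: dict = field(default_factory=dict)

    def propose(self, start_ms, end_ms, reason):
        """Record a suggested cut without changing the edit."""
        proposal = Proposal(len(self.proposals) + 1, start_ms, end_ms, reason)
        self.proposals[proposal.id] = proposal
        return proposal

    def apply(self, proposal_id):
        """Commit one proposal; this is the only call that modifies the cut list."""
        proposal = self.proposals[proposal_id]
        if not proposal.applied:
            self.cuts.append((proposal.start_ms, proposal.end_ms))
            proposal.applied = True
        return proposal
```

In this sketch the agent would only ever call `propose`, while `apply` is wired to the user's click, so an unreviewed suggestion cannot reach the cut list.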
Open source
MIT licensed. Your media and API keys stay on your machine — there's no Cadence Lab cloud in the loop.
Everything runs on your computer. Bring your own Groq and Anthropic keys, or go fully offline with local Whisper. Your footage never leaves the device unless you pick a cloud transcription backend.
The app is free. You pay your own model costs — about $0.60–$2.00 for a 30-minute video, mostly transcription and one big classification call. No subscription, no markup.
Structured outputs, prompt caching, agentic tool use, a typed multi-stage data contract, a Tauri + React shell over a Python FastAPI sidecar. Read the code — it's a working example of all of it.
Get the source
Cadence Lab is in early open-source development. Clone the repo and run it from the command line — one-click installers are on the roadmap.
# 1 — prerequisites
brew install ffmpeg uv
# 2 — clone & run
git clone https://github.com/JosephLeon/Cadence-Lab
cd Cadence-Lab
uv sync && cp .env.example .env
uv run cadence-lab server
Pre-release. macOS & Windows installers are on the roadmap (donations welcome). Until then, developers can clone the repo and run from source; non-developers, hang tight.
Full setup (API keys, the desktop app, all of it) is in the install guide.
Requires ffmpeg and your own API keys.