
Audio for Editors: Cleaning, Mixing, and Mastering
Viewers will forgive a soft shot before they forgive crunchy dialogue. Clean, intelligible voiceover and a balanced mix elevate your edit more than any flashy transition. You don’t need a million plugins; you need a repeatable order of operations. This workflow keeps you fast, consistent, and broadcast-safe in any NLE.
1) Organize and listen first
Split tracks: dialogue (DX), music (MX), effects/ambience (FX/AMB). If you inherit a messy timeline, batch-select and move clips to their lanes. Solo the dialogue and listen end-to-end. Drop markers on problem spots: pops, plosives, HVAC hum, traffic, and reverb. A single pass of critical listening saves time later.
2) Dialogue cleanup in four light touches
- Noise reduction: Use gentle broadband noise reduction (NR) to shave constant hiss/hum. Start with a low reduction (2–6 dB) and avoid pumping. If you hear artifacts, back off. Consider a dedicated hum remover (50/60 Hz) for mains buzz.
- High-pass filter: Roll off sub rumble. For male voices start around 70–90 Hz; for female 90–110 Hz. Adjust by ear—too high thins the voice, too low leaves mud.
- EQ sculpting: Subtractive EQ first. Dip 200–350 Hz to reduce boxiness (1–3 dB, medium Q). Tame harshness around 2–4 kHz if needed, but keep consonants crisp. Add a gentle presence boost at 4–6 kHz only if the mic is dull.
- De-esser: Catch piercing “s” around 5–8 kHz. Aim for 2–5 dB reduction. If lisps appear, reduce range or intensity.
Apply these in the order listed, either on the dialogue bus or as a clip-level chain. Subtlety compounds: four small fixes beat one aggressive plugin.
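To make the high-pass step concrete, here is a minimal Python sketch of a second-order high-pass filter built from the widely published "Audio EQ Cookbook" biquad formulas. It is an illustration, not what any particular NLE plugin does internally. The function names (`highpass_coeffs`, `biquad`, `sine`, `rms`), the 80 Hz cutoff, and the Q of 0.7071 are assumptions chosen for the example.

```python
import math

def highpass_coeffs(cutoff_hz, sample_rate=48000, q=0.7071):
    """Return normalized (b, a) biquad coefficients for a 2nd-order high-pass."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha  # every coefficient is divided by a0 to normalize
    b = [(1.0 + cos_w0) / 2.0 / a0, -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(samples, b, a):
    """Run samples through a direct-form I biquad and return the filtered list."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def sine(freq_hz, seconds=1.0, sample_rate=48000):
    """Generate a full-scale test tone (hypothetical helper for the demo)."""
    n = int(seconds * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Running a 30 Hz rumble tone and a 1 kHz tone through this filter at 80 Hz should leave the 1 kHz tone almost untouched while cutting the rumble substantially. That is the behavior you are listening for when you set the cutoff by ear.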
3) Dynamics: control without flattening
Use compression to narrow the dynamic range so whispers and shouts sit comfortably in the same mix. A starting point: ratio 2:1 to 3:1, threshold set for 2–4 dB of gain reduction on the loudest phrases, medium attack (10–30 ms) to preserve transients, medium release (60–120 ms). Then add 1–2 dB of makeup gain to restore the perceived loudness the compressor took away.
If your sequence includes multiple speakers, compress each track lightly, then glue them on the dialogue bus with a gentle bus compressor (1–2 dB GR). This keeps voices consistent without “pumping.”
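The relationship between threshold, ratio, and gain reduction is simple arithmetic, sketched below for an idealized hard-knee compressor. Real plugins add knee shaping plus attack and release smoothing, so treat this as a way to reason about settings, not a model of any specific tool. The function name and the -18 dB default threshold are assumptions for illustration.

```python
def compressor_gain_db(level_db, threshold_db=-18.0, ratio=3.0):
    """Static gain change (0 dB or negative) of a hard-knee compressor.

    Below the threshold nothing happens. Above it, every `ratio` dB of
    input over the threshold produces only 1 dB of output over it.
    """
    over = level_db - threshold_db
    if over <= 0:
        return 0.0
    compressed_over = over / ratio
    return -(over - compressed_over)

def output_level_db(level_db, threshold_db=-18.0, ratio=3.0, makeup_db=0.0):
    """Output level after compression and optional makeup gain."""
    return level_db + compressor_gain_db(level_db, threshold_db, ratio) + makeup_db
```

With the threshold at -18 dB and a 3:1 ratio, a phrase arriving at -12 dB is 6 dB over the threshold and comes out 2 dB over, which is 4 dB of gain reduction. If the meter shows much more than that on ordinary speech, raise the threshold or lower the ratio.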
4) De-reverb when the room fights back
Roomy recordings are harder than noisy ones. A de-reverb tool can help, but artifacts appear fast. Prioritize mic proximity in production; in post, combine a small presence boost with a tight room tone bed to mask reverb tails. If you must de-reverb, treat just the worst phrases via clip-level effects.
5) Music that supports, not competes
Choose tracks that leave midrange space for the voice. Loop clean musical phrases and avoid abrupt cuts: crossfade on bar lines or use risers to bridge. Sidechain compression (ducking) is powerful and subtle: put a compressor on the music, key its sidechain from the dialogue, and set it to pull the music down 2–4 dB whenever someone speaks (fast attack, medium release). If your NLE lacks sidechain routing, automate the music volume around each line.
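The ducking behavior described above can be sketched as a gain curve that follows the dialogue level. This is a simplified model that assumes you already have the dialogue level in dB at regular intervals. The function name, the -40 dB speech threshold, and the 10 ms attack and 300 ms release are hypothetical values picked to mean "fast attack, medium release".

```python
import math

def duck_gain_db(dialogue_levels_db, duck_db=3.0, threshold_db=-40.0,
                 attack_ms=10.0, release_ms=300.0, step_ms=10.0):
    """Return the music gain (in dB) for each step of dialogue level.

    When dialogue is above the threshold the gain glides down toward
    -duck_db at the attack rate; otherwise it glides back to 0 dB at
    the release rate.
    """
    attack = math.exp(-step_ms / attack_ms)    # smoothing factor while ducking
    release = math.exp(-step_ms / release_ms)  # smoothing factor while recovering
    gain = 0.0
    curve = []
    for level in dialogue_levels_db:
        target = -duck_db if level > threshold_db else 0.0
        coeff = attack if target < gain else release
        gain = target + (gain - target) * coeff
        curve.append(gain)
    return curve
```

The same curve is what you draw by hand when automating the music volume: a quick dip just as the line starts, then a slower ramp back up after it ends.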
6) Effects and ambience: glue the world together
Room tone is the editor’s magic paste. Capture 10–30 seconds on set; if none exists, synthesize a low, broadband bed from the best segment. Keep FX consistent in perspective—close shots get closer, brighter sounds; wides get softer, more diffuse sounds. Use short reverbs to place effects in the same space as your visuals.
7) Loudness and delivery targets
Loudness meters, which read in LUFS, help you meet platform standards. For YouTube and streaming, aim for around -14 LUFS integrated with peaks below -1 dBTP. For podcasts or speech-only web content, -16 LUFS integrated is common. For broadcast, check the network's spec: EBU R 128 territories target -23 LUFS integrated and US broadcasters following ATSC A/85 target -24 LKFS, both with strict true-peak caps. Put a limiter last in the chain to tame peaks, with the ceiling at -1 dBTP for web.
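Once your meter reports an integrated loudness and a true peak, hitting a target is a matter of decibel arithmetic. The sketch below assumes those two readings come from your NLE's meter and only does the bookkeeping; it does not measure loudness itself. The function names and the `TARGETS_LUFS` table are illustrative, with values taken from the targets above.

```python
# Integrated loudness targets from the text (LUFS). Keys are hypothetical labels.
TARGETS_LUFS = {"web": -14.0, "podcast": -16.0, "ebu_broadcast": -23.0}

def normalization_gain_db(measured_lufs, target_lufs=-14.0):
    """Gain in dB needed to move the measured loudness to the target."""
    return target_lufs - measured_lufs

def needs_limiting(peak_dbtp, gain_db, ceiling_dbtp=-1.0):
    """True if applying gain_db would push the true peak over the ceiling."""
    return peak_dbtp + gain_db > ceiling_dbtp

def db_to_linear(db):
    """Convert a dB gain to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)
```

For example, a mix measured at -19.5 LUFS with a -4.0 dBTP peak needs +5.5 dB to reach -14 LUFS. That would push the peak to +1.5 dBTP, so the limiter has to catch it before export.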
8) A fast-repeat checklist
- Sort tracks: DX, MX, FX/AMB on separate lanes.
- Clean DX first: NR → HPF → EQ → De-ess → Compress → Limiter.
- Balance: Place music under dialogue (start -20 to -28 dB, then adjust).
- Ducking: Sidechain or automate the music around speech.
- Consistency: Use bus processing lightly to glue elements.
- Finalize: Loudness meter → limiter ceiling -1 dBTP → export.
Rescue scenarios and fixes
- Plosives (pops): Apply a low-shelf cut or a pop remover in the 80–150 Hz range, lower the clip gain on the popped consonant, and crossfade the edit.
- Clicks: Use a de-clicker or razor out the click and crossfade 2–6 frames.
- Hiss and fan noise: Gentle NR plus a complementary EQ dip in the offending band (often 6–10 kHz for hiss, 120–250 Hz for fan rumble).
- Inconsistent mic distance: Clip gain phrases before compression so the compressor doesn’t work overtime.
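The clip-gain fix for inconsistent mic distance amounts to measuring each phrase and nudging it toward a common level before the compressor sees it. Here is a rough sketch that assumes each phrase is available as a list of float samples in the range -1.0 to 1.0. The -20 dBFS RMS target and the 12 dB cap are arbitrary illustrative values, not a standard.

```python
import math

def rms_dbfs(samples):
    """RMS level of a phrase in dBFS (negative infinity for pure silence)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square <= 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)

def clip_gain_db(samples, target_dbfs=-20.0, max_change_db=12.0):
    """Clip gain that moves a phrase toward the target RMS level.

    The change is capped so a near-silent phrase is not boosted into
    a wall of noise; silent clips are left alone.
    """
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return 0.0
    change = target_dbfs - level
    return max(-max_change_db, min(max_change_db, change))
```

A phrase recorded at half the amplitude of its neighbor sits about 6 dB lower, so this suggests roughly +6 dB of clip gain for it. Trust your ears over the number for the final trim.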
Deliverables: think ahead
When possible, export clean stems: DX, MX, FX. This allows last-minute narration tweaks without rebuilding the whole mix. Label stems clearly and keep sample rates consistent (48 kHz for video). If delivering internationally, provide a mix-minus (the full mix without dialogue, often called an M&E) so local VO can be dropped in.
Great audio is transparent. When viewers lean into the story instead of fighting the mix, your edit lands. Build this workflow into a preset in your NLE, and your timelines will sound polished on every project.