
// ELUSIVE THOUGHTS — APPSEC / SOCIAL ENGINEERING

The CFO Was Never On the Call: Deepfake-Driven BEC in 2026

Posted by Jerry — May 2026

A finance director joins a Zoom call. The CFO is on the screen, voice and face perfectly familiar, requesting an urgent wire transfer. The transfer goes through. The CFO never logged in.

In 2024, this exact playbook cost the engineering firm Arup roughly twenty-five million dollars in Hong Kong. In 2026, the cost of running this attack has fallen below five US dollars and requires under thirty seconds of public training audio. The infrastructure to do this at industrial scale now sits in consumer SaaS products.

// the threat model has shifted

Traditional business email compromise (BEC) playbooks assume a text-based attack: a spoofed email, a lookalike domain, social-engineered urgency. Defensive guidance was built around DMARC, DKIM, SPF, and "verify the sender's email domain." All of that still matters. None of it covers the current attack vector.

The current attack vector is real-time voice and video synthesis, deployed on live conferencing platforms. Open-source models like FaceFusion and commercial offerings like ElevenLabs Pro have collapsed the technical barrier. The latency required for a convincing real-time conversation has dropped below two hundred milliseconds. The training audio requirement has dropped to under a minute.

Sora 2 and Veo 3 enable pre-recorded video that survives casual scrutiny. The combination — pre-recorded video for the appearance plus real-time voice cloning for the dialogue — is what attackers are using now.

// what mfa cannot save you from

The first thing to understand: this attack does not bypass authentication. It bypasses the human in the loop. Your finance director has authenticated correctly. They are on the right Zoom call. They are talking to what looks like the right person. The compromise is not at the auth layer — it is at the trust-the-call layer.

Identity verification at the start of the call does not help, because the attacker joins through a legitimate invitation and appears as an ordinary participant. Speaker verification on the conferencing platform does not help either; the platform sees a verified meeting host inviting a guest. The guest just happens to look and sound like the CFO.

// what actually works

The defensive controls below are not novel. They are operational discipline that most organizations have not implemented because, until recently, they felt like overkill. They no longer do.

CONTROL 1 — OUT-OF-BAND CALLBACK VERIFICATION

Any wire transfer above an organizationally defined threshold requires verification via a callback to a pre-shared phone number. Not the number in the email. Not the number given on the call. The number stored in the procurement system from when the relationship was established, before any social engineering took place.
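As a minimal sketch of how this could be enforced in tooling (the registry, threshold value, and vendor record shape below are all hypothetical, not any particular product's API), the rule is simply that the callback number comes from the onboarding record and nowhere else:

```python
# Sketch: out-of-band callback lookup. The registry stands in for
# numbers captured at relationship onboarding, under change control.
VENDOR_REGISTRY = {
    "acme-corp": {"callback_number": "+1-555-0100"},
}

CALLBACK_THRESHOLD = 25_000  # org-defined, in account currency

def requires_callback(amount: float) -> bool:
    """Any transfer at or above the threshold needs a callback."""
    return amount >= CALLBACK_THRESHOLD

def callback_number(vendor_id: str, number_in_request: str = "") -> str:
    """Return the pre-shared number. The number supplied in the
    request is deliberately ignored -- it is attacker-controlled."""
    record = VENDOR_REGISTRY.get(vendor_id)
    if record is None:
        raise LookupError(f"no onboarding record for {vendor_id}; halt")
    return record["callback_number"]
```

The design point is that `number_in_request` is accepted and then never read: whatever number the caller, email, or invoice offers, the system dials only the stored one.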

CONTROL 2 — CHALLENGE PHRASES FOR HIGH-VALUE APPROVALS

Yes, like spy films. Pre-agreed code phrases between executives and finance teams, rotated quarterly, used as a final challenge for any approval over a defined value. The reason this technique appears in fiction is that it works in reality. A deepfake of someone's voice cannot reproduce a code phrase the original person never spoke.
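One way to operationalize quarterly rotation is to key phrases by approver pair and calendar quarter. The store, pair names, and phrase below are illustrative only, and distribution of the phrases themselves must of course happen out of band:

```python
import datetime

# Hypothetical store of pre-agreed phrases, keyed by the two parties
# and the calendar quarter. Populated out of band, rotated quarterly.
PHRASES = {
    ("cfo", "treasury", "2026Q2"): "amber lighthouse",
}

def current_quarter(today: datetime.date) -> str:
    """Map a date to its calendar quarter, e.g. 2026-05-12 -> 2026Q2."""
    return f"{today.year}Q{(today.month - 1) // 3 + 1}"

def phrase_matches(requester: str, approver: str,
                   spoken: str, today: datetime.date) -> bool:
    """True only if the spoken phrase matches the current quarter's
    pre-agreed phrase for this exact pair. An expired or missing
    phrase fails closed."""
    expected = PHRASES.get((requester, approver, current_quarter(today)))
    return expected is not None and spoken.strip().lower() == expected
```

Failing closed matters: a phrase from last quarter, or from a different pair of people, is treated the same as no phrase at all.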

CONTROL 3 — LIVENESS CHALLENGES

Real-time deepfake models still degrade noticeably under unscripted physical motion. Ask the person to turn their head sharply, hold up a specific number of fingers, or move the camera. Pre-recorded video fails immediately. Real-time synthesis fails on novel gestures. This is a stopgap — the technology will improve — but in the current threat landscape it is effective.

CONTROL 4 — APPROVAL THRESHOLDS AND DUAL CONTROL

No single human should be able to approve a transfer above a meaningful threshold based on a video call alone. Dual control — two distinct authenticated approvals through the financial system, not through the conferencing platform — moves the trust boundary back to systems with stronger guarantees than the human eye and ear.
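A dual-control gate is a few lines of logic in the financial system. This sketch (the threshold and field names are assumptions, not a real product's API) shows the one property that matters: two distinct authenticated approvers, with duplicate approvals from the same identity collapsed:

```python
from dataclasses import dataclass, field

DUAL_CONTROL_THRESHOLD = 50_000  # hypothetical org-defined threshold

@dataclass
class TransferRequest:
    amount: float
    approvals: set = field(default_factory=set)  # authenticated user IDs

    def approve(self, user_id: str) -> None:
        # user_id comes from the financial system's own authentication,
        # never from the conferencing platform. A set deduplicates, so
        # the same person approving twice still counts once.
        self.approvals.add(user_id)

    def releasable(self) -> bool:
        """Below the threshold one approval suffices; at or above it,
        two distinct approvers are required."""
        if self.amount < DUAL_CONTROL_THRESHOLD:
            return len(self.approvals) >= 1
        return len(self.approvals) >= 2
```

Using a set rather than a counter is the whole control: a convincing deepfake can pressure one person into clicking approve twice, but it cannot mint a second authenticated identity inside the financial system.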

CONTROL 5 — TRAIN THE SPECIFIC FAILURE MODE

Generic phishing training does not cover this. Finance staff, executive assistants, and treasury operators need specific tabletop exercises against deepfake scenarios. They need to feel the social pressure of being asked by a "C-level" to bypass procedure, and they need explicit organizational backing to refuse. "Trust your instincts" is not a control — clear procedural authority is.

// detection technology

Several vendors are building real-time deepfake detection for conferencing platforms — Reality Defender, Pindrop, Sensity AI. The technology exists. It is not yet good enough to be the only line of defense. Detection accuracy degrades against the latest generation of synthesis models, and the false positive rate creates real friction for legitimate calls.

The honest assessment in 2026: deploy the detection technology where you can, but do not depend on it. The procedural controls above carry the load.

// the larger pattern

This category of attack is the leading edge of a broader shift. The attack surface is no longer the email, the network, or the application. It is the trusted communication channel that humans use to coordinate work. The voice you recognize. The face on the screen. The conversational dynamics that signal legitimacy.

Application security as a discipline has historically been about code, infrastructure, and data flows. The discipline now extends to the human protocols that surround those systems. The threat model that does not include synthetic media is incomplete.

If your incident response runbook does not include "what we do when an employee reports an executive impersonation," it is missing a chapter that 2026 has made mandatory.

$ end_of_post.sh — comments open. Tell me what your org is doing about this.
