A Threat Model Developers Will Actually Use
// appsec // threat modeling // kill the 40-page pdf
Somewhere on a shared drive at almost every company is a folder of threat models. Forty pages each. Beautiful diagrams. A STRIDE matrix that took someone two days to fill in. Every one of them was opened exactly twice. Once when it was written, once when an auditor asked for evidence that threat modeling happens. Between those two opens, the system it described shipped, changed shape four times, and grew an entire new attack surface the document never mentioned.
That folder is a graveyard. The headstones are very well formatted. And it is the single biggest reason developers think threat modeling is security theatre, because the version they were shown is theatre. A document produced for an audience of one auditor, read by no engineer, describing a system that no longer exists.
I want to talk about the other kind. The threat model a developer opens on purpose, mid-change, because it helps them ship the right thing. It is smaller than you think, it lives somewhere you might not expect, and the format barely matters.
Why threat models die
Three causes of death, and they are almost always the same three.
It is too big. A forty-page model of an entire platform is obsolete the day it is signed. Systems do not change at platform granularity. They change one pull request at a time. A model scoped to the whole system can never keep up with a system that changes in small pieces, so it rots, and a rotting model is worse than none because it lies with authority.
It lives in the wrong place. If the threat model is a Confluence page or a Visio file on a shared drive, it is outside the developer's loop. Developers live in the repo, the PR, the IDE, the ticket. Anything outside that loop is a context switch, and context switches under deadline lose every time. The developer is not lazy. The model is just geographically inconvenient.
It produces a document, not a decision. This is the deep one. Most threat models output a description of risks. A developer cannot act on a description. They can act on a decision: add auth to this endpoint, do not log this field, this service must not be reachable from the internet. If your threat model ends in prose instead of a short list of concrete changes, you produced a report, and reports go in the graveyard.
A threat model is useful if, and only if, a developer can read it in five minutes and come away with a small list of things they will now build differently. Everything that does not serve that outcome is decoration.
What "usable" actually looks like
Flip the three causes of death and you get the design spec for free.
Scoped to a change, not a system. You do not threat model "the payments platform." You threat model "the new endpoint that lets a user move money between their own pots." Small enough to finish in one sitting. Small enough to stay true, because it is attached to a single change that either shipped or did not.
Lives where the work lives. A markdown file in the repo, next to the code. A section in the design doc that already exists for any non-trivial change. A comment thread on the PR. The model should be reachable without leaving the place the developer already is. If they have to open a second tab, you have already lost most of them.
Ends in a checklist of decisions. The output is not "here are the STRIDE categories we considered." The output is a handful of concrete, testable statements about what the code will now do. Those statements become acceptance criteria. They become test cases. They become review comments. They are load-bearing, not archival.
The method, kept deliberately small
You can run a genuinely useful threat model on a change with four questions. No matrix, no tooling, no two-day workshop. Sit with the one or two engineers who own the change and ask, in order:
- What are we building, in one diagram? Not a formal DFD. A napkin sketch of the boxes, the arrows between them, and where the data goes. The only rule: draw the trust boundaries. Where does data cross from something you control to something you do not, or from less-trusted to more-trusted? Every interesting bug lives on a boundary line.
- What would an attacker want here, and what are they allowed to touch? Money, personal data, a privilege, a foothold. And critically, who is the attacker? An anonymous internet user, a logged-in customer poking at another customer's data, a compromised internal service. The threat models for "external user" and "malicious authenticated user" are different models, and most teams only ever do the first.
- How could each boundary crossing go wrong? Walk every arrow that crosses a trust boundary. For each one: can it be spoofed, tampered with, abused to read something it should not, or abused to do something it should not? This is STRIDE if you want a name for it, but you do not need the acronym in the room. You need the engineers looking at their own arrows and getting uncomfortable.
- What are we going to do about it? This is the only question whose answer you keep. Each real risk becomes one of three things: a change we make now, a risk we knowingly accept and write down, or a thing we explicitly decide is out of scope. Three buckets. No prose.
That is the whole method. Thirty to sixty minutes for a normal change. The artifact it produces is the answers to question four, and nothing else needs to survive.
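If it helps to see questions one and three as data, here is a minimal Python sketch. Everything in it is illustrative: the `Flow` and `Trust` names, the `moves_value` flag, and the `metrics` arrow are assumptions made for this example, not part of any tool. It treats an arrow as worth walking when the trust level changes across it or when something valuable moves along it.

```python
from dataclasses import dataclass
from enum import Enum


class Trust(Enum):
    UNTRUSTED = "untrusted"
    TRUSTED = "trusted"


@dataclass(frozen=True)
class Flow:
    """One arrow on the napkin sketch (hypothetical structure)."""
    source: str
    source_trust: Trust
    target: str
    target_trust: Trust
    moves_value: bool = False  # money, personal data, or a privilege travels here


# Question three, asked of every arrow worth walking.
PROMPTS = (
    "Can it be spoofed?",
    "Can it be tampered with?",
    "Can it be abused to read something it should not?",
    "Can it be abused to do something it should not?",
)


def crossings(flows):
    """Keep arrows where trust changes or where value moves (an assumption)."""
    return [
        f for f in flows
        if f.source_trust != f.target_trust or f.moves_value
    ]


def worksheet(flows):
    """Render the arrows and prompts as plain lines to talk through."""
    lines = []
    for f in crossings(flows):
        lines.append(
            f"{f.source} -> {f.target} "
            f"({f.source_trust.value} -> {f.target_trust.value})"
        )
        lines.extend(f"  - {prompt}" for prompt in PROMPTS)
    return lines


if __name__ == "__main__":
    sketch = [
        Flow("client", Trust.UNTRUSTED, "transfer API", Trust.TRUSTED),
        Flow("transfer API", Trust.TRUSTED, "ledger", Trust.TRUSTED, moves_value=True),
        Flow("transfer API", Trust.TRUSTED, "metrics", Trust.TRUSTED),
    ]
    print("\n".join(worksheet(sketch)))
```

The output is only a conversation aid. The answers to question four are still the part you keep.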
What the artifact looks like
Here is the entire output for a real-shaped example, the money-between-pots endpoint, as a markdown block that lives in the PR:
## Threat model: internal pot transfer (PR #2241)
Boundary crossings:
- client -> transfer API (untrusted -> trusted)
- transfer API -> ledger (trusted -> trusted, money moves here)
Decisions:
[x] enforce that both pots belong to the authenticated user
    (authz, not just authn) -> test: transfer to a pot you
    do not own returns 403
[x] make the transfer idempotent on a client key
    -> replaying the same request moves money once, not twice
[x] do not log the amount or pot IDs at info level (PII + abuse)
[ ] ACCEPTED RISK: no per-user rate limit at launch.
    Owner: payments. Revisit before raising transfer caps.
OUT OF SCOPE: cross-user transfers (not built in this change)
That is it. A developer reads it in two minutes. Three of those lines are now test cases. One is a written, owned risk acceptance instead of a thing nobody decided. One is an explicit scope boundary so the next person does not assume it was missed. No auditor will be thrilled by its length. Every engineer who touches this code will actually use it.
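To make "decisions become test cases" concrete, here is a rough Python sketch of the first two decisions and the tests they imply. It assumes an in-memory `Ledger`, a `transfer` function that returns HTTP-style status codes, and made-up pot and user names. None of these come from a real codebase, and a production version would need transactions and durable storage of idempotency keys.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """In-memory stand-in for the real ledger (hypothetical)."""
    owners: dict    # pot_id -> user_id
    balances: dict  # pot_id -> amount in minor units
    completed: dict = field(default_factory=dict)  # (user_id, key) -> status


def transfer(ledger, user_id, src, dst, amount, idempotency_key):
    """Move money between two pots; returns an HTTP-style status code."""
    replay = (user_id, idempotency_key)
    if replay in ledger.completed:
        # Replayed request: report the first outcome, move nothing.
        return ledger.completed[replay]

    # authz, not just authn: both pots must belong to the caller.
    if ledger.owners.get(src) != user_id or ledger.owners.get(dst) != user_id:
        return 403
    if amount <= 0 or ledger.balances[src] < amount:
        return 422

    ledger.balances[src] -= amount
    ledger.balances[dst] += amount
    ledger.completed[replay] = 200
    return 200


def make_ledger():
    return Ledger(
        owners={"alice-bills": "alice", "alice-fun": "alice", "bob-rent": "bob"},
        balances={"alice-bills": 100, "alice-fun": 0, "bob-rent": 50},
    )


def test_transfer_to_pot_you_do_not_own_returns_403():
    ledger = make_ledger()
    assert transfer(ledger, "alice", "alice-bills", "bob-rent", 10, "k1") == 403
    # Nothing moved.
    assert ledger.balances == {"alice-bills": 100, "alice-fun": 0, "bob-rent": 50}


def test_replay_moves_money_once():
    ledger = make_ledger()
    assert transfer(ledger, "alice", "alice-bills", "alice-fun", 10, "k2") == 200
    assert transfer(ledger, "alice", "alice-bills", "alice-fun", 10, "k2") == 200
    assert ledger.balances["alice-bills"] == 90
    assert ledger.balances["alice-fun"] == 10
```

The test names read almost word for word like the decision lines, which is the point: a reviewer can check the model against the test file without a translation step.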
Make it a habit, not an event
The reason the four-question version wins long term is not that it is better security. A two-day workshop finds more, in theory. The four-question version wins because it is cheap enough to run every time, and a cheap thing done every time beats a thorough thing done never.
Threat modeling as an event, a quarterly ceremony with a facilitator and a meeting room, models the system at one frozen moment and then watches reality drift away from it. Threat modeling as a habit, a thirty-minute exercise written up as a section in the design doc for any change that touches a trust boundary, moves with the system because it is part of how the system gets built. The first is an artifact. The second is a reflex. You want the reflex.
The way you get the reflex is by making it small enough that nobody can argue it is not worth the time, and by putting it where they already work so it costs them no context switch. Drop a four-question template into the repo as THREAT_MODEL.md next to the design-doc template. Add one line to the PR checklist: does this change cross a trust boundary, and if so, link the model. That is the entire rollout. No platform purchase, no workshop calendar.
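If you want the checklist line to be more than an honour system, a small check in CI can enforce it. The Python sketch below is one possible shape, not a prescribed tool. It assumes a hypothetical checklist wording ("crosses a trust boundary"), and it assumes a model counts as linked when the PR body either mentions THREAT_MODEL.md or contains a "## Threat model" heading.

```python
import re
from typing import Optional

# Hypothetical checklist wording; adjust to whatever your PR template says.
CHECKED_BOX = re.compile(
    r"^\s*[-*]\s*\[[xX]\]\s*crosses a trust boundary",
    re.MULTILINE | re.IGNORECASE,
)

# Assumption: a model is "linked" if the body references the file
# or embeds the model under a "## Threat model" heading.
MODEL_REFERENCE = re.compile(r"THREAT_MODEL\.md|^##\s*Threat model", re.MULTILINE)


def checklist_problem(pr_body: str) -> Optional[str]:
    """Return a message if the box is ticked but no model is referenced."""
    if CHECKED_BOX.search(pr_body) and not MODEL_REFERENCE.search(pr_body):
        return "Change crosses a trust boundary but no threat model is linked."
    return None


if __name__ == "__main__":
    body = "- [x] Crosses a trust boundary\n\nAdds the pot transfer endpoint."
    print(checklist_problem(body))
```

It only catches the case where someone ticks the box and forgets the model. Whether the box should have been ticked is still a judgment call for the author and the reviewer.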
The best threat models I see in-house are three paragraphs of markdown in a PR, written by the developer, with two comments from me. The worst are forty-page PDFs written by security in isolation. The difference in security value is not subtle, and it runs the opposite way to the page count.
Where the heavy version still earns its place
To be fair to the forty-page model: there is a place for depth. A new platform, a major architectural decision, a system that moves serious money or holds serious data deserves a real, deep, deliberate threat model with the full method and the diagrams and the time. That is a handful of times a year, on the things that genuinely warrant it.
The mistake is using that heavyweight format for everything, because then you either do it rarely, on the big things only, and leave every normal change unmodelled, or you try to do it on everything and burn out in a month. Match the weight of the model to the weight of the change. Big decision, deep model, run it like the serious thing it is. Normal change crossing a boundary, four questions and a markdown block. Trivial change touching nothing sensitive, no model at all, and be honest that this is most changes.
Calibrating that is the actual skill. Not filling in the matrix. Knowing which changes deserve which depth, and refusing to spend a forty-page budget on a four-question problem. Do that, and threat modeling stops being the thing developers dread and the folder nobody opens. It becomes a thirty-minute reflex that quietly catches the bugs that would otherwise have been a pentest finding six weeks too late.