Elusive Thoughts: 28/06/2026


A Threat Model Developers Will Actually Use

// appsec // threat modeling // kill the 40-page pdf

Somewhere on a shared drive at almost every company is a folder of threat models. Forty pages each. Beautiful diagrams. A STRIDE matrix that took someone two days to fill in. Every one of them was opened exactly twice. Once when it was written, once when an auditor asked for evidence that threat modeling happens. Between those two opens, the system it described shipped, changed shape four times, and grew an entire new attack surface the document never mentioned.

That folder is a graveyard. The headstones are very well formatted. And it is the single biggest reason developers think threat modeling is security theatre, because the version they were shown is theatre. A document produced for an audience of one auditor, read by no engineer, describing a system that no longer exists.

I want to talk about the other kind. The threat model a developer opens on purpose, mid-change, because it helps them ship the right thing. It is smaller than you think, it lives somewhere you might not expect, and the format barely matters.

Why threat models die

Three causes of death, and they are almost always the same three.

It is too big. A forty-page model of an entire platform is obsolete the day it is signed. Systems do not change at platform granularity. They change one pull request at a time. A model scoped to the whole system can never keep up with a system that changes in small pieces, so it rots, and a rotting model is worse than none because it lies with authority.

It lives in the wrong place. If the threat model is a Confluence page or a Visio file in a drive, it is outside the developer's loop. Developers live in the repo, the PR, the IDE, the ticket. Anything outside that loop is a context switch, and context switches under deadline lose every time. The developer is not lazy. The model is just geographically inconvenient.

It produces a document, not a decision. This is the deep one. Most threat models output a description of risks. A developer cannot act on a description. They can act on a decision: add auth to this endpoint, do not log this field, this service must not be reachable from the internet. If your threat model ends in prose instead of a short list of concrete changes, you produced a report, and reports go in the graveyard.

the test

A threat model is useful if, and only if, a developer can read it in five minutes and come away with a small list of things they will now build differently. Everything that does not serve that outcome is decoration.

What "usable" actually looks like

Flip the three causes of death and you get the design spec for free.

Scoped to a change, not a system. You do not threat model "the payments platform." You threat model "the new endpoint that lets a user move money between their own pots." Small enough to finish in one sitting. Small enough to stay true, because it is attached to a single change that either shipped or did not.

Lives where the work lives. A markdown file in the repo, next to the code. A section in the design doc that already exists for any non-trivial change. A comment thread on the PR. The model should be reachable without leaving the place the developer already is. If they have to open a second tab, you have already lost most of them.

Ends in a checklist of decisions. The output is not "here are the STRIDE categories we considered." The output is a handful of concrete, testable statements about what the code will now do. Those statements become acceptance criteria. They become test cases. They become review comments. They are load-bearing, not archival.

The method, kept deliberately small

You can run a genuinely useful threat model on a change with four questions. No matrix, no tooling, no two-day workshop. Sit with the one or two engineers who own the change and ask, in order:

1. What are we building, in one diagram? Not a formal DFD. A napkin sketch of the boxes, the arrows between them, and where the data goes. The only rule: draw the trust boundaries. Where does data cross from something you control to something you do not, or from less-trusted to more-trusted? Every interesting bug lives on a boundary line.
2. What would an attacker want here, and what are they allowed to touch? Money, personal data, a privilege, a foothold. And critically, who is the attacker? An anonymous internet user, a logged-in customer poking at another customer's data, a compromised internal service. The threat models for "external user" and "malicious authenticated user" are different models, and most teams only ever do the first.
3. How could each boundary crossing go wrong? Walk every arrow that crosses a trust boundary. For each one: can it be spoofed, tampered with, abused to read something it should not, or abused to do something it should not? This is STRIDE if you want a name for it, but you do not need the acronym in the room. You need the engineers looking at their own arrows and getting uncomfortable.
4. What are we going to do about it? This is the only question whose answer you keep. Each real risk becomes one of three things: a change we make now, a risk we knowingly accept and write down, or a thing we explicitly decide is out of scope. Three buckets. No prose.

That is the whole method. Thirty to sixty minutes for a normal change. The artifact it produces is the answers to question four, and nothing else needs to survive.
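If it helps to see the three buckets as data, here is a minimal sketch. The names (`Disposition`, `Risk`, `unresolved`) and the `owner` field are my own illustration, not a standard, and the one rule it enforces, that an accepted risk needs a named owner, is an assumption about what "write down" should mean.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    CHANGE_NOW = "change"          # a change we make now
    ACCEPTED = "accepted"          # a risk we knowingly accept and write down
    OUT_OF_SCOPE = "out_of_scope"  # explicitly not part of this change


@dataclass(frozen=True)
class Risk:
    summary: str
    disposition: Disposition  # every risk must land in exactly one bucket
    owner: str = ""           # hypothetical field: who owns an accepted risk


def unresolved(risks: list[Risk]) -> list[str]:
    """Return the reasons this model is not finished yet."""
    problems = []
    for risk in risks:
        # Assumption: an accepted risk without an owner is not really accepted.
        if risk.disposition is Disposition.ACCEPTED and not risk.owner:
            problems.append(f"accepted risk has no owner: {risk.summary}")
    return problems
```

The type does most of the work: a risk cannot exist without a bucket, so "we discussed it" is not a representable state.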

What the artifact looks like

Here is the entire output for a real-shaped example, the money-between-pots endpoint, as a markdown block that lives in the PR:

## Threat model: internal pot transfer (PR #2241)

Boundary crossings:
- client -> transfer API  (untrusted -> trusted)
- transfer API -> ledger  (trusted -> trusted, money moves here)

Decisions:
[x] enforce that both pots belong to the authenticated user
      (authz, not just authn) -> test: transfer to a pot you
      do not own returns 403
[x] make the transfer idempotent on a client key
      -> replaying the same request moves money once, not twice
[x] do not log the amount or pot IDs at info level (PII + abuse)
[ ] ACCEPTED RISK: no per-user rate limit at launch.
      Owner: payments. Revisit before raising transfer caps.
OUT OF SCOPE: cross-user transfers (not built in this change)

That is it. A developer reads it in two minutes. Three of those lines are now test cases. One is a written, owned risk acceptance instead of a thing nobody decided. One is an explicit scope boundary so the next person does not assume it was missed. No auditor will be thrilled by its length. Every engineer who touches this code will actually use it.
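To make "those lines are now test cases" concrete, here is a rough sketch of two of the decisions above as code and tests. Everything in it is hypothetical: the in-memory `Ledger`, the `transfer` signature, and the 422 status for a bad amount are stand-ins I invented for illustration, not the real service. It also ignores concurrency, which a real ledger cannot.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """In-memory stand-in for the real ledger (hypothetical)."""
    owners: dict[str, str]      # pot_id -> user_id
    balances: dict[str, int]    # pot_id -> balance in minor units
    # (user_id, idempotency_key) -> status returned the first time
    seen: dict[tuple[str, str], int] = field(default_factory=dict)


def transfer(ledger: Ledger, user_id: str, src: str, dst: str,
             amount: int, key: str) -> int:
    """Move money between two pots and return an HTTP-style status code."""
    # Authz, not just authn: both pots must belong to the caller.
    if ledger.owners.get(src) != user_id or ledger.owners.get(dst) != user_id:
        return 403
    # Idempotency: a replayed key returns the first result and moves nothing.
    if (user_id, key) in ledger.seen:
        return ledger.seen[(user_id, key)]
    if amount <= 0 or ledger.balances[src] < amount:
        status = 422  # assumed status for an invalid or unaffordable amount
    else:
        ledger.balances[src] -= amount
        ledger.balances[dst] += amount
        status = 200
    ledger.seen[(user_id, key)] = status
    return status


def make_ledger() -> Ledger:
    return Ledger(owners={"a": "alice", "b": "alice", "c": "bob"},
                  balances={"a": 100, "b": 0, "c": 50})


def test_transfer_to_pot_you_do_not_own_returns_403():
    ledger = make_ledger()
    assert transfer(ledger, "alice", "a", "c", 10, "k1") == 403
    assert ledger.balances == {"a": 100, "b": 0, "c": 50}


def test_replay_moves_money_once():
    ledger = make_ledger()
    assert transfer(ledger, "alice", "a", "b", 30, "k2") == 200
    assert transfer(ledger, "alice", "a", "b", 30, "k2") == 200
    assert ledger.balances == {"a": 70, "b": 30, "c": 50}
```

The test names are the decision lines, nearly word for word. That is the point: the model and the test suite say the same thing.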

Make it a habit, not an event

The reason the four-question version wins long term is not that it is better security. A two-day workshop finds more, in theory. The four-question version wins because it is cheap enough to run every time, and a cheap thing done every time beats a thorough thing done never.

Threat modeling as an event, a quarterly ceremony with a facilitator and a meeting room, models the system at one frozen moment and then watches reality drift away from it. Threat modeling as a habit, a thirty-minute section in the design doc for any change that touches a trust boundary, moves with the system because it is part of how the system gets built. The first is an artifact. The second is a reflex. You want the reflex.

The way you get the reflex is by making it small enough that nobody can argue it is not worth the time, and by putting it where they already work so it costs them no context switch. Drop a four-question template into the repo as THREAT_MODEL.md next to the design-doc template. Add one line to the PR checklist: does this change cross a trust boundary, and if so, link the model. That is the entire rollout. No platform purchase, no workshop calendar.
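If you want the checklist line enforced rather than hoped for, a small check on the PR description is enough. This is a sketch under assumptions: the exact checklist wording and the idea that a mention of `THREAT_MODEL.md` counts as a link are mine, so adapt both to your own template.

```python
import re

# Hypothetical checklist wording; match it to your own PR template.
BOUNDARY_BOX = re.compile(r"- \[(?P<mark>[ xX])\] crosses a trust boundary")
MODEL_LINK = re.compile(r"THREAT_MODEL\.md")


def check_pr_body(body: str) -> str:
    """Return 'ok' or a short reason the PR description needs attention."""
    box = BOUNDARY_BOX.search(body)
    if box is None:
        return "checklist line missing"
    if box.group("mark") == " ":
        return "ok"  # author states that no trust boundary is crossed
    if MODEL_LINK.search(body) is None:
        return "boundary crossed but no threat model linked"
    return "ok"
```

It does not judge whether the author answered honestly. It only makes the question impossible to skip silently.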

field note

The best threat models I see in-house are three paragraphs of markdown in a PR, written by the developer, with two comments from me. The worst are forty-page PDFs written by security in isolation. The difference in security value is not subtle, and it runs the opposite way to the page count.

Where the heavy version still earns its place

To be fair to the forty-page model: there is a place for depth. A new platform, a major architectural decision, or a system that moves serious money or holds serious data deserves a real, deep, deliberate threat model with the full method and the diagrams and the time. That is a handful of times a year, on the things that genuinely warrant it.

The mistake is using that heavyweight format for everything, because then you either do it rarely, on the big things only, and leave every normal change unmodelled, or you try to do it on everything and burn out in a month. Match the weight of the model to the weight of the change. Big decision, deep model, run it like the serious thing it is. Normal change crossing a boundary, four questions and a markdown block. Trivial change touching nothing sensitive, no model at all, and be honest that this is most changes.

Calibrating that is the actual skill. Not filling in the matrix. Knowing which changes deserve which depth, and refusing to spend a forty-page budget on a four-question problem. Do that, and threat modeling stops being the thing developers dread and the folder nobody opens. It becomes a thirty-minute reflex that quietly catches the bugs that would otherwise have been a pentest finding six weeks too late.
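The calibration above fits in a few lines if you want it written down. Treat this as a sketch: the three flags are hypothetical questions a reviewer answers by hand, and sending a sensitive change that crosses no boundary to the four-question bucket is my assumption, not a rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    # Hypothetical flags, answered by a human looking at the change.
    major_decision: bool = False          # new platform, major architecture
    crosses_trust_boundary: bool = False
    touches_sensitive: bool = False       # money, personal data, privileges


def model_depth(change: Change) -> str:
    """Match the weight of the model to the weight of the change."""
    if change.major_decision:
        return "deep model"
    # Assumption: sensitive changes get the light model even without a crossing.
    if change.crosses_trust_boundary or change.touches_sensitive:
        return "four questions"
    return "no model"
```

The function is trivial on purpose. The hard part is answering the three flags honestly.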

// Elusive Thoughts // less paper, more decisions // securityhorror.blogspot.com

Shift-Left Is Org Design Wearing a Vendor Badge

// appsec // culture // the part nobody sells you

Every shift-left pitch I have sat through follows the same arc. A scanner plugs into CI. A dashboard turns red. Developers fix things earlier. Risk slides down and to the left like a stock chart nobody questions. Buy the tool, get the outcome. It is clean, it ships quarterly, and it is mostly a lie.

Not because the tools are bad. Some of them are excellent. It is a lie because the slide deck quietly swaps the cause and the effect. Shift-left is not a thing you install. It is a thing your organisation already is, or already is not. The tool is the last 10 percent. The other 90 percent is who owns what, who gets paged, who can say no, and how fast a developer finds out they made a mistake. That is org design, and you cannot buy it on a per-seat licence.

What the word actually means once you strip the paint

Strip the marketing and shift-left says one thing: move the security decision closer to the moment the decision is made. That moment is not the pull request. It is earlier. It is a developer choosing a library, sketching a data flow on a whiteboard, deciding whether a service needs to talk to another service at all. The bug you want to kill was born in that choice, not in the diff.

So the real question shift-left asks is not where do we put the scanner. It is who is in the room when the choice gets made, and do they have what they need to choose well. A scanner in CI is downstream of every interesting answer to that question. It catches what survived the choice. Useful, but it is a net under the tightrope, not the tightrope.

Why tooling-first cargo-cults itself to death

Here is the failure mode I have watched play out more than once, from both chairs, consultant and in-house.

Security buys a scanner. Security wires it into the pipeline. The first full run produces 4,000 findings. Nobody triaged the baseline, so 3,600 of them are noise: dead code, test fixtures, and dependencies that are technically vulnerable in a path that does not exist in production. The pipeline goes red. Developers cannot ship. Developers do the only rational thing a human under a deadline does: they find the bypass. A skip label. A nightly job instead of a blocking gate. A Slack message to the one person with admin who will wave it through.

Now you have spent budget to teach your entire engineering org that the security gate is a thing you route around. That lesson is durable. It outlives the tool. The next tool you bring in inherits the reputation of the last one, and you wonder why adoption is a fight every single time.

the pattern

A tool dropped into an org that is not built to absorb it does not raise the security baseline. It raises the noise floor and trains people to ignore you. The tool worked perfectly. The org ate it.

None of that is a tooling problem. You can swap the vendor and reproduce the exact same wreck. It is an ownership problem, an incentive problem, and a feedback-latency problem wearing a tooling costume.

The primitives that actually move security left

If shift-left is org design, then these are the levers. None of them appear in a product comparison grid.

1. Ownership that survives a reorg

A finding with no owner is not a finding, it is a rumour. The single highest-leverage thing I have done in-house is not picking a scanner. It is making sure every service has a named team that owns its security posture, and that the name is attached to something the team already cares about, not a spreadsheet security maintains in the dark.

If the security backlog lives in a tool only the security team opens, it is dead. Findings have to land in the same backlog where the team plans its sprint, in the same tracker, with the same labels, ranked against the same features. The moment a vuln has to compete with a feature on equal footing in front of the same engineering manager, you have moved security left in the only way that holds. You moved it into the place where prioritisation actually happens.
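A rough sketch of what "every service has a named owner" looks like as a check. The service catalogue, the owner map, and the `backlog:` naming are hypothetical, and a real version would read from whatever catalogue and tracker you already have.

```python
def unowned_services(services: list[str], owners: dict[str, str]) -> list[str]:
    """Services with no named owning team, sorted for a stable report."""
    return sorted(s for s in services if not owners.get(s, "").strip())


def route_finding(service: str, owners: dict[str, str]) -> str:
    """Pick the backlog a finding lands in; unowned findings are surfaced, not dropped."""
    team = owners.get(service, "").strip()
    return f"backlog:{team}" if team else "backlog:UNOWNED"
```

The useful output is the `UNOWNED` pile. Every entry in it is a finding that would otherwise have become a rumour.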

2. Incentives pointed at the outcome you want

People do what they are measured on. If a team is measured purely on delivery velocity, security is friction, and you are the friction. No amount of lunch-and-learns fixes a misaligned incentive. You can win every developer's heart in the room and lose every one of them the second the quarter clock starts.

The fix is not to make security a KPI nobody believes. It is to make the secure path the fast path. If reaching for the hardened, paved-road service template is genuinely the quickest way to ship, security stops being a tax and becomes the default. That is an engineering-platform investment, not a security-tool purchase. It is built by the platform team, funded by leadership, and your job is to make the case for it in language a delivery lead and a CFO both accept.

3. Decision rights, written down

Who can accept a risk? Who can override a gate? On what authority? If the answer is whoever shouts loudest before the release, you do not have a security program, you have a negotiation that restarts every Friday. Real shift-left needs the boring governance most teams skip: a written risk-acceptance path, an owner for it, and a clear line between what a team can wave through themselves and what has to come up the chain. Boring. Load-bearing.
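Written down can be as small as a table. A sketch, with loud caveats: the roles, the severity scale, and the thresholds below are invented for illustration, and your own policy will differ. The only thing it demonstrates is that the answer to "who can accept this?" can be looked up instead of argued.

```python
# Hypothetical policy: the highest severity each role may accept on its own.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]
MAX_ACCEPTABLE = {
    "team_lead": "medium",
    "engineering_director": "high",
    "ciso": "critical",
}


def can_accept(role: str, severity: str) -> bool:
    """True if this role may sign off a risk of this severity without escalation."""
    limit = MAX_ACCEPTABLE.get(role)
    if limit is None:
        return False  # roles not in the policy cannot accept anything
    # An unknown severity raises ValueError: better loud than silently allowed.
    return SEVERITY_ORDER.index(severity) <= SEVERITY_ORDER.index(limit)
```

Anything that returns `False` goes up the chain. That is the clear line between what a team can wave through and what it cannot.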

4. Feedback latency measured in minutes, not weeks

The entire mechanical value of shift-left is shortening the distance between mistake and correction. A pentest that lands findings six weeks after the code shipped has shifted nothing. A SAST rule that flags the dangerous pattern in the IDE, while the developer still has the whole mental model loaded, has shifted everything. Same class of bug. The only variable that changed was latency, and latency is the whole game.

This is the one place tooling genuinely earns its keep, and notice it only works because the org already did the other three. Fast feedback into a team with no ownership and no incentive is just a faster way to get ignored.

So where does the tool go

Last. The tool goes last. Once a service has a named owner, the secure path is the fast path, risk acceptance is a written route instead of a hallway favour, and feedback is fast and tuned, then you bolt on the scanner and it lands on prepared ground. The findings have somewhere to go. Somebody owns them. The baseline was triaged before the gate ever blocked anyone. Adoption is not a fight because you are not asking people to absorb chaos, you are handing them a sharper version of a process they already run.

Do it in the other order and the tool is a liability. I would rather inherit an org with strong ownership and a mediocre scanner than a perfect scanner bolted to an org that routes around it. The first one I can improve in a quarter. The second one taught itself to ignore me, and unlearning that takes years.
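"The baseline was triaged before the gate ever blocked anyone" has a simple mechanical shape: the gate fails only on findings that are not already in the triaged baseline. A minimal sketch, assuming each finding can be reduced to a stable fingerprint string. The fingerprint format here is made up, and real scanners each have their own.

```python
def new_findings(current: set[str], baseline: set[str]) -> set[str]:
    """Findings present now that were not in the triaged baseline."""
    return current - baseline


def gate_passes(current: set[str], baseline: set[str]) -> bool:
    """Block the build only on findings the team has not already triaged."""
    return not new_findings(current, baseline)
```

The backlog still has to be worked down. The difference is that it gets worked down from the team's tracker, not by holding every release hostage on day one.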

field note

The most effective security control I have shipped this year was not a tool. It was getting security findings into the same Jira board, with the same priority scale, as feature work. Zero new software. It changed more behaviour than any scanner I have ever deployed.

What I would actually do on day one

If you handed me a fresh AppSec mandate and a budget, here is the order, and the tool is deliberately not at the top.

1. Map ownership first. Every production service to a named team. The gaps you find here are your real risk register, and they cost nothing to find.
2. Find the paved road, or build the case for one. What is the fastest way to ship a new service today, and is it secure by default? If the fast path and the safe path are different paths, that is the problem. Fix the road before you fine people for going off it.
3. Write down the risk-acceptance path. One page. Who decides, on what authority, recorded where. Kill the Friday negotiation.
4. Only then, pick the tool, tune the baseline before it blocks a single build, and wire it into the feedback loop closest to the developer that you can reach.

Notice that three of those four cost no licence money and do not appear in any vendor demo. That is the tell. Shift-left was always org design. The vendors just found a way to sell you the badge without the work, and the badge does not do the work.

Buy the tool if you want. Buy a good one. But buy it knowing it is the last 10 percent, and that if you skip the 90 percent underneath it, the tool will work flawlessly while your program quietly fails around it.

// Elusive Thoughts // written from the in-house chair, not the consultant one // securityhorror.blogspot.com
