Reactive vs proactive supply-chain security

Software supply-chain security is top of mind for many people right now, because recent months have seen a lot of attacks. However, this is not a new issue. It is something that I have been thinking about for several years now. The xz-utils attack strengthened my concerns, and current events provide even more evidence that we need to re-think how we approach it.

As a community, we have adopted a working pattern where we place implicit trust in software packages published on registries like npm, PyPI, or crates.io, or even in random GitHub repositories. I would like to live in a world where that is okay, but real-world experience shows us that this is not a sustainable path.

People have suggested different “solutions” for this. In recent months I have read a plethora of blog posts arguing for different ways to improve security. However, in my opinion, none of the suggestions is the right way to solve this. They are all reactive: they wait for something to go wrong, then tell you about it. Real security has to be proactive. It has to surface problems before they reach you in the first place.

I think we can only solve this problem if we re-think the way that we use dependencies. Part of the reason proactive security isn’t pushed more often, in my view, is that it asks us to change how we think — and the tooling to make that practical does not yet exist.

In this article, I want to walk through some suggestions that I have read about, and why I think they are not good solutions. Finally, I will explain what OpenVet’s philosophy is, why I am building it the way I’m building it, and how OpenVet tries to solve these problems differently, and properly.

Just don’t use dependencies

A few of the articles I have read seem to argue that you just should not use dependencies. Implement things yourself, or copy code. In fact, Go even spells the latter out as a proverb:

A little copying is better than a little dependency.

Let me untangle this a bit. I think part of this thinking applies to microdependencies: dependencies that do one tiny thing. For example, left-pad was a small package published on npm that left-padded a string. When the maintainer deleted it, it broke a lot of other people’s code. The article “Micro-libraries need to die already” talks about cases like this.

However, I don’t agree with the blanket idea that you should copy code. If a lot of people are using a dependency that does one tiny thing, that is a signal that whatever the library does should become part of the language’s standard library. There is no stronger signal for adding something like left-pad to a standard library than a library that does only that and still has hundreds of millions of users. There are cases where copying is fine, but that should not be the go-to way that software is developed.

For larger libraries, I also don’t think copying is the right move. A language’s usefulness is compounded by having a set of robust libraries. It’s the ecosystem. Instead of declaring that an ecosystem is bad and that people should go back to implementing things by themselves, we should work together to make sure that there are strong libraries, and that those libraries are safe.

Having dependencies is not bad. Having dependencies and not checking them is bad.

Vulnerability scanners don’t protect you

Many people are pushing for the use of vulnerability scanners. I don’t think that using these is bad — it’s better than not using them. But fundamentally, vulnerability scanners can’t protect you.

Vulnerability scanners are, by definition, reactive. If there is an active attack (a malicious version of a software package has been published), someone has to notice it, investigate it, and file a report. That report is then checked and, once verified, lands in a vulnerability database, which is where your dependency scanner picks it up. But that might be a considerable time later. This is reactive security: by the time your dependency scanner flags something, your secrets, API tokens and access to your infrastructure may already have been exfiltrated. Modern supply-chain attacks move fast.
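To make the "reactive" part concrete, here is a minimal sketch of what a scanner boils down to: a lookup against advisories that already exist. The package names, versions, and database contents are invented for illustration, and real scanners match version ranges rather than exact pairs.

```python
# Toy model of a vulnerability scanner: a lookup against already-known advisories.
# All names and versions below are made up.

# (name, version) pairs that someone has already reported and verified.
advisory_db = {
    ("example-logger", "1.4.0"),
}

def scan(dependencies, advisories):
    """Return the dependencies that appear in the advisory database."""
    return [dep for dep in dependencies if dep in advisories]

lockfile = [
    ("example-logger", "1.4.0"),  # known-bad: already in the database
    ("example-http", "2.0.1"),    # assumed malicious, but nobody has reported it yet
]

flagged = scan(lockfile, advisory_db)
print(flagged)
```

The second entry passes the scan untouched. Nothing about the scanner is broken; it simply cannot know about a release that has not been reported yet.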

The time it takes for a malicious package to be discovered also depends on how popular it is. Today, most attacks target dependencies that are popular, which means that the damage they can do is greater (more users), but it also means that they are discovered more quickly. What if someone attacks a smaller, niche dependency? Who looks into those?

The data also undercuts the assumption that the CVE feed is even complete. Sonatype’s 2026 State of the Software Supply Chain report puts the catalog coverage of recent open-source CVEs at roughly 35%. The other 65% of issues are real, but not in any database your scanner is querying. The scanner is also not going to flag the next Shai-Hulud worm an hour after it is published, because there is no signature to match against yet.

SBOMs answer a different question

A Software Bill of Materials is a list of what is in your build. It is documentation. It does not contain any assertion about whether the listed components are safe — it just enumerates them. SBOMs are useful when you need to answer “do I have a dependency on log4j 2.14?” after a CVE drops. They do not help you decide whether to take a new dependency in the first place.
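As a rough sketch of the one question an SBOM does answer, here is an inventory lookup over a trimmed-down, CycloneDX-style document held in a Python dict. The component list is invented, and the helper name `find_component` is my own, not part of any SBOM tooling.

```python
# A trimmed-down, CycloneDX-style SBOM. The component list is invented.
sbom = {
    "bomFormat": "CycloneDX",
    "components": [
        {"name": "log4j-core", "version": "2.14.1"},
        {"name": "jackson-databind", "version": "2.15.2"},
    ],
}

def find_component(sbom, name, version_prefix):
    """List components whose name matches and whose version starts with the prefix."""
    return [
        component
        for component in sbom["components"]
        if component["name"] == name and component["version"].startswith(version_prefix)
    ]

hits = find_component(sbom, "log4j-core", "2.14.")
print(hits)
```

Note what is missing: there is no field in this inventory that says whether either component is safe to run.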

The empirical picture is also bleak. Williams et al.’s Rising Tide study of supply-chain practices across nine large companies found SBOM consumption at three percent adoption. That is not “early days” three percent: that is “the machinery to act on SBOMs basically doesn’t exist on the consumer side” three percent. SBOMs are being produced because regulators are asking for them, not because consumers have a workflow that ingests them.

Scanners on top of SBOMs do not close the gap either. Dietrich et al. looked at 727 confirmed vulnerable shaded clones in Maven Central across 29 CVEs; the dominant SCA tools (Dependabot, Snyk, OWASP Dependency-Check, Grype) collectively flagged 20.5% of them. Even catalogued issues evade catalog-based detection when the code reappears under a slightly different name. These would have been caught if someone had looked into those dependencies.

Sigstore and provenance answer yet another question

Sigstore, npm provenance, PyPI’s Sigstore-bound uploads, and Go’s checksum database are all answering the question “did this artifact come from where it claims to?”. They are good, important answers to that question. I want them to exist, and OpenVet’s data structures are explicitly designed to compose with them rather than replace them.

But “came from where it claims” is not the same as “is safe to run”. The dominant supply-chain compromises of the last two years all produced signed, attested, provenance-verified malicious releases:

The xz-utils backdoor was introduced by a maintainer who was, at the time, a legitimate maintainer of the project. Every release in the campaign was signed by a real key held by a real human with push rights.
The polyfill.io takeover happened because the attacker bought the domain and the GitHub organisation of the dependency. There was no credential compromise to detect — they were the new owners.
The Shai-Hulud worm propagated via postinstall scripts published under stolen but otherwise valid maintainer tokens. Every infected release was provenance-attested to a real CI job, run from a real repo, under a real account.

There is also a deeper limit on what attestations can prove, even in their richest form. in-toto, the spec behind most modern attestation schemes, only constrains the chain from source to artifact: it records which steps ran, in what order, and signed by whom. It is not an auditing tool: it does not check whether the source code is correct. A maintainer with commit and signing rights (or one whose tokens have been stolen) can put malicious code straight into the git tree, and the entire chain will verify end-to-end, even with in-toto.

A perfectly attested malicious release is still malicious. Provenance gates lower the rate at which unauthorised compromises succeed; they do not lower the rate at which authorised compromises succeed, and “authorised” turns out to cover a lot of the interesting attacks.

For example, the Ultralytics GitHub project was hijacked, and the attacker published a malicious release that went through the legitimate GitHub Actions pipeline and was properly attested in Sigstore. Every signature was real, every check passed, and the release was still malicious.
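The gap between "authentic" and "safe" can be shown in a few lines. This is only a sketch: real schemes such as Sigstore use asymmetric signatures and transparency logs, and I am using an HMAC from the standard library purely as a stand-in. The key and the payloads are hypothetical, and the payloads are never executed.

```python
import hashlib
import hmac

# Stand-in for a maintainer's signing key (hypothetical).
MAINTAINER_KEY = b"hypothetical-maintainer-key"

def attest(artifact: bytes, key: bytes) -> str:
    """Produce a tag binding the artifact to whoever holds the key."""
    return hmac.new(key, artifact, hashlib.sha256).hexdigest()

def verify(artifact: bytes, tag: str, key: bytes) -> bool:
    """Check that the artifact was attested by the key holder and not altered."""
    return hmac.compare_digest(attest(artifact, key), tag)

benign = b"def pad(s, n): return s.rjust(n)"
malicious = b"send_secrets_to('attacker.example')"

# Both releases are published by someone holding the legitimate key.
benign_ok = verify(benign, attest(benign, MAINTAINER_KEY), MAINTAINER_KEY)
malicious_ok = verify(malicious, attest(malicious, MAINTAINER_KEY), MAINTAINER_KEY)
# Swapping the artifact after attestation, by contrast, is rejected.
tampered_ok = verify(malicious, attest(benign, MAINTAINER_KEY), MAINTAINER_KEY)

print(benign_ok, malicious_ok, tampered_ok)
```

Verification catches the swapped artifact, which is exactly what it is for. It has nothing to say about the malicious release that was attested with the legitimate key.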

Dependency freezes shift the problem in time

Dependency cooldowns and the related “turn Dependabot off, pin everything, update on your own cadence” school (Valsorda, Hoyt) start from a real observation: most malicious package versions are detected within days of publication, so a consumer who waits N days before adopting a new release dodges a lot of attacks for free.
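Mechanically, a cooldown is just a filter on publication dates. Here is a minimal sketch, with an invented release history and a helper name of my own choosing; real tools implement this inside the resolver or the update bot.

```python
from datetime import date, timedelta

def newest_release_past_cooldown(releases, today, cooldown_days):
    """Return the most recently published version that is at least
    `cooldown_days` old, or None if no release qualifies."""
    cutoff = today - timedelta(days=cooldown_days)
    eligible = [
        (published, version)
        for version, published in releases.items()
        if published <= cutoff
    ]
    return max(eligible)[1] if eligible else None

# Hypothetical release history for one package.
releases = {
    "1.2.0": date(2025, 1, 10),
    "1.2.1": date(2025, 3, 1),
    "1.3.0": date(2025, 3, 9),  # published two days before "today"
}

chosen = newest_release_past_cooldown(releases, date(2025, 3, 11), cooldown_days=7)
print(chosen)
```

With a 7-day cooldown the two-day-old 1.3.0 is skipped and 1.2.1 is selected instead.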

The problem, as Cal Paterson pointed out, is that this works precisely because someone else absorbs the attack during the cooldown window. The mechanism is structurally a free-rider on the ecosystem’s early adopters. If everyone froze for 7 days, the detection clock would simply move 7 days later, because most detections happen when the malicious code reaches consumers and one of them notices. The aggregate amount of exploitation does not go down; it just shifts to a slightly later cohort.
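The free-rider argument is easy to work through with a toy model. The assumption, taken from the paragraph above, is that a malicious release is noticed some fixed number of days after consumers start installing it; the 3-day lag is a made-up number.

```python
def detection_day(others_cooldown, detection_lag):
    """Day (counted from publication) on which the malicious release is noticed,
    assuming detection needs `detection_lag` days of real-world installs."""
    return others_cooldown + detection_lag

def is_exposed(my_cooldown, others_cooldown, detection_lag):
    """True if I install the release before anyone has noticed it is malicious."""
    return my_cooldown < detection_day(others_cooldown, detection_lag)

LAG = 3  # assumed: days of real-world use before someone notices

# I wait 7 days while everyone else installs immediately.
lone_waiter = is_exposed(my_cooldown=7, others_cooldown=0, detection_lag=LAG)
# Everyone, including me, waits 7 days.
everyone_waits = is_exposed(my_cooldown=7, others_cooldown=7, detection_lag=LAG)

print(lone_waiter, everyone_waits)
```

Under this model the lone waiter is protected, and the same 7-day wait protects nobody once everyone adopts it, because detection moves from day 3 to day 10.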

Cooldowns do nothing against slow campaigns. xz-utils was prepared over multiple years. A 7-day cooldown is invisible to that attack. Anywhere a determined adversary is willing to wait, time-based defenses degrade to zero.

There is also a second axis of dependency risk that cooldowns do not engage with at all. Cooldowns are a defense against malicious code — code deliberately placed in a release to harm consumers. They do nothing about ordinary vulnerabilities: bugs in legitimate code, undiscovered at release time, waiting to be found. The longer you stay on an old version, the more time attackers have to find exploitable bugs in the code you are still running, and the longer you remain on a version that has accumulated known issues since release. Cooldowns trade exposure to malicious new releases for exposure to slowly-accumulating vulnerabilities in old ones.

The common pattern

Each of these defenses is a real thing, addressing a real layer of the threat surface. CVEs answer “is this known broken?”. SBOMs answer “what is in my build?”. Provenance answers “did this come from where it claims?”. Cooldowns answer “has anyone else been hit yet?”. Pinning answers “is this still the same code I ran last week?”.

None of them answer the question that actually matters when you take a dependency: has anyone competent looked at this code?

That question is the one we have been collectively avoiding for two decades. We use other people’s code at a scale that has no historical precedent — the median application now pulls in around 180 transitive dependencies — and we do it almost entirely without reading what we are running. The popularity heuristic (“if a lot of people use it, someone must have looked at it”) is the implicit fallback, and it is exactly the heuristic that maintainer-handoff, typosquatting, and account-takeover attacks are designed to defeat.

Sammak et al.’s 2024 interview study of 18 industry developers across 11 countries is sobering on this point: all 18 reported assessing popularity, 15 of 18 verified the recency of the last update, and 13 of 18 named dependency auditing as one of the most challenging security measures to implement at all. SBOMs, signing, and SLSA were essentially absent from the actual decisions developers reported making.

So why don’t we just read our dependencies?

Because there is too much code. One personal data point: my average Rust pet project pulls in around 400 transitive dependencies, summing to about 3.5 million lines of code, as measured with cargo vendor and tokei. Reading that, even once, is the work of months. Doing it again on every update is not a job a human can hold.
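For a sense of scale, here is the back-of-the-envelope arithmetic. The reading speed is purely an assumption on my part (and a generous one for careful review); plug in your own number.

```python
def review_days(lines_of_code, lines_per_hour, hours_per_day=8):
    """Working days needed to read the code once at a constant pace."""
    return lines_of_code / lines_per_hour / hours_per_day

# 3.5 million lines at an assumed 1,000 lines per hour, 8 hours a day.
days = review_days(3_500_000, lines_per_hour=1_000)
print(round(days, 1))
```

Even at that pace, a single pass is well over a year of full-time working days, before any update arrives.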

And the trajectory is wrong. AI coding agents are generating code at unprecedented speed, and recent benchmarks suggest only around 10% of AI-generated code meets both functional and security bars. The volume of code entering projects without anyone reading it is growing faster, not slower.

This is the part of the argument that I think is most often missed. When somebody says “just audit your dependencies”, they are either:

1. proposing that every consumer privately performs millions of lines of review per project, on rolling updates, forever, which is not going to happen; or
2. proposing the same private audit at company scale, which is what Google, Mozilla, and the Bytecode Alliance actually do, and which produces real audit artifacts that the rest of the world cannot see¹; or
3. handwaving.

Option 1 is unrealistic at the individual level. Option 2 happens, but the artifacts are locked inside each organisation. Option 3 is what the “you should audit your dependencies” discourse usually collapses to in practice.

OpenVet’s argument is that actually reading the code is the only thing that answers the question, and that the only way to make it sustainable is to do it in the open. If you read a dependency and publish a signed, machine-readable audit of what you read, then everyone who trusts you gets to stand on that work without re-reading the code themselves. The marginal cost of one more consumer relying on that audit drops to zero.

The trust model is the load-bearing piece. You don’t have to trust every audit; you choose whose audits you trust. Maybe you trust Mozilla, the Bytecode Alliance, some well-known cryptography researchers, and three of your friends. Maybe you also trust a vendor you work with. A dependency passes your policy when the union of audits from people you trust covers it. If nobody you trust has audited it, you have a clear, actionable signal: either you do the audit yourself (which OpenVet tries to make as easy as possible), or you find a dependency that someone you trust has reviewed.
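The policy rule can be sketched in a few lines. Everything here is hypothetical: the auditor names, the package names, and the data layout are invented for illustration and do not reflect OpenVet's actual format or API.

```python
# Hypothetical data: which auditors have published an audit for which
# (package, version) pair.
published_audits = {
    "org-a": {("json-codec", "1.0.2"), ("http-core", "0.8.5")},
    "friend-1": {("tiny-parser", "0.3.1")},
    "stranger": {("string-pad", "0.1.0")},
}

def passes_policy(dependency, trusted_auditors, audits):
    """A dependency passes if the union of audits from trusted auditors covers it."""
    covered = set().union(*(audits.get(auditor, set()) for auditor in trusted_auditors))
    return dependency in covered

trusted = ["org-a", "friend-1"]

audited_by_trusted = passes_policy(("json-codec", "1.0.2"), trusted, published_audits)
audited_by_stranger_only = passes_policy(("string-pad", "0.1.0"), trusted, published_audits)
print(audited_by_trusted, audited_by_stranger_only)
```

The second dependency has an audit, but not from anyone on the trusted list, so it fails. That is the actionable signal: audit it yourself, or pick something a trusted auditor has covered.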

This is proactive in the specific sense that it answers the question before you ship the dependency, not after the incident response. You read the code while you are depending on it, not three months later when the CVE rolls in. And because the work is shared, you do not have to read all 3.5 million lines yourself: you only have to audit the slice that nobody you trust has audited yet.
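To illustrate how sharing shrinks the workload, here is a small sketch that computes the slice left over once trusted audits are subtracted. The tree, the line counts, and the covered set are all made up.

```python
# Hypothetical dependency tree: package -> lines of code.
tree = {
    "json-codec": 120_000,
    "http-core": 60_000,
    "tiny-parser": 4_000,
    "string-pad": 300,
}
# Packages already covered by audits from people you trust (assumed).
covered = {"json-codec", "http-core"}

def remaining_slice(tree, covered):
    """Packages, and total lines, that you would still have to audit yourself."""
    todo = sorted(name for name in tree if name not in covered)
    return todo, sum(tree[name] for name in todo)

todo, lines = remaining_slice(tree, covered)
print(todo, lines)
```

In this invented example, 4,300 lines are left to read out of 184,300.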

And there is one aspect that none of the other approaches cover at all. OpenVet is a tool you can use to detect malicious code, but it does not end there. What about correctness? Vulnerability databases focus on code that is exploitable, but not on code that is just broken. By auditing your dependencies, you can not only surface malicious behaviour, but you can also check for other properties, such as correctness, test coverage, and overall implementation quality.

Audits don’t need maintainer cooperation

There is one more structural property of the audit channel that the attestation-based defenses do not have: it does not require the maintainer to do anything. Every signing scheme — npm provenance, PyPI attestations, in-toto layouts, Sigstore-bound publishes — depends on the publisher of a package opting in. That turns ecosystem-wide adoption into a coordination problem across millions of independent maintainers, and the numbers show it is going slowly. Williams et al. put attestation production at 0.17 across major industry adopters and provenance delivery at 0.08. Your security model cannot realistically be “I will contact every maintainer in my dependency tree and convince them to emit attestations”.

OpenVet sits on the other side of that boundary. An audit is something anyone who can read the code publishes about a package, with zero cooperation required from the upstream maintainer. If a crate’s maintainer has never heard of signing, an auditor can still publish a useful audit of it. If a package’s maintainer is unreachable, an auditor can still review it. The bootstrap problem is much smaller: you do not have to convince every npm publisher in the world to flip a switch. You just need a community of reviewers willing to read code, and a way to share their work. It is also why this is a thing you can deploy today, without waiting on the rest of the ecosystem to catch up.

Reactive defenses still matter

I want to be clear that I am not arguing against the reactive layer. Provenance, signing, SBOMs, and CVE feeds are all genuinely useful and OpenVet is designed to compose with them, not displace them.

Andrew Nesbitt makes the honest version of this case in Signing is for the bad days: the value of signing infrastructure shows up when something has already gone wrong — forensic clarity, identity binding, the ability to tell whether a malicious release came from a stolen token or a poisoned build cache. He is right about that. The reactive layer is the thing that helps you survive a bad day. It just should not be the layer that decides whether to install the dependency in the first place.

Provenance is useful to capture where binary artifacts came from. You typically can’t meaningfully audit binary artifacts, so provenance doesn’t tell you whether they are safe — but it does at least tell you where they came from, which is often the best you can do.
SBOMs give you the inventory you need to ask, after a disclosure, “do I have this thing in my build?”. They also give you legal compliance, in jurisdictions where that is required.
CVE feeds tell you when somebody else has caught an issue you missed. One feature I am working on with OpenVet is re-publishing CVEs and other vulnerability data as OpenVet audits, so the tooling can ingest those as well.

These should be present. They just should not be the front line. A defense layered entirely on “what the rest of the ecosystem has already learned” is a defense that accepts being late as a feature. The front line should be “somebody I trust read this code”, and the reactive feeds should sit behind that as a backstop for the cases the audits did not catch.

What OpenVet is doing about it

Concretely, OpenVet’s design follows from this position in a few specific ways:

Audits are first-class artifacts. They are signed, structured, and machine-readable. They carry claims, findings, source-anchored annotations, and a structured report. They are not a side effect of some other workflow.
Audits are publicly shareable by default. The registry hosts them under permissive licenses (CC0 or CC-BY-4.0) so that consumers can actually use, redistribute, and build on them.
Trust is rooted in publishers, not in the platform. Consumers pick whose logs they trust. The registry holds the bytes; the signatures and the cryptographic chain hold the trust.
The tooling is built around lowering authoring cost. Creating and publishing your first audit should take minutes, not days. If the marginal cost of publishing one more audit is high, the ecosystem will not scale.
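To give a feel for what "signed, structured, and machine-readable" could mean, here is a sketch of an audit record. This is not OpenVet's actual schema: the class names, field names, and serialisation choices are assumptions made for illustration, and the digest stands in for the bytes a real signature would cover.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Annotation:
    """A note anchored to a specific place in the audited source."""
    path: str
    line: int
    note: str

@dataclass
class Audit:
    """Hypothetical audit record: claims, findings, annotations, and a report."""
    package: str
    version: str
    auditor: str
    claims: list[str] = field(default_factory=list)
    findings: list[str] = field(default_factory=list)
    annotations: list[Annotation] = field(default_factory=list)
    report: str = ""

    def canonical_bytes(self) -> bytes:
        """Deterministic serialisation: the bytes a signature would cover."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    def digest(self) -> str:
        """SHA-256 of the canonical form, as a hex string."""
        return hashlib.sha256(self.canonical_bytes()).hexdigest()

audit = Audit(
    package="tiny-parser",
    version="0.3.1",
    auditor="friend-1",
    claims=["no-network-access"],
    annotations=[Annotation("src/lib.rs", 42, "bounds check looks correct")],
    report="Read all source files; nothing suspicious found.",
)
print(audit.digest()[:16])
```

The point of the deterministic serialisation is that any consumer can recompute the digest and check it against the auditor's signature, without trusting whoever hosts the file.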

The bet is that, if we can get the cost-sharing right, an ecosystem where most of the code most people depend on has been read by somebody they trust is actually reachable. It is not reachable by asking every consumer to read everything. It is not reachable by hoping a CVE feed catches the bad stuff in time. It is reachable by splitting the work and making the result legible across the network of people who already trust each other to some degree.

That is the part of the problem none of the reactive defenses are trying to solve, and it is the part OpenVet is built around.

¹ Google, Mozilla and some other companies are auditing their dependencies. They publish some of these audits in the form of cargo-vet audits. But these published audits only cover Rust libraries, and don’t seem to be updated very often.