openvet 0.5.0 — biggest pre-1.0 breaking release

openvet 0.5.0 is out. It is the biggest breaking release I have shipped before 1.0, and it comes with a handful of new things worth trying. Highlights: an audit linter (openvet audit check), an ad-hoc query command (openvet query), the implies operator in requirements, and a deliberate wire-format reshape. Most of what landed was motivated by dogfooding: I created audits for the crates that the openvet project itself uses. The openvet repository has about 500 dependencies, and I audited 111 of them. Along the way I learned a lot, and most of what’s in this post is what those lessons turned into.

Ideally, when you adopt openvet in a real-world project, you don’t have to audit everything from scratch: you can trust audits that others have produced. But I am the first real “user” of openvet, so I do have to audit everything from scratch. I also need a dataset of audits to test the CLI against.

I didn’t want to create all of the audits manually, especially since the main reason I needed them was testing, and I may have to re-create them. So I automated audit creation by writing a “skill” that Claude can use to audit a package. This was very productive, for a number of reasons:

Writing the skill forced me to write down rules about how an audit is structured. I had some ideas about what audits should look like, but nothing beats drafting the rules, letting Claude apply them, and seeing the outcome differ from what I expected (because I had implicit assumptions that I had never written down).
Watching Claude use the CLI with my instructions and the help text revealed some real shortcomings that the CLI had.
Having a set of audits allowed me to test the registry: to make sure that audits render right.
It also allowed me to test the CLI, answering questions like: how fast does check work with ~100 audits in a log? How fast is publishing? How fast does it update and verify new commits?

Claims vocabulary

OpenVet uses the concept of a claims vocabulary. Claims are atomic “facts” about a software package, for example uses-unsafe or uses-network. I had a vocabulary file drafted, but I hadn’t put a lot of effort into it. The first few audits showed me that this vocabulary was not really in the shape I wanted.

So I revised it and pointed Claude at the revised vocabulary, and the resulting audits looked much better. I came up with a pattern of claims that works well with the way we write requirements. There are two kinds of claims:

Discovery claims: Encode a fact about something a package is, or does. These are claims like uses-unsafe (uses memory-unsafe code), impl-crypto (implements cryptography), has-binaries (has binaries/executables in the package archive), or is-benign. These are simple and can be asserted just by looking at what files and code are there. is-benign is the global one: it doesn’t slot into the has-* / uses-* / impl-* families, but every audit asserts it like the others.
Audit claims: These are gated on the discovery claims. For example, if a crate implements cryptography (impl-crypto), then you may want to check whether the cryptography is implemented in a safe way for callers (crypto-impl-safe), whether it’s implemented correctly (crypto-impl-correct), and whether it is tested (crypto-impl-tested).

With this vocabulary in place, the auditing process works like this:

You test and assert all discovery claims (to true or false).
You then test and assert all audit claims that are gated by the discovery claims.
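The two steps above can be sketched in a few lines of Rust. This is only an illustration, not openvet’s actual code: the Vocabulary type, its gates table, and applicable_audit_claims are hypothetical names, and the pairing of unsafe-documented with uses-unsafe is taken from the linter output shown later in this post. Only the claim names come from openvet.

```rust
use std::collections::BTreeMap;

/// Hypothetical vocabulary: each audit claim is gated by one discovery claim.
pub struct Vocabulary {
    /// Maps an audit claim name to the discovery claim that gates it.
    pub gates: BTreeMap<&'static str, &'static str>,
}

impl Vocabulary {
    /// A tiny example vocabulary using claim names from this post.
    pub fn example() -> Self {
        let gates = BTreeMap::from([
            ("crypto-impl-safe", "impl-crypto"),
            ("crypto-impl-correct", "impl-crypto"),
            ("crypto-impl-tested", "impl-crypto"),
            ("unsafe-documented", "uses-unsafe"),
        ]);
        Vocabulary { gates }
    }

    /// Step 2: given the discovery claims asserted in step 1, list the audit
    /// claims that should now be evaluated (those whose gate is `true`).
    pub fn applicable_audit_claims(
        &self,
        discovery: &BTreeMap<&str, bool>,
    ) -> Vec<&'static str> {
        let mut out = Vec::new();
        for (claim, gate) in &self.gates {
            // A gate that is false or unasserted does not open its audit claims.
            if discovery.get(*gate) == Some(&true) {
                out.push(*claim);
            }
        }
        out
    }
}
```

With impl-crypto asserted as true and uses-unsafe as false, only the three crypto-impl-* claims come back, so the auditor knows exactly which follow-up questions to answer.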

The new vocabulary is now in place, and the old one is removed.

Referencing findings, claims

OpenVet has a file format for audits that has human-readable and machine-readable pieces. For example, it has claims (boolean “facts”, machine-readable) and it has markdown-shaped data for human consumption. But I wanted a way to “connect” the two.

One feature that I built previously is that you can reference claims from inside the markdown, using the !claim-name syntax. The idea behind that was that you can link claims inside the markdown, to justify why you asserted a claim the way you did. Similarly, I had built the #1 syntax, to reference findings.

OpenVet is set up to check these: you can only reference findings that exist. If you wrote #34, but there are only 27 findings, that is an error. Same for the claims.

Testing showed that the #1 syntax for findings clashes with some things, including GitHub pull-request references. In the registry, we render these as FINDING-1, so the syntax was changed to #finding-1, which reduces the chance of clashes.
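To make the reference check concrete, here is a minimal sketch of how #finding-N references could be collected and validated. It is not how openvet does it (the real code parses markdown properly, and this sketch ignores things like references inside backticks); the function names are made up, and it assumes findings are numbered from 1 up to the number of findings.

```rust
/// Collect the numbers of all `#finding-N` references in a markdown string.
pub fn finding_refs(markdown: &str) -> Vec<u32> {
    const PREFIX: &str = "#finding-";
    let mut refs = Vec::new();
    let mut rest = markdown;
    while let Some(pos) = rest.find(PREFIX) {
        // Continue scanning right after the prefix we just found.
        rest = &rest[pos + PREFIX.len()..];
        // Take the run of ASCII digits that follows the prefix.
        let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
        if let Ok(n) = digits.parse::<u32>() {
            refs.push(n);
        }
    }
    refs
}

/// Return the references that point at findings that do not exist.
/// Assumption: findings are numbered 1..=finding_count.
pub fn unresolved_findings(markdown: &str, finding_count: u32) -> Vec<u32> {
    finding_refs(markdown)
        .into_iter()
        .filter(|n| *n == 0 || *n > finding_count)
        .collect()
}
```

Note that a plain #34 (for example a pull-request reference) is not picked up at all, which is the point of the longer prefix.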

One issue was found with the claims: sometimes, you want to choose to not evaluate a claim (leave it unasserted). This can happen for a number of reasons: maybe it is not possible for you to evaluate it. Maybe the scope is too large. But you still want to explain in the prose why you didn’t evaluate it. That did not work: openvet would error if you referenced a claim that you did not assert.

So I changed that. You can now reference claims, even if you don’t assert them.

Turning a summary into a report

Initially, audits had a summary field that held a high-level summary of the audit. Iterating on it, I ended up standardizing on a structure: it begins with a high-level explanation of the audited package (what it does), followed by the methodology of the audit (how it was examined, what tools were used), a results section (explaining the outcome of the audit, its claims and findings), and a conclusion section.

That worked, but it turns out there is a small problem: in the web interface, what can we show as an actual summary? I considered extracting the conclusion section from the summary, but a conclusion isn’t really a summary of an audit.

So, I changed the audit wire format. What used to be summary is now called report, and I added an actual summary field, which is enforced to be less than 1024 bytes.
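A size limit like this is measured in bytes, not characters, which matters for non-ASCII text. The sketch below shows one way such a check could look. The type and function names are hypothetical; the 1024-byte limit is the one stated above, and the 512-byte soft cap is the one that appears in the linter output later in this post.

```rust
/// Hard limit from this post: the summary must be less than 1024 bytes.
pub const SUMMARY_HARD_CAP: usize = 1024;
/// Soft cap as printed by the linter example in this post.
pub const SUMMARY_SOFT_CAP: usize = 512;

/// Outcome of the (hypothetical) summary length check.
#[derive(Debug, PartialEq)]
pub enum SummaryCheck {
    Ok,
    /// Over the soft cap: allowed, but worth a warning. Carries the byte length.
    Warning(usize),
    /// At or over the hard limit: rejected. Carries the byte length.
    Error(usize),
}

pub fn check_summary(summary: &str) -> SummaryCheck {
    // `str::len` is the UTF-8 byte length, not the number of characters.
    let len = summary.len();
    if len >= SUMMARY_HARD_CAP {
        SummaryCheck::Error(len)
    } else if len > SUMMARY_SOFT_CAP {
        SummaryCheck::Warning(len)
    } else {
        SummaryCheck::Ok
    }
}
```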

Audit linter

Testing showed that it is easy to forget to assert a claim. This is especially true for the discovery claims, which should ideally all be set to something. Claude forgot to, and so did I. A more structured approach was needed.

Since we have the claims vocabulary (and the relationships between claims), and good markdown parsing abilities (with pulldown-cmark), I thought the best way to handle this was to build a linter. The existing audit validation code was moved from the openvet-proto crate into a dedicated openvet-audit crate. The vocabulary was converted into a TOML document that the linter can validate. So openvet can now enforce a number of things:

All of the discovery claims need to either be asserted, or they need to be referenced in the markdown (a justification of why they could not be evaluated).
Gated on the discovery claims, the applicable audit claims need to either be asserted, or referenced.
Any asserted claim needs to be referenced (explaining how it was evaluated).
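The three rules above are simple set checks once the asserted and referenced claims are known. Here is a rough sketch under stated assumptions: Audit, ClaimRules, and lint_claims are hypothetical names, the data model is heavily simplified (one gate per audit claim), and the message texts are illustrative rather than openvet’s exact wording.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified view of an audit for linting purposes.
pub struct Audit {
    /// Claims the auditor asserted, with their boolean value.
    pub asserted: BTreeMap<String, bool>,
    /// Claims referenced from the markdown via `!claim-name`.
    pub referenced: BTreeSet<String>,
}

impl Audit {
    /// A claim is "covered" if it is asserted or at least referenced.
    fn covers(&self, claim: &str) -> bool {
        self.asserted.contains_key(claim) || self.referenced.contains(claim)
    }
}

/// Hypothetical vocabulary: discovery claims plus gated audit claims.
pub struct ClaimRules {
    pub discovery: Vec<String>,
    /// Pairs of (audit claim, gating discovery claim).
    pub gated: Vec<(String, String)>,
}

pub fn lint_claims(rules: &ClaimRules, audit: &Audit) -> Vec<String> {
    let mut problems = Vec::new();
    // Rule 1: every discovery claim is asserted or referenced.
    for claim in &rules.discovery {
        if !audit.covers(claim) {
            problems.push(format!("'{claim}' is neither asserted nor referenced"));
        }
    }
    // Rule 2: audit claims whose gate is true must be asserted or referenced.
    for (claim, gate) in &rules.gated {
        if audit.asserted.get(gate) == Some(&true) && !audit.covers(claim) {
            problems.push(format!("'{claim}' should be asserted because '{gate}' is true"));
        }
    }
    // Rule 3: every asserted claim must be referenced in the markdown.
    for claim in audit.asserted.keys() {
        if !audit.referenced.contains(claim) {
            problems.push(format!("'{claim}' is asserted but never referenced"));
        }
    }
    problems
}
```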

Similarly, the linter also checks the report. It verifies that the markdown has the expected headings (Subject, Methodology, Results, Conclusion).
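As an illustration of the heading check, the sketch below scans a report for the four expected sections and returns the ones that are missing. It is deliberately naive: openvet uses pulldown-cmark, while this version just looks at lines, so it would be fooled by headings inside code fences. The function name is hypothetical, and it assumes the sections are written as level-2 headings (## Name), which this post does not state.

```rust
/// Section headings the report is expected to contain, per this post.
pub const EXPECTED: [&str; 4] = ["Subject", "Methodology", "Results", "Conclusion"];

/// Return the expected headings that are missing from the report.
/// Assumption: sections are ATX headings at level 2 (`## Name`).
pub fn missing_sections(report: &str) -> Vec<&'static str> {
    // Collect the text of every line that starts with "## ".
    let present: Vec<&str> = report
        .lines()
        .filter_map(|line| line.strip_prefix("## "))
        .map(str::trim)
        .collect();
    EXPECTED
        .iter()
        .copied()
        .filter(|name| !present.contains(name))
        .collect()
}
```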

Making the linter fast and giving it readable output took real effort. I came up with a tree-shaped output. Here’s an example:

$ openvet audit check
note: detected workspace audit-cargo-deranged-0.5.8-7cd812cc2bc1, performing workspace check
audit.pb
├─ ✘ claims
│  ├─ ✘ 'uses-unsfae' is declared but never cited
│  ├─ ✘ 'algorithm-impl-safe' is asserted but 'impl-algorithm' is false or absent
│  ├─ ⚠ 'unsafe-documented' should be asserted because 'uses-unsafe' is true
│  ├─ ⚠ 'unsafe-minimal' should be asserted because 'uses-unsafe' is true
│  ├─ ⚠ 'unsafe-tested' should be asserted because 'uses-unsafe' is true
│  └─ ⚠ 'uses-unsfae' is not in the taxonomy (typo, or a custom claim)
├─ ⚠ summary
│  └─ ⚠ summary is over the soft cap (577 bytes; soft cap 512)
├─ ✘ report
│  ├─ ⚠ unexpected section: ## Assessment (sibling-level structure should be H3+)
│  ├─ ⚠ unexpected H1: # Some Other Heading (only `# Summary` is allowed at H1)
│  ├─ ✘ #finding-99 (no such finding)
│  └─ ⚠ '#1' inside backticks looks like a finding reference; remove the backticks to make it resolve
├─ ✘ findings
│  ├─ ✘ findings[0].description: #99 (no such finding)
│  └─ ⚠ findings[1].description: '#1' inside backticks looks like a finding reference; remove the backticks to make it resolve
├─ ✘ annotations
│  └─ ✘ src/unsafe_wrapper.rs
│     └─ ✘ lines 50-60 exceed file length (36)
└─ ? signatures (not signed yet)

5 errors, 9 warnings
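For readers curious how a tree like this gets drawn: the usual trick is to carry a prefix string down the recursion, adding a vertical rail under every node that still has siblings below it. The following is a generic sketch of that technique, not openvet’s renderer; Node and render are hypothetical names.

```rust
/// A node in a hypothetical check-result tree.
pub struct Node {
    pub label: String,
    pub children: Vec<Node>,
}

impl Node {
    pub fn leaf(label: &str) -> Node {
        Node { label: label.to_string(), children: Vec::new() }
    }

    pub fn branch(label: &str, children: Vec<Node>) -> Node {
        Node { label: label.to_string(), children }
    }
}

/// Render a tree with box-drawing characters, one line per node.
pub fn render(root: &Node) -> String {
    let mut out = String::new();
    out.push_str(&root.label);
    out.push('\n');
    render_children(&root.children, "", &mut out);
    out
}

fn render_children(children: &[Node], prefix: &str, out: &mut String) {
    for (i, child) in children.iter().enumerate() {
        let last = i + 1 == children.len();
        // The last child gets a corner; earlier children get a tee.
        let connector = if last { "└─ " } else { "├─ " };
        out.push_str(prefix);
        out.push_str(connector);
        out.push_str(&child.label);
        out.push('\n');
        // Descendants of a non-last child keep a vertical rail in their prefix.
        let child_prefix = format!("{prefix}{}", if last { "   " } else { "│  " });
        render_children(&child.children, &child_prefix, out);
    }
}
```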

The forward-reference part of this check is wired into more than just the command line: openvet audit sign won’t sign an audit with unresolved references, and the registry’s publish endpoint refuses bundles whose references don’t resolve. Same code, three call sites — what the CLI gates pre-sign matches what the server enforces at publish time.

Querying and requirements

With 111 audits in hand, a different question came up: I now have a pretty cool dataset (all of my dependencies audited, with claims attached), but could I build a way to query that dataset? I wanted to answer questions like:

How many of my dependencies use unsafe?
How many of my dependencies implement cryptography, or use cryptography?

So I built openvet query. OpenVet already has a language (the requirement language), and all I needed was a way to evaluate a requirement expression and pretty-print the output. That part was not so hard.

$ openvet query impl-crypto
✓ cargo:sha1@0.11.0
✓ cargo:siphasher@1.0.3
✓ cargo:ssh-cipher@0.2.0
✓ cargo:subtle@2.6.1
Summary: 4 match, 182 contradicted, 372 unknown, 13 skipped

One pattern for requirements I used a lot looks like this:

(not impl-crypto) or crypto-impl-safe

This translates to: either the package does not implement cryptography, or the cryptography must be safe. In other words: if impl-crypto is true, then crypto-impl-safe must also be true. Because this is a common pattern, I added syntax for it. You can now write:

impl-crypto implies crypto-impl-safe

This works both when querying, and when writing requirements.
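Since a claim may not be asserted by any audit, evaluating a requirement needs three values rather than two. Below is a minimal sketch of implies as (not a) or b under three-valued logic. It is an assumption about the semantics, not a copy of openvet’s evaluator: Tri, Expr, and eval are hypothetical names, and I am assuming that true, false, and unknown correspond to the match, contradicted, and unknown counts in the query summary.

```rust
use std::collections::BTreeMap;

/// Three-valued result. `Unknown` stands for a claim that no audit asserts.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tri {
    True,
    False,
    Unknown,
}

impl Tri {
    pub fn negate(self) -> Tri {
        match self {
            Tri::True => Tri::False,
            Tri::False => Tri::True,
            Tri::Unknown => Tri::Unknown,
        }
    }

    /// True wins over everything; two falses are false; otherwise unknown.
    pub fn or(self, other: Tri) -> Tri {
        match (self, other) {
            (Tri::True, _) | (_, Tri::True) => Tri::True,
            (Tri::False, Tri::False) => Tri::False,
            _ => Tri::Unknown,
        }
    }
}

/// A tiny, hypothetical requirement expression tree.
pub enum Expr {
    Claim(&'static str),
    Not(Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    Implies(Box<Expr>, Box<Expr>),
}

pub fn eval(expr: &Expr, claims: &BTreeMap<&str, bool>) -> Tri {
    match expr {
        Expr::Claim(name) => match claims.get(*name).copied() {
            Some(true) => Tri::True,
            Some(false) => Tri::False,
            None => Tri::Unknown,
        },
        Expr::Not(inner) => eval(inner, claims).negate(),
        Expr::Or(a, b) => eval(a, claims).or(eval(b, claims)),
        // `a implies b` is evaluated exactly like `(not a) or b`.
        Expr::Implies(a, b) => eval(a, claims).negate().or(eval(b, claims)),
    }
}
```

A crate with impl-crypto asserted as false satisfies impl-crypto implies crypto-impl-safe even when crypto-impl-safe was never asserted, because the left side of the or is already true.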

I then wanted a few more things:

By default, openvet query shows all crates that match the requirement. But I wanted a way to show crates that don’t match it, or crates where the requirement evaluates to unknown (because the crate has no audits, or the audits it has don’t assert one of the claims in the requirement).
I also wanted a way to explain how the requirement expression was evaluated.

So I added a --status [match|contradicted|unknown|all] option and an --explain flag. The explain output fits into a tree-like shape, similar to the audit linter.

$ openvet query impl-crypto implies crypto-impl-safe --explain
✓ cargo:zerovec@0.11.6
└─ from xfbs: ✓ true
   └─ any → true
      ├─ not → true
      │  └─ impl-crypto → false
      └─ crypto-impl-safe → unknown
✓ cargo:zerovec-derive@0.11.3
└─ from xfbs: ✓ true
   └─ any → true
      ├─ not → true
      │  └─ impl-crypto → false
      └─ crypto-impl-safe → unknown

openvet check shows you the same for packages that fail your requirements; it just breaks it down per requirement.

Breaking the wire format

By now you may have noticed that 0.5.0 breaks the wire format. Several of the changes above — splitting summary into report and summary, the #1 → #finding-1 rename, the relaxation of which claim references must resolve — are breaking. I could have implemented them in a way that preserves backwards compatibility, but I chose not to. OpenVet does not have users yet, so I can still “move fast and break things”. I figured: if I break the wire format, I might as well break it right. So I snuck a number of simplifications and improvements in there too.

A lot of RFC 3339 timestamps were replaced with a single u64 UNIX timestamp. There is no need to stringly-type time, and this saves parsing and encoding. The keys used by the keyset tree were simplified (they are now just a TaggedHash’s Display output). The Signature type lost the embedded public key.
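To show why the integer form is cheaper to handle, here is a small sketch of the round trip using only the standard library. The helper names are hypothetical, and the sketch assumes second resolution, which this post does not specify.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Encode a point in time as whole seconds since the UNIX epoch.
/// Returns `None` for times before 1970-01-01T00:00:00Z.
pub fn to_unix_secs(t: SystemTime) -> Option<u64> {
    t.duration_since(UNIX_EPOCH).ok().map(|d| d.as_secs())
}

/// Decode seconds since the UNIX epoch back into a `SystemTime`.
/// Returns `None` if the value is too large for the platform's clock type.
pub fn from_unix_secs(secs: u64) -> Option<SystemTime> {
    UNIX_EPOCH.checked_add(Duration::from_secs(secs))
}
```

Compared with an RFC 3339 string, there is no format to parse or validate: the value is one fixed-width integer.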

Adjacent to all of that I added some authoring tooling. Annotations gained a --remove flag — previously, if you set an annotation incorrectly, you had to recreate the audit workspace from scratch. I also added directory-level annotations (a wire-shape change of its own); you can attach an annotation to a whole subtree now, like a has-tests note on the tests/ directory.
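One subtlety with directory-level annotations is deciding which files they cover. A plain string prefix test is wrong (tests would also match tests_old/), so the match has to happen on whole path components. Here is a sketch of that idea; the Annotation struct and notes_for function are hypothetical and not openvet’s wire types.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical annotation attached to a file or a directory.
pub struct Annotation {
    pub path: PathBuf,
    pub note: String,
}

/// Notes that apply to `file`: annotations on the file itself
/// or on any directory above it.
pub fn notes_for<'a>(annotations: &'a [Annotation], file: &Path) -> Vec<&'a str> {
    annotations
        .iter()
        // `Path::starts_with` compares whole components, so an annotation
        // on `tests` does not apply to `tests_old/basic.rs`.
        .filter(|a| file.starts_with(&a.path))
        .map(|a| a.note.as_str())
        .collect()
}
```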

The wire format isn’t perfect yet and there are still things I need to change, but I tried to fold everything I could into this breaking change.

Next steps

There is still a lot on the to-do list for the CLI. Some of the bigger pieces:

The TUI editor could lint as you go. It works, but you have to close it and run openvet audit check to see what’s wrong. Folding the linter into the TUI so it surfaces issues as you type would close the loop.
openvet audit new --diff doesn’t exist yet. Diff audits record metadata stating what previous version and audit were consulted. The metadata for this shipped already, but the tooling for it did not.
Encrypted SSH keys aren’t supported yet. Today openvet only signs with unencrypted on-disk keys; encrypted-key support with a passphrase prompt is a planned addition.
Only sshsig signatures are supported. Other signature formats are on the list but unimplemented. Using SSH keys and signatures is convenient: almost every developer has one, and they support hardware-backed tokens (ed25519-sk). But long-term, I want to support different kinds of keys. I am especially keen to try a WebAuthn-backed flow. The breaking changes in the wire format support that, but I haven’t built it yet.
There is no mirroring story yet. Self-hosting a log on static storage works, but tooling for “mirror an existing registry-hosted log to your own server” hasn’t been built.

I’ll keep posting here as these land.

Find it / report it

Install: cargo install openvet
Source: gitlab.com/openvet-org/openvet
Docs: docs.openvet.org
Issues: the GitLab tracker on the CLI repo. Issues and MRs are very much welcome — this is a one-person project and any extra eyes help.

If you’d rather reach me directly, my GitHub profile has the contact info.