Fossil SCM

Signing and verification of artifacts

3 years, 10 months ago by george

This document works toward more ubiquitous, seamless and useful signing and verification of artifacts.

This is a draft!
It is incomplete. It sketches out a few possible solutions. These solutions try to balance flexibility and complexity.

Agenda and context
==================

The main point is to enable strong authenticity not just for "data in transit" but also for "data at rest". This would enable some interesting features:

  • Trust could be decoupled from centralized CAs (which is nearly inevitable for TLS).

  • Trust could be decoupled from online-managed secret keys (as "cold wallets" of some cryptocurrencies do).

  • Trust could be maintained even when there is a need to use (as yet) unconventional transports like GNUnet, Freenet, IPFS, Dat, NNCP, Pigeon post or Sneakernet.

  • If a [Fossil repository is repurposed as a document] (forum:/forumpost/2ac0171524104616), then that document gets digital signatures "for free".

The above would make Fossil a robust distributed system that by design cannot be matched by any "server-based" service (e.g. GitHub and the like).

The idea is not new. Monotone (which is a kind of ancestor of Fossil) automatically signs every commit. But Monotone seems to be orphaned and does not support all the goodies provided by Fossil (such as a customizable WebUI, Tickets, Wiki, Forum and so on).
The optimal way to implement the feature in Fossil is not obvious, so lengthy discussions about the details can easily be anticipated.

Some related topics have already arisen at the Forum:

Some more recent noteworthy opinions on the topic:

ravbc on 2020-10-22:

IMHO, there is no easy escape from distributing public keys within a repository

offray on 2020-10-23:

I really like the idea of having public keys uploaded to the repository and signed by others in it.

wyoung on 2020-12-13:

All you can do is establish a PKI standard within the set of repos you do control.

wyoung on 2021-09-18:

It is quite unlikely that your Fossil server has a wild assortment of PGP keys

george on 2022-05-29:

Fossil 2.19 should accept structural artifacts with signatures in some prominent (yet undecided) format...

Identity model
==============

Identity is a cryptographically sound avatar of a human being. [Identity is distinguished by the public key of its main keypair] (^ This is essential. It enables the same identity to participate in different projects, even though the owner of that identity was previously registered and is participating in these projects under different UserIDs. See also forum post ae37ac84285. ), which is referred to either directly (for signature schemes with short public keys, such as Ed25519) or through its hash (for signature schemes with long public keys, such as Ed448 and RSA). In both cases a [human-friendly variant of Base32 encoding]1 is used in order to prevent confusion with artifacts' UUIDs and to facilitate verbal transfers (in the context of signing parties and the like).
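The encoding of an identity identifier can be sketched as follows. This is a minimal sketch, assuming Crockford's Base32 (per footnote 1) and, purely for illustration, SHA3-256 as the hash for long public keys — the draft has not settled on a hash function, and the function names are hypothetical.

```python
import hashlib

# Crockford's Base32 alphabet: I, L, O and U are excluded
# to avoid visual confusion between similar glyphs.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def b32_crockford(data: bytes) -> str:
    """Encode bytes as a Crockford Base32 string (no padding)."""
    n = int.from_bytes(data, "big")
    nchars = -(-len(data) * 8 // 5)  # ceil(bits / 5)
    chars = []
    for _ in range(nchars):
        chars.append(CROCKFORD[n & 31])
        n >>= 5
    return "".join(reversed(chars))

def identity_id(public_key: bytes) -> str:
    """Short public keys (e.g. 32-byte Ed25519) are encoded directly;
    longer ones (Ed448, RSA) through their hash. SHA3-256 here is an
    illustrative assumption, not a choice the draft has made."""
    if len(public_key) <= 32:
        return b32_crockford(public_key)
    return b32_crockford(hashlib.sha3_256(public_key).digest())
```

Either way a 32-byte value is encoded, so every identifier comes out at 52 Base32 characters, visually distinct from the hexadecimal UUIDs of artifacts.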

Identity does not expire, but can be explicitly abrogated. Identity's main key may be used to claim that it was compromised or [intentionally destroyed.]3 Also identity's main key may be used to declare a [trusted revoker]2 — a public key that is authorized to claim that identity's main key is lost, destroyed or compromised. An identity may recover from the former claim using its main key, while in the latter two cases the whole identity is permanently abrogated. A trusted revoker may be a key that is under the exclusive control of identity's owner or may be the main key of some other identity. In both cases the authorization of the trusted revoker may have an expiration time set and may also be limited to just some of the claims (for example, only "lost" and "destroyed" claims may be authorized). A trusted revoker need not be public unless it is used.

A set of projects that are relevant for a particular identity will be denoted as identity's context.

Identity's context is partitioned into workspaces. This means that a workspace is a subset of the projects relevant for that identity, and that at any moment of time no two workspaces overlap. However, projects may be added to or removed from workspaces over time. Identity's context may consist of just a single workspace. Similarly, a workspace may consist of just a single project.
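The partition invariant above can be stated as a small check. This is a sketch only: project identifiers are modeled as plain strings and the function name is hypothetical.

```python
def workspaces_form_partition(context, workspaces):
    """Check that the workspaces cover the identity's whole context
    and that no two of them overlap (all arguments are sets of
    project identifiers)."""
    seen = set()
    for ws in workspaces:
        if seen & ws:          # two workspaces share a project
            return False
        seen |= ws
    return seen == context     # every relevant project is assigned

# A context of three projects split into two disjoint workspaces:
ctx = {"proj-a", "proj-b", "proj-c"}
ok = workspaces_form_partition(ctx, [{"proj-a"}, {"proj-b", "proj-c"}])
bad = workspaces_form_partition(ctx, [{"proj-a", "proj-b"}, {"proj-b", "proj-c"}])
```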

A person may have just one identity or may choose to maintain several identities (perhaps with differently organized workspaces). [It is advised to have as few identities as is reasonable] (^ An underlying conjecture is that this should help to improve the connectedness of the global Web of Trust. ); this model tries to be sufficiently flexible in order to permit that.

Workspace subkeys are used for general-purpose signing of structural artifacts (check-ins, posts, ticket changes, wiki edits etc.). Each workspace subkey is limited in scope to a particular workspace and must be neither used nor propagated outside of that workspace. Each workspace subkey has an expiration date set. (^ Whenever a workspace subkey is introduced, prolonged or rotated there is an upper bound for the eligible lifespan. The exact optimal lifespan depends on the workspace. A lifespan of 14 months is suggested as a hard-coded maximum. ) A workspace subkey may be used to revoke itself.
In the following, "subkey" or "work key" is used as a short form of "workspace subkey".
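The hard upper bound on a subkey's lifespan could be enforced along these lines. The 14-month cap comes from the note above, while approximating a month as 31 days and the function name are assumptions of this sketch.

```python
from datetime import date, timedelta

# Suggested hard-coded maximum lifespan for a workspace subkey.
# Approximating 14 months as 14 * 31 days is an assumption of this
# sketch; the draft does not define an exact day count.
MAX_LIFESPAN = timedelta(days=14 * 31)

def clamp_expiration(introduced: date, requested_expiry: date) -> date:
    """On introduction, prolongation or rotation, the new expiration
    date must not exceed the hard-coded maximum lifespan."""
    hard_limit = introduced + MAX_LIFESPAN
    return min(requested_expiry, hard_limit)
```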

A particular identity at any moment of time may have just one active workspace subkey within any workspace. In other words: several workspace subkeys of a particular identity must not be used simultaneously within any project. If such a clash is observed, then the identity should be treated as misbehaving and suspicious. (^ It may be tempting to allow several simultaneous workspace subkeys within a project. In that case each device could use a dedicated workspace subkey. Thus, if a leak of the corresponding secret key occurred, it would be possible to identify (and fix) the device that permitted the leak. However, this looks like a significant complication of the model, which for the time being seems neither necessary nor desirable. )
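Detecting such a clash is straightforward once the observed signatures are collected. This sketch ignores validity intervals for brevity (a real check would only flag subkeys whose lifetimes actually overlap); the data layout and function name are hypothetical.

```python
from collections import defaultdict

def find_subkey_clashes(signatures):
    """Given observed (identity, project, subkey) triples, report the
    (identity, project) pairs where more than one workspace subkey
    was used -- which the model treats as suspicious.
    Simplification: validity intervals are ignored."""
    used = defaultdict(set)
    for identity, project, subkey in signatures:
        used[(identity, project)].add(subkey)
    return {pair for pair, keys in used.items() if len(keys) > 1}

observed = [
    ("alice", "proj-a", "wk1"),
    ("alice", "proj-a", "wk2"),   # clash: two subkeys in one project
    ("bob",   "proj-a", "wk3"),
]
```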

It is assumed that the safety of the main key is maintained on a higher level than the safety of the workspace subkeys; and that safety of trusted revoker(s) (if any) is somewhere in between.

Auxiliary definitions
=====================

  • Key
    Either secret or public key depending on the context.

  • Owner of a key
    A person who generated a keypair (presumably in a secure environment).

  • Signed artifact
    A structural artifact with a cryptographically valid digital signature.

  • Legitimate artifact
    A signed artifact that was created according to the conscious desire of a key's owner.

  • Counterfeited artifact
    A signed artifact that was created without the conscious consent or desire of a key's owner.

  • Leak of a key
    A copy of a secret key is or has been accessible to someone other than the owner (who may remain unaware of that).

  • Key compromise
    Probability of a key's leak is not negligible.

  • Key is lost
    Owner is unable to retrieve a copy of a secret key. Usual reasons include the loss of a storage medium, a forgotten passphrase, or the inability to gather enough shares of a distributed secret.

  • Key is destroyed
    It is guaranteed that neither the owner nor anybody else will ever be able to retrieve a copy of a secret key. A breakthrough in cryptanalysis (for example, a discrete-log problem being broken) doesn't count as "retrieving".

  • Claim
    A proposition signed by identity's main key. An identity that makes (signs) a claim will be referred to as the claim's source.

  • Unitary claim
    Is one of the following:

    • introduction, expiration, prolongation, rotation or revocation of identity's workspace subkey;

    • abrogation of the source by itself;

    • certificate of the trusted revoker.

  • [Binary]4 claim
    Represents a quantified proposition about some other identity; this other identity will be referred to as the claim's [destination]5.
    Abrogation of the destination by a trusted revoker may be viewed as [a special case of a binary claim]6, provided that the trusted revoker is equal to the source.

  • Identity is inhibited
    Owner may be unable to create or propagate a full set of legitimate artifacts; this may be caused by:

    1. loss of secret key(s)
    2. lack of infrastructure
    3. blackmail
    4. gag order
    5. health conditions
    6. owner's death

  • Identity is disconnected
    Identity is inhibited, or the owner may be unable to receive a full set of artifacts generated within all projects relevant to that identity.

  • Identity is disintegrated
    Counterfeiting of artifacts signed by identity's workspace subkey either has already occurred or is anticipated.

  • Identity is stolen
    Loss of exclusive control over identity's main key.

Trust model
===========

The system should try to answer the ultimate inquiry from a user:

Is this particular signed artifact legitimate or counterfeited?

The answer to this question is guaranteed to be "it is legitimate" if and only if, at the moment when that signed artifact was created,

  • the identity wasn't inhibited, and
  • the identity had exclusive control over the corresponding secret key.

The reality is more complicated because often there is a bit of uncertainty. The system should derive a probabilistic answer based on estimates of the probabilities for the values of the above predicates.

That calculation might use reasoning about the possible [temporal]7 sequence of events and also the claims from identities within the relevant project(s).

Propositions within binary claims fall into one of three categories:

  1. Connectedness
    This is quantified as ERL (short for expected response lag), which estimates the typical duration of information [roundtrips]8.
    It sums up durations that are needed for

    • workspace's new information to reach claim's destination,
    • destination to understand this information and prepare a response,
    • that response to reach substantial part of workspace's participants.
  2. Integrity
    Encapsulates the safety of a particular workspace subkey and also the willingness of the destination's owner to revoke or rotate a subkey immediately upon discovery of a key compromise.

    This is quantified as a transient probability that a signed artifact is legitimate. It is a triple of scalars, where each scalar estimates the aforementioned probability at a certain moment:

    • right after a signed artifact has been received,
    • a moment that is [two ERLs later]9,
    • a moment that is [five ERLs after an artifact was received]9

    If the corresponding workspace subkey is revoked then all these probabilities are invalidated.

  3. Trustworthiness
    Estimate of the trust that the source puts in pairwise claims signed by the destination.

    This is quantified as an integer in the range [-3;+3] which represents a bias of a claim's destination relative to its source on the abstract axis "trustworthiness". This abstract axis encapsulates and integrates three very different characteristics of a human being:

    • safety
      — ability to prevent counterfeiting of claims (through a leak of the main key in particular);
        this aggregates

      • severity of threats
      • willingness to resist
      • resources for defense (such as skills, laws, money, etc.)
    • perspicacity
      — ability to deduce the truth; about other identities in particular.

    • honesty
      — intolerance to the falsity of one's own propositions; one's own claims in particular.

    The integer values of 0, ±1, ±2 and ±3 may be interpreted as "same", "slightly", "noticeably" and "much" respectively.

A claim with proposition about trustworthiness will be referred to as t-claim. T-claim is propagated to all projects that are relevant for both the source and the destination. T-claims form a global "social graph".

A claim with propositions about connectedness and integrity will be referred to as ci-claim. CI-claim is propagated to all projects that

  • are relevant for both ends of the claim, and that
  • belong to the corresponding workspace (the one the propositions are about).
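The two propagation rules amount to simple set intersections. A sketch, with project identifiers as strings and hypothetical function names:

```python
def t_claim_projects(source_ctx, dest_ctx):
    """A t-claim propagates to every project relevant for both
    the source and the destination."""
    return source_ctx & dest_ctx

def ci_claim_projects(source_ctx, dest_ctx, workspace):
    """A ci-claim additionally stays within the workspace that
    its propositions are about."""
    return source_ctx & dest_ctx & workspace

src = {"proj-a", "proj-b", "proj-c"}
dst = {"proj-b", "proj-c", "proj-d"}
ws  = {"proj-c"}               # the workspace the ci-claim is about
```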

For a given signed artifact it is possible to estimate its legitimacy, provided that the "social graph" contains a path from the identity who makes the inquiry to the identity who signed that artifact.
The probability that a signed artifact is legitimate may be computed for an arbitrary moment of time as a weighted average of the approximated integrities from the available ci-claims.
The aforementioned weights are derived from the t-claims using a computation over the underlying "social graph". This computation starts from the identity who makes the inquiry and computes the weights of other identities in a [BFS-like]10 manner, until the author of the artifact is reached.
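One possible reading of this computation is sketched below. The draft leaves the exact weight derivation open, so mapping a trust integer t to the multiplicative weight 2**t, and attenuating weights along BFS paths, are assumptions of this sketch only.

```python
from collections import deque

def legitimacy_estimate(inquirer, t_claims, ci_claims):
    """BFS over the t-claim graph from the inquirer, assigning each
    reached identity a weight; then a weighted average of the
    integrity estimates from ci-claims about the signer.

    t_claims:  {source: {dest: trust_int}} with trust_int in [-3, +3].
    ci_claims: {source: probability} -- the relevant scalar from the
               integrity triple, claimed by `source` about the signer.

    Returns None if no ci-claim source is reachable.
    """
    weight = {inquirer: 1.0}
    queue = deque([inquirer])
    while queue:
        node = queue.popleft()
        for dest, trust in t_claims.get(node, {}).items():
            if dest not in weight:        # BFS: first (shortest) reach wins
                weight[dest] = weight[node] * 2.0 ** trust
                queue.append(dest)
    num = sum(weight[s] * p for s, p in ci_claims.items() if s in weight)
    den = sum(weight[s] for s in ci_claims if s in weight)
    return num / den if den else None

# "carol" is trusted as much as oneself (0), "dave" slightly less (-1):
t_claims = {"me": {"carol": 0, "dave": -1}}
ci_claims = {"carol": 0.99, "dave": 0.5}
est = legitimacy_estimate("me", t_claims, ci_claims)
```

Here carol's weight is 1.0 and dave's is 0.5, so the estimate is (1.0·0.99 + 0.5·0.5) / 1.5 ≈ 0.83.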

Footnotes
=========


  1. Something like Crockford's Base32 encoding

  2. It's yet unclear which word is more appropriate: "trusted" or "designated". 

  3. This is a bit speculative because the signing of the "intentionally destroyed" claim has to precede the actual destruction of the last copy of a secret key; and that actual destruction may fail silently. 

  4. It's unclear which word is more appropriate: "binary", "pairwise" or some other. 

  5. It's unclear which word is more appropriate: "destination", "target" or some other. 

  6. This special case of a binary claim may be viewed as a claim about trustworthiness.

  7. The notion of "when" is rather complicated for a distributed system without a single source of trusted timestamps. The only thing that can be guaranteed is that the knowledge of the output of a secure hash function can not precede the knowledge of the corresponding input. 

  8. The notion of "roundtrip" is blurry if there is no central server. In that case it is more about dissipation of information in "both directions" through the network of retransmitters (not all of which are necessarily participants of a project). 

  9. The exact values of these delays are debatable. It is assumed that two ERLs might be enough for the destination to react to impersonation, and five ERLs might be enough for a reaction from a trusted revoker or other participants of the workspace.
    If the delay is modeled by an Erlang-2 distribution, then two ERLs give a 91% probability that the response has been received. 

  10. Breadth-first search. Proceeds like an expanding concentric wave on the water. 
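The 91% figure from footnote 9 can be checked numerically: for an Erlang-2 distribution with mean equal to one ERL (shape k = 2, rate lam = 2), the CDF evaluated at two ERLs is about 0.908.

```python
import math

def erlang2_cdf(t: float, mean: float = 1.0) -> float:
    """CDF of an Erlang-2 distribution with the given mean:
    shape k = 2, rate lam = 2 / mean, so
    F(t) = 1 - exp(-lam*t) * (1 + lam*t)."""
    lam = 2.0 / mean
    return 1.0 - math.exp(-lam * t) * (1.0 + lam * t)

# Response lag modeled as Erlang-2 with mean 1 ERL; waiting two ERLs
# covers about 91% of responses, matching footnote 9:
p = erlang2_cdf(2.0)   # ~0.908
```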
