# Is Fossil A Blockchain?

The Fossil version control system shares many similarities with
blockchain-based technologies, but it also differs from the more common
sorts of blockchains. This document discusses the term’s
applicability, so you can decide whether applying the term to Fossil
makes sense to you.

## The Dictionary Argument

The [Wikipedia definition of "blockchain"][bcwp] begins:

> "A blockchain…is a growing list of records, called blocks, which are linked using
> cryptography… Each block contains a cryptographic hash of the previous
> block, a timestamp, and transaction data (generally represented as a Merkle tree)."

Point-for-point, Fossil follows this partial definition.
The blocks
are Fossil’s ["manifest" artifacts](./fileformat.wiki#manifest). Each
manifest other than the root has a cryptographically-strong [SHA-1] or [SHA-3] hash linking it to
one or more “parent” blocks. The manifest also contains a timestamp and
the transactional data needed to express a commit to the repository.
To traverse the Fossil repository from the tips of its [DAG] to the
root by following the parent hashes in each manifest is to traverse
a Merkle tree.
Every change in Fossil starts by adding one or more manifests to
the repository, extending this Merkle tree.

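The hash-linking described above can be sketched in a few lines of Python. This is a toy model only: the `C`/`D`/`P` line layout, the function names, and the in-memory `repo` dictionary are assumptions made for illustration, not Fossil’s actual manifest format or storage scheme.

```python
import hashlib


def artifact_id(content: bytes) -> str:
    """Name an artifact by the SHA3-256 hash of its content."""
    return hashlib.sha3_256(content).hexdigest()


def make_manifest(comment: str, timestamp: str, parents: list) -> bytes:
    """Build a toy manifest. This layout is illustrative, not Fossil's real format."""
    lines = [f"C {comment}", f"D {timestamp}"]
    lines += [f"P {parent}" for parent in parents]  # hash links to parent blocks
    return "\n".join(lines).encode()


def parents_of(manifest: bytes) -> list:
    """Extract the parent IDs recorded in a toy manifest."""
    return [ln[2:] for ln in manifest.decode().splitlines() if ln.startswith("P ")]


def walk_to_root(repo: dict, tip: str) -> list:
    """Follow parent hashes from a tip to the root, re-hashing every manifest."""
    visited, pending = [], [tip]
    while pending:
        aid = pending.pop()
        if aid in visited:
            continue
        if artifact_id(repo[aid]) != aid:
            raise ValueError(f"artifact {aid[:10]} does not match its hash")
        visited.append(aid)
        pending.extend(parents_of(repo[aid]))
    return visited


# Build a three-commit linear history in an in-memory "repository".
repo = {}
root = make_manifest("initial commit", "2024-01-01T00:00:00", [])
repo[artifact_id(root)] = root
second = make_manifest("add feature", "2024-01-02T00:00:00", [artifact_id(root)])
repo[artifact_id(second)] = second
third = make_manifest("fix bug", "2024-01-03T00:00:00", [artifact_id(second)])
repo[artifact_id(third)] = third
```

Calling `walk_to_root(repo, artifact_id(third))` visits the three manifests from tip to root. Because each manifest embeds its parent’s hash, altering any byte of an older manifest makes the walk fail at that artifact.
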
[bcwp]: https://en.wikipedia.org/wiki/Blockchain
[DAG]: https://en.wikipedia.org/wiki/Directed_acyclic_graph
[SHA-1]: https://en.wikipedia.org/wiki/SHA-1
[SHA-3]: https://en.wikipedia.org/wiki/SHA-3

<a id="currency"></a>
## Cryptocurrency

Because blockchain technology was first popularized by Bitcoin, many
people associate the term with cryptocurrency. Fossil has nothing to do
with cryptocurrency, so a claim that “Fossil is a blockchain” may fail
to communicate the speaker’s meaning clearly due to conflation with
cryptocurrency.

Cryptocurrency has several features and requirements that Fossil doesn’t
provide, either because it doesn’t need them or because we haven’t
gotten around to creating the feature. Whether these are essential to
the definition of “blockchain” and thus disqualify Fossil as a blockchain
is for you to decide.

Cryptocurrencies must prevent three separate types of fraud to be useful:

* **Type 1** is modification of existing currency. To draw an analogy
  to paper money, we wish to prevent someone from using green and
  black markers to draw extra zeroes on a US $10 bill so that it
  claims to be a $100 bill.

* **Type 2** is creation of new fraudulent currency that will pass
  in commerce. To extend our analogy, it is the creation of new
  US $10 bills. There are two sub-types to this fraud. In terms of
  our analogy, they are:

    * **Type 2a**: copying an existing legitimate $10 bill<br><br>

    * **Type 2b**: printing a new $10 bill that is unlike an existing
      legitimate one, yet which will still pass in commerce

* **Type 3** is double-spending existing legitimate cryptocurrency.
  There is no analogy in paper money due to its physical form; it is a
  problem unique to digital currency due to its infinitely-copyable
  nature.

How does all of this compare to Fossil?

1. <a id="signatures"></a>**Signatures.** Cryptocurrencies use a chain
    of [digital signatures][dsig] to prevent Type 1 and Type 3 frauds. This
    chain forms an additional link between the blocks, separate from the
    hash chain that applies an ordering and lookup scheme to the blocks.
    [_Blockchain: Simple Explanation_][bse] explains this “hash chain”
    vs. “block chain” distinction in more detail.

    These signatures prevent modification of the face value of each
    transaction (Type 1 fraud) by ensuring that only the one signing a
    new block has the private signing key that could change an issued
    block after the fact.

    The fact that these signatures are also *chained* prevents Type
    3 frauds by making the *prior* owner of a block sign it over to
    the new owner. To avoid an O(n²) auditing problem as a result,
    cryptocurrencies add a separate chain of hashes to make checking
    for double-spending quick and easy.

    Fossil has [a disabled-by-default feature][cs] to call out to an
    external copy of [PGP] or [GPG] to sign commit manifests before
    inserting them into the repository. You can couple that with
    a server-side [after-receive hook][arh] to reject unsigned commits.

    Although there are several distinctions you can draw between the way
    Fossil’s commit signing scheme works and the way block signing works
    in cryptocurrencies, only one is of material interest for our
    purposes here: Fossil commit signatures apply only to a single
    commit. Fossil does not sign one commit over to the next “owner” of
    that commit in the way that a blockchain-based cryptocurrency must
    when transferring currency from one user to another, because there
    is no useful analog to the double-spending problem in Fossil. The
    closest you can come to this is double-insert of commits into the
    blockchain, which we’ll address shortly.

    What Fossil commit signatures actually do is provide in-tree forgery
    prevention, both Type 1 and Type 2. You cannot modify existing
    commits (Type 1 forgery) because you do not have the original
    committer’s private signing key, and you cannot forge new commits
    purporting to come from some other trusted committer (Type 2) because
    you don’t have any of their private signing keys, either.
    Cryptocurrencies use work contests to prevent Type 2
    forgeries, but the application of that to Fossil is a matter we get
    to [later](#work).

    Although you have complete control over the contents of your local
    Fossil repository clone, you cannot perform Type 1 forgery on its
    contents short of executing a [preimage attack][prei] on the hash
    algorithm ([SHA3-256][SHA-3] by default in the current version of
    Fossil). Even if you could, Fossil’s sync protocol will prevent the
    modification from being pushed into another repository: the remote
    Fossil instance says, “I’ve already got that one, thanks,” and
    ignores the push. Thus, short of breaking into the remote server
    and modifying the repository in place, you couldn’t make use of
    a preimage attack even if you had that power. Further, that would be an attack on the
    server itself, not on Fossil’s data structures, so while it is
    useful to think through this problem, it is not helpful in answering
    our questions here.

    The Fossil sync protocol’s duplication detection also prevents the closest analog to Type 3
    frauds in Fossil: copying a commit manifest in your local repo clone
    won’t result in a double-commit on sync.

    In the absence of digital signatures, Fossil’s [RBAC system][caps]
    restricts Type 2 forgery to trusted committers. Thus once again
    we’re reduced to an infosec problem, not a data structure design
    question.

    (Inversely, enabling commit clearsigning is a good idea
    if you have committers on your repo whom you don’t trust not to
    commit Type 2 frauds. But let us be clear: your choice of setting
    does not answer the question of whether Fossil is a blockchain.)

    If Fossil signatures prevent Type 1 and Type 2 frauds, you
    may wonder why they are not enabled by default. It is because
    they are defense-in-depth measures, not the minimum sufficient
    measures needed to prevent repository fraud, unlike the equivalent
    protections in a cryptocurrency blockchain. Fossil provides its
    primary protections through other means, so it doesn’t need to
    mandate signatures.

    Also, Fossil is not itself a [PKI], and there is no way for regular
    users of Fossil to link it to a PKI, since doing so would likely
    result in an unwanted [PII] disclosure. There is no email address
    in a Fossil commit manifest that you could use to query one of the
    public PGP keyservers, for example. It therefore becomes a local
    policy matter as to whether you even *want* to have signatures,
    because they’re not without their downsides.

2. <a id="work"></a>**Work Contests.** Cryptocurrencies prevent Type 2b forgeries
    by setting up some sort of contest that ensures that new coins can come
    into existence only by doing some difficult work task. This “mining”
    activity results in a coin that took considerable work to create,
    which thus has economic value by being a) difficult to re-create,
    and b) resistant to [debasement][dboc].

    Fossil repositories are most often used to store the work product of
    individuals, rather than cryptocoin mining machines. There is
    generally no contest in trying to produce the most commits. There
    may be an implicit contest to produce the “best” commits, but that
    is a matter of project management, not something that can be
    automatically mediated through objective measures.

    Incentives to commit to the repository come from outside of Fossil;
    they are not inherent to its nature, as with cryptocurrencies.
    Moreover, there is no useful sense in which we could say that one
    commit “re-creates” another. Commits are generally products of
    individual human intellect, thus necessarily unique in all but
    trivial cases. This is foundational to copyright law.

3. <a id="lcr"></a>**Longest Chain Rule.** Cryptocurrencies generally
    need some way to distinguish which blocks are legitimate and which
    are not. They do this in part by identifying the linear chain with the
    greatest cumulative [work time](#work) as the legitimate chain. All
    blocks not on that linear chain are considered “orphans” and are
    ignored by the cryptocurrency software.

    Its inverse is sometimes called the “51% attack” because a single
    actor would have to do slightly more work than the entire rest of
    the community using a given cryptocurrency in order for their fork
    of the currency to be considered the legitimate fork. This argument
    soothes concerns that a single bad actor could take over the
    network.

    The closest we can come to that notion in Fossil is the default
    “trunk” branch, but there’s nothing in Fossil that delegitimizes
    other branches just because they’re shorter, nor is there any way in
    Fossil to score the amount of work that went into a commit. Indeed,
    [forks and branches][fb] are *valuable and desirable* things in
    Fossil.

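The sync behavior described under “Signatures” above, where the remote ignores any artifact whose ID it already holds, can be modeled roughly as follows. This is a minimal sketch under stated assumptions: the `push` function and the dictionaries standing in for repositories are hypothetical, and the real sync protocol does considerably more, including verifying what it receives.

```python
import hashlib


def artifact_id(content: bytes) -> str:
    """Name an artifact by the SHA3-256 hash of its content."""
    return hashlib.sha3_256(content).hexdigest()


def push(local: dict, remote: dict) -> list:
    """Offer every local artifact to the remote; return the IDs it accepted."""
    accepted = []
    for aid, content in local.items():
        if aid in remote:
            continue  # remote: "I've already got that one, thanks"
        remote[aid] = content
        accepted.append(aid)
    return accepted


original = b"commit: fix typo in README"
remote = {artifact_id(original): original}
local = dict(remote)  # a fresh clone holds the same artifacts

# Simulate the outcome of a successful preimage attack: different content
# filed under the existing ID, in the local clone only.
local[artifact_id(original)] = b"commit: add backdoor"

# Also add one genuinely new commit locally.
new_commit = b"commit: add tests"
local[artifact_id(new_commit)] = new_commit
```

In this model, `push(local, remote)` transfers only the new commit. The forged content never replaces the remote’s copy, and pushing a second time transfers nothing, which is the closest Fossil analog to refusing a double-spend.
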
This much is certain: Fossil is definitely not a cryptocurrency. Whether
this makes it “not a blockchain” is a subjective matter.

[arh]: ./hooks.md
[bse]: https://www.researchgate.net/publication/311572122_What_is_Blockchain_a_Gentle_Introduction
[caps]: ./caps/
[cs]: /help/clearsign
[dboc]: https://en.wikipedia.org/wiki/Debasement
[dsig]: https://en.wikipedia.org/wiki/Digital_signature
[fb]: ./branching.wiki
[GPG]: https://gnupg.org/
[PGP]: https://www.openpgp.org/
[PII]: https://en.wikipedia.org/wiki/Personal_data
[PKI]: https://en.wikipedia.org/wiki/Public_key_infrastructure
[pow]: https://en.wikipedia.org/wiki/Proof_of_work
[prei]: https://en.wikipedia.org/wiki/Preimage_attack

<a id="dlt"></a>
## Distributed Ledgers

Cryptocurrencies are an instance of [distributed ledger technology][dlt]. If
we can convince ourselves that Fossil is also a distributed
ledger, then we might think of Fossil as a peer technology,
having at least some qualifications toward being considered a blockchain.

A key tenet of DLT is that records be unmodifiable after they’re
committed to the ledger, which matches quite well with Fossil’s design
and everyday use cases. Fossil puts up multiple barriers to prevent
modification of existing records and injection of incorrect records.

Yet, Fossil also has [purge] and [shunning][shun]. Doesn’t that mean
Fossil cannot be a distributed ledger?

These features only remove existing commits from the repository. If you want a
currency analogy, they are ways to burn a paper bill or to melt a [fiat
coin][fc] down to slag. In a cryptocurrency, you can erase your “wallet”
file, effectively destroying money in a similar way. These features
do not permit forgery of either type described above: you can’t use them
to change the value of existing commits (Type 1) or add new commits to
the repository (Type 2).

What if we removed those features from Fossil, creating an append-only
Fossil variant? Is it a DLT then? Arguably still not, because [today’s Fossil
is an AP-mode system][ctap], which means
there can be no guaranteed consensus on the content of the ledger at any
given time. An AP-mode accounts receivable system would allow
different bottom-line totals at different sites, because you’ve
cast away “C” to get AP-mode operation. (See the prior link or
[Wikipedia’s article on the CAP theorem][cap] if you aren’t following
this terminology.)

By the same token, you cannot guarantee that the command
“`fossil info tip`” gives the same result everywhere. You would need to
recast Fossil as a CA or CP-mode system to solve that.
(In the CP case, “everywhere” means every node not
partitioned away from the majority of the network.)

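To make the AP-mode point concrete, here is a toy model of two replicas that accept commits while partitioned. The `tip` and `sync` helpers, the commit names, and the integer timestamps are assumptions for illustration; `tip` merely stands in for what “`fossil info tip`” reports, taken here to be the most recent commit.

```python
def tip(replica: dict) -> str:
    """Return the commit with the latest timestamp in this replica."""
    return max(replica, key=lambda cid: replica[cid])


def sync(a: dict, b: dict) -> None:
    """Bring two replicas to the union of their commits."""
    merged = {**a, **b}
    a.update(merged)
    b.update(merged)


# Both sites start from the same history: commit ID -> timestamp.
site_a = {"c1": 100}
site_b = dict(site_a)

# While the sites are partitioned, each accepts a commit (availability).
site_a["c2-from-a"] = 200
site_b["c2-from-b"] = 300
tips_while_partitioned = (tip(site_a), tip(site_b))

# Once the partition heals, a sync makes the replicas agree again.
sync(site_a, site_b)
tips_after_sync = (tip(site_a), tip(site_b))
```

The two tips differ during the partition and agree only after the sync, so agreement is eventual rather than guaranteed at any given moment.
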
What are the prospects for CA-mode or CP-mode Fossil? [We don’t want
CA-mode Fossil][ctca], but [CP-mode could be useful][ctcp]. Until the latter
exists, this author believes Fossil is not a distributed ledger in a
technologically defensible sense.

The most common technologies answering to the label “blockchain” are all
DLTs, so if Fossil is not a DLT, then it is not a blockchain in that
sense.

[ctap]: ./cap-theorem.md#ap
[ctca]: ./cap-theorem.md#ca
[ctcp]: ./cap-theorem.md#cp
[cap]: https://en.wikipedia.org/wiki/CAP_theorem
[dlt]: https://en.wikipedia.org/wiki/Distributed_ledger
[DVCS]: https://en.wikipedia.org/wiki/Distributed_version_control
[fc]: https://en.wikipedia.org/wiki/Fiat_money
[purge]: /help/purge
[shun]: ./shunning.wiki

<a id="dpc"></a>
## Distributed Partial Consensus

If we can’t get DLT, can we at least get some kind of distributed
consensus at the level of individual Fossil commits?

Many blockchain-based technologies have this property: given some
element of the blockchain, you can make certain proofs that it either is
a legitimate part of the whole blockchain, or it is not.

Unfortunately, this author doesn’t see a way to do that with Fossil.
Given only one “block” in Fossil’s putative “blockchain” (a commit, in
Fossil terminology), all you can prove is whether it is internally
consistent, that it is not corrupt. That then points you at the parent(s) of that
commit, which you can repeat the exercise on, back to the root of the
DAG. This is what the enabled-by-default [`repo-cksum` setting][rcks]
does.

If cryptocurrencies worked this way, you wouldn’t be able to prove that
a given cryptocoin was legitimate without repeating the proof-of-work
calculations for the entire cryptocurrency scheme! Instead, you only
need to check a certain number of signatures and proofs-of-work in order
to be reasonably certain that you are looking at a legitimate section of
the whole blockchain.

What would it even mean to prove that a given Fossil commit “*belongs*”
to the repository you’ve extracted it from? For a software project,
isn’t that tantamount to automatic code review, where the server would
be able to reliably accept or reject a commit based solely on its
content? That sounds nice, but this author believes we’ll need to invent
[AGI] first.

A better method to provide distributed consensus for Fossil would be to
rely on the *natural* intelligence of its users: that is, distributed
commit signing, so that a commit is accepted into the blockchain only
once some number of users countersign it. This amounts to a code review
feature, which Fossil doesn’t currently have.

Solving that problem basically requires solving the [PKI] problem first,
since you can’t verify the proofs of these signatures if you can’t first
prove that the provided signatures belong to people you trust. This is a
notoriously hard problem in its own right.

A future version of Fossil could instead provide [consensus in the CAP
sense][ctcp]. For instance, you could say that if a quorum of servers
all have a given commit, it “belongs.” Fossil’s strong hashing tech
would mean that querying whether a given commit is part of the
“blockchain” would be as simple as going down the list of servers and
sending each an HTTP GET `/info` query for the artifact ID, concluding
that the commit is legitimate once you get enough HTTP 200 status codes back. All of this is
hypothetical, because Fossil doesn’t do this today.

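The quorum idea above might look something like the following sketch. It is as hypothetical as the proposal itself: `commit_belongs`, `http_status`, the `/info/<artifact ID>` URL layout, and the assumption that a server answers 200 only for artifacts it holds are all illustrative choices, not a description of anything Fossil does today.

```python
import urllib.error
import urllib.request


def http_status(url: str):
    """Return the HTTP status code for a GET of url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 when the server lacks the artifact
    except OSError:
        return None  # network failure: this server casts no vote


def commit_belongs(servers, artifact_id, quorum, get_status=http_status):
    """Return True once at least `quorum` servers answer 200 for the artifact."""
    hits = 0
    for base in servers:
        url = f"{base.rstrip('/')}/info/{artifact_id}"
        if get_status(url) == 200:
            hits += 1
            if hits >= quorum:
                return True
    return False
```

Passing the status function as a parameter lets you exercise the quorum logic with a stub, without touching the network. Note that this only shows that enough servers *hold* the commit; it says nothing about whether those servers deserve your trust.
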
[AGI]: https://en.wikipedia.org/wiki/Artificial_general_intelligence
[rcks]: /help/repo-cksum

<a id="anon"></a>
## Anonymity

Many blockchain-based technologies go to extraordinary lengths to
allow anonymous use of their service.

As typically configured, Fossil does not: commits synced between servers
always at least have a user name associated with them, which the remote
system must accept through its [RBAC system][caps]. That system can run
without having the user’s email address, but it’s needed if [email
alerts][alert] are enabled on the server. The remote server logs the IP
address each commit came from for security reasons. That coupled with the
timestamp on the commit could sufficiently deanonymize users in many
common situations.

It is possible to configure Fossil so it doesn’t do this:

* You can give [Write capability][capi] to user category “nobody,” so
  that anyone who can reach your server can push commits into its
  repository.

* You could give that capability to user category “anonymous” instead,
  which requires that the user log in with a CAPTCHA, but which doesn’t
  require that the user otherwise identify themselves.

* You could enable [the `self-register` setting][sreg] and choose not to
  enable [commit clear-signing][cs] so that anonymous users could push
  commits into your repository under any name they want.

On the server side, you can also [scrub] the logging that remembers
where each commit came from.

Commit source info isn’t transmitted from the remote server on clone or pull:
the size of the `rcvfrom` table after initial clone is 1, containing
only the remote server’s IP address. On each pull containing new
artifacts, your local `fossil` instance adds another entry to this
table, likely with the same IP address unless the server has moved or
you’re using [multiple remotes][mrep]. This table is far more
interesting on the server side, containing the IP addresses of all
contentful pushes; thus [the `scrub` command][scrub].

Because Fossil doesn’t
remember IP addresses in commit manifests or require commit signing, it
allows at least *pseudonymous* commits. When someone clones a remote
repository, they don’t learn the email address, IP address, or any other
sort of [PII] of prior committers, on purpose.

Some people say that private, permissioned blockchains (as you may
imagine Fossil to be) are inherently problematic for the very reason that
they don’t bake anonymous contribution into their core. The very
existence of an RBAC is a moving piece that can break. Isn’t it better,
the argument goes, to have a system that works even in the face of
anonymous contribution, so that you don’t need an RBAC? Cryptocurrencies
do this, for example: anyone can “mine” a new coin and push it into the
blockchain, and there is no central authority restricting the transfer
of cryptocurrency from one user to another.

We can draw an analogy to encryption, where an algorithm is
considered inherently insecure if it depends on keeping any information
from an attacker other than the key. Encryption schemes that do
otherwise are derided as “security through obscurity.”

You may be wondering what any of this has to do with whether Fossil is a
blockchain, but that is exactly the point: all of this is outside
Fossil’s core hash-chained repository data structure. If you take the
position that you don’t have a “blockchain” unless it allows anonymous
contribution, with any needed restrictions provided only by the very
structure of the managed data, then Fossil does not qualify.

Why do some people care about this distinction? Consider Bitcoin,
wherein an anonymous user cannot spam the blockchain with bogus coins
because its [proof-of-work][pow] protocol allows such coins to be
rejected immediately. There is no equivalent in Fossil: it has no
technology that allows the receiving server to look at the content of a
commit and automatically judge it to be “good.” Fossil relies on its
RBAC system to provide such distinctions: if you have a commit bit, your
commits are *ipso facto* judged “good,” insofar as any human work
product can be so judged by a blob of compiled C code. This takes us
back to the [distributed ledger question](#dlt), where we can talk about
what it means to later correct a bad commit that got through the RBAC
check.

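For readers unfamiliar with why proof-of-work makes bogus submissions cheap to reject, here is a simplified hashcash-style sketch. It is not Bitcoin’s actual scheme (which differs in its hashing and target encoding); the `pow_ok` and `mine` names, the 8-byte nonce, and the 12-bit difficulty are assumptions chosen to keep the example fast.

```python
import hashlib


def pow_ok(block: bytes, nonce: int, zero_bits: int) -> bool:
    """Cheap check: does SHA-256(block + nonce) start with `zero_bits` zero bits?"""
    digest = hashlib.sha256(block + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - zero_bits) == 0


def mine(block: bytes, zero_bits: int) -> int:
    """Expensive search: try nonces in order until one passes the check."""
    nonce = 0
    while not pow_ok(block, nonce, zero_bits):
        nonce += 1
    return nonce


# Finding a valid nonce takes about 2**12 hash attempts on average here,
# while verifying a claimed nonce takes a single hash.
block = b"toy block contents"
nonce = mine(block, 12)
```

The receiver needs only one call to `pow_ok` to accept or reject a submission, based purely on the data itself. Nothing comparable can be computed from the content of a Fossil commit, which is why Fossil leans on its RBAC system instead.
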
We may be willing to accept pseudonymity, rather than full anonymity.
If we configure Fossil as above, either bypassing the RBAC or abandoning
human control over it, scrubbing IP addresses, etc., is it then a public
permissionless blockchain in that sense?

We think not, because there is no [longest chain rule](#lcr) or anything
like it in Fossil.

For a fair model of how a Fossil repository might behave under such
conditions, consider GitHub: here one user can fork another’s repository
and make an arbitrary number of commits to their public fork. Imagine
this happens 10 times. How does someone come along later and
*automatically* evaluate which of the 11 forks of the code (counting the
original repository among their number) is the “best” one? For a
computer software project, the best we could do to approximate this
devolves to a [software project cost estimation problem][scost]. These
methods are rather questionable in their own right, being mathematical
value judgements on human work products, but even if we accept their
usefulness, we still cannot say which fork is better based solely
on their scores under these metrics. We may well prefer to use the fork
of a software program that took *less* effort, being smaller, more
self-contained, and with a smaller attack surface.

[alert]: ./alerts.md
[capi]: ./caps/ref.html#i
[mrep]: /help/remote
[scost]: https://en.wikipedia.org/wiki/Software_development_effort_estimation
[scrub]: /help/scrub
[sreg]: /help/self-register

## Conclusion

This author believes it is technologically indefensible to call Fossil a
“blockchain” in any sense likely to be understood by a majority of those
you’re communicating with. Using a term in a nonstandard way just because you can
defend it means you’ve failed any goal that requires clear communication.
The people you’re communicating your ideas to must have the
same concept of the terms you use.

What term should you use instead? Fossil stores a DAG of hash-chained
commits, so an indisputably correct term is a [Merkle tree][mt], named
after [its inventor][drrm]. You could also use the more generic term
“hash tree.”

Fossil is a technological peer to many common sorts of blockchain
technology. There is a lot of overlap in concepts and implementation
details, but when speaking of what most people understand as
“blockchain,” Fossil is not that.

[drrm]: https://en.wikipedia.org/wiki/Ralph_Merkle
[mt]: https://en.wikipedia.org/wiki/Merkle_tree