|
1
|
# Fossil is not Relational |
|
2
|
|
|
3
|
***An Introduction to the Fossil Data Model*** |
|
4
|
|
|
5
|
Upon hearing that Fossil is based on sqlite, it's natural for people |
|
6
|
unfamiliar with its internals to assume that Fossil stores its |
|
7
|
SCM-relevant data in a database-friendly way and that the SCM history |
|
8
|
can be modified via SQL. The truth, however, is *far stranger than |
|
9
|
that.* |
|
10
|
|
|
11
|
This document introduces, at a relatively high level: |
|
12
|
|
|
13
|
1) The underlying enduring and immutable data format, which is |
|
14
|
independent of any specific storage engine. |
|
15
|
|
|
16
|
2) The `blob` table: Fossil's single point of SCM-relevant data |
|
17
|
storage. |
|
18
|
|
|
19
|
3) The transformation of (1) from its immutable raw form to a |
|
20
|
*transient* database-friendly form. |
|
21
|
|
|
22
|
4) Some of the consequences of this model. |
|
23
|
|
|
24
|
|
|
25
|
# Part 1: Artifacts |
|
26
|
|
|
27
|
```pikchr center |
|
28
|
AllObjects: [ |
|
29
|
A: file "Artifacts" fill lightskyblue; |
|
30
|
down; move to A.s; move 50%; |
|
31
|
F: file "Client" "files"; |
|
32
|
right; move 1; up; move 50%; |
|
33
|
B: cylinder "blob table" |
|
34
|
right; |
|
35
|
arrow from A.e to B.w; |
|
36
|
arrow from F.e to B.w; |
|
37
|
arrow dashed from B.e; |
|
38
|
C: box rad 0.1 "Crosslink" "process"; |
|
39
|
arrow |
|
40
|
AUX: cylinder "Auxiliary" "tables" |
|
41
|
arc -> cw dotted from AUX.s to B.s; |
|
42
|
] # end of AllObjects |
|
43
|
``` |
|
44
|
|
|
45
|
|
|
46
|
The centerpiece of Fossil's architecture is a data format which |
|
47
|
describes what we call "artifacts." Each artifact represents the state |
|
48
|
of one atomic unit of SCM-relevant data, such as a single checkin, a |
|
49
|
single wiki page edit, a single modification to a ticket, creation or |
|
50
|
cancellation of tags, and similar SCM constructs. In the cases of |
|
51
|
checkins and ticket updates, an artifact may record changes to |
|
52
|
multiple files resp. ticket fields, but the change as a whole |
|
53
|
is atomic. Though we often refer to both fossil-specific SCM data |
|
54
|
and client-side content as artifacts, this document uses the term |
|
55
|
artifact solely for the former purpose. |
|
56
|
|
|
57
|
From [the data format's main documentation][dataformat]: |
|
58
|
|
|
59
|
> The global state of a fossil repository is kept simple so that it |
|
60
|
> can endure in useful form for decades or centuries. A fossil |
|
61
|
> repository is intended to be readable, searchable, and extensible by |
|
62
|
> people not yet born. |
|
63
|
|
|
64
|
[dataformat]: ./fileformat.wiki |
|
65
|
|
|
66
|
This format has the following major properties: |
|
67
|
|
|
68
|
- It is <u>**syntactically simple**</u>, easily and efficiently |
|
69
|
parsable in any programming language. It is also entirely |
|
70
|
human-readable. |
|
71
|
|
|
72
|
- It is <u>**immutable**</u>. An artifact is identified by its unique |
|
73
|
hash value. Any modification to an artifact changes that hash, |
|
74
|
thereby changing its identity. |
|
75
|
|
|
76
|
- It is <u>**not generic**</u>. It is custom-made for its purpose and |
|
77
|
makes no attempt at providing a generic format. It contains *only* |
|
78
|
what it *needs* to function, with zero bloat. |
|
79
|
|
|
80
|
- It <u>**holds all SCM-relevant data except for client-level file |
|
81
|
content**</u>, the latter instead being referenced by their unique |
|
82
|
hash values. Storage of the client-side content is an implementation |
|
83
|
detail delegated to higher-level applications. |
|
84
|
|
|
85
|
- <u>**Auditability**</u>. By following the hash references in |
|
86
|
artifacts it is possible to unambiguously trace the origin of any |
|
87
|
modification to the SCM state. Combined with higher-level tools |
|
88
|
(specifically, Fossil's database), this audit trail can easily be |
|
89
|
traced both backwards and forwards in time, using any given version |
|
90
|
in the SCM history as a starting point. |
|
91
|
|
|
92
|
Notably, the artifact file format <u>does not</u>... |
|
93
|
|
|
94
|
- Specify any specific storage mechanism for the SCM's raw bytes, |
|
95
|
which includes both artifacts themselves and client-side file |
|
96
|
content. The file format refers to all such content solely by its |
|
97
|
unique hash value. |
|
98
|
|
|
99
|
- Specify any optimizations such as storing file-level changes as |
|
100
|
deltas between two versions of that content. |
|
101
|
|
|
102
|
Such aspects are all considered to be implementation details of |
|
103
|
higher-level applications (be they in the main fossil binary or a |
|
104
|
hypothetical 3rd-party application), and have no effect on the |
|
105
|
underlying artifact data model. That said, in Fossil: |
|
106
|
|
|
107
|
- All raw byte content (artifacts and client files) is stored in |
|
108
|
the `blob` database table. |
|
109
|
|
|
110
|
- Fossil uses delta and zlib compression to keep the storage size of |
|
111
|
changes from one version of a piece of content to the next to a |
|
112
|
minimum. |
|
113
|
|
|
114
|
|
|
115
|
## Sidebar: SCM-relevant vs Non-SCM-relevant State |
|
116
|
|
|
117
|
Certain data in Fossil are "SCM-relevant" and certain data are not. In |
|
118
|
short, SCM-relevant data are managed in a way consistent with |
|
119
|
controlled versioning of that data. Conversely, non-SCM-relevant data |
|
120
|
are essentially any state neither specified by nor unambiguously |
|
121
|
refererenced by the artifact file format and are therefore not |
|
122
|
versioned. |
|
123
|
|
|
124
|
SCM-relevant state includes: |
|
125
|
|
|
126
|
- Any and all data stored in the bodies of artifacts. This includes, |
|
127
|
but is not limited to: wiki/ticket/forum content, tags, file names |
|
128
|
and Fossil-side permissions, the name of each user who introduces |
|
129
|
any given artifact into the data store, the timestamp of each such |
|
130
|
change, the inheritance tree of checkins, and many other pieces of |
|
131
|
metadata. |
|
132
|
|
|
133
|
- Raw file content of versioned files. These data are external to |
|
134
|
artifacts, which refer to them by their hashes. How they are stored |
|
135
|
is not the concern of the data model, but (spoiler alert!) Fossil |
|
136
|
stores them in an SQLite database, one record per distinct hash, in |
|
137
|
its `blob` table (which we will cover more very soon). |
|
138
|
|
|
139
|
Non-SCM-relevant state includes: |
|
140
|
|
|
141
|
- Fossil's list of users and their metadata (permissions, email |
|
142
|
address, etc.). Artifacts themselves reference users only by their |
|
143
|
user names. Artifacts neither care whether, nor guarantee that, user |
|
144
|
"drh" in one artifact is in fact the same "drh" referenced in |
|
145
|
another artifact. |
|
146
|
|
|
147
|
- All Fossil UI configuration, e.g. the site's skin, config settings, |
|
148
|
and project name. |
|
149
|
|
|
150
|
- In short, any tables in a Fossil repository file except for the |
|
151
|
`blob` table. Most, but not all, of these tables are transient |
|
152
|
caches for the data specified by the artifact files (which are |
|
153
|
stored in the `blob` table), and can safely be destroyed and rebuilt |
|
154
|
from the collection of artifacts with no loss of state to the |
|
155
|
repository. *All* of them, except for `blob` and `delta`, can be |
|
156
|
destroyed with no loss of *SCM-relevant* data. |
|
157
|
|
|
158
|
## Terminology Hair-splitting: Manifest vs. Artifact |
|
159
|
|
|
160
|
We sometimes refer to artifacts as "manifests," which is technically a |
|
161
|
term for artifacts which record checkins. The various other artifact |
|
162
|
types are arguably not "manifests," but are sometimes referred to as |
|
163
|
such because the internal APIs use that term. |
|
164
|
|
|
165
|
|
|
166
|
## A Very Basic Example |
|
167
|
|
|
168
|
The following artifact, truncated for brevity, represents a typical |
|
169
|
checkin artifact (a.k.a. a manifest): |
|
170
|
|
|
171
|
``` |
|
172
|
C Bug\sfix\sin\sthe\slocal\sdatabase\sfinder. |
|
173
|
D 2007-07-30T13:01:08 |
|
174
|
F src/VERSION 24bbb3aad63325ff33c56d777007d7cd63dc19ea |
|
175
|
F src/add.c 1a5dfcdbfd24c65fa04da865b2e21486d075e154 |
|
176
|
F src/blob.c 8ec1e279a6cd0cfd5f1e3f8a39f2e9a1682e0113 |
|
177
|
<SNIP> |
|
178
|
F www/selfcheck.html 849df9860df602dc2c55163d658c6b138213122f |
|
179
|
P 01e7596a984e2cd2bc12abc0a741415b902cbeea |
|
180
|
R 74a0432d81b956bfc3ff5a1a2bb46eb5 |
|
181
|
U drh |
|
182
|
Z c9dcc06ecead312b1c310711cb360bc3 |
|
183
|
``` |
|
184
|
|
|
185
|
Each line is a single data record called a "card." The first letter of |
|
186
|
each line tells us the type of data stored on that line and the |
|
187
|
following space-separated tokens contain the data for that |
|
188
|
line. Tokens which themselves contain spaces (notably the checkin |
|
189
|
comment) have those escaped as `\s`. The raw text of wiki |
|
190
|
pages/comments, forum posts, and ticket bodies/comments is stored |
|
191
|
directly in the corresponding artifact, but is stored in a way which |
|
192
|
makes such escaping unnecessary. |
|
193
|
|
|
194
|
The hashes seen above are a critical component of the architecture: |
|
195
|
|
|
196
|
- The `F` (file) records refer to the content of those files by the |
|
197
|
hash of that content. Where that content is stored is *not* specified |
|
198
|
by the data model. |
|
199
|
|
|
200
|
- The `P` (parent) line is the hash code of the parent version (itself |
|
201
|
an artifact). |
|
202
|
|
|
203
|
- The `Z` line is a hash of all of the content of *this artifact* |
|
204
|
which precedes the `Z` line. Thus any change to the content of an |
|
205
|
artifact changes both the artifact's identity (its hash) and its `Z` |
|
206
|
value, making it impossible to inject modified artifacts into an |
|
207
|
existing artifact tree. |
|
208
|
|
|
209
|
- The `R` line is yet another consistency-checking hash which we won't |
|
210
|
go into here except to say that it's an internal consistency |
|
211
|
check/line of defense against modification of file content |
|
212
|
referenced by the artifact. |
|
213
|
|
|
214
|
# Part 2: The `blob` Table |
|
215
|
|
|
216
|
```pikchr center |
|
217
|
AllObjects: [ |
|
218
|
A: file "Artifacts"; |
|
219
|
down; move to A.s; move 50%; |
|
220
|
F: file "Client" "files" fill lightskyblue; |
|
221
|
right; move 1; up; move 50%; |
|
222
|
B: cylinder "blob table" fill lightskyblue; |
|
223
|
right; |
|
224
|
arrow from A.e to B.w; |
|
225
|
arrow from F.e to B.w; |
|
226
|
arrow dashed from B.e; |
|
227
|
C: box rad 0.1 "Crosslink" "process"; |
|
228
|
arrow |
|
229
|
AUX: cylinder "Auxiliary" "tables" |
|
230
|
arc -> cw dotted from AUX.s to B.s; |
|
231
|
] # end of AllObjects |
|
232
|
``` |
|
233
|
|
|
234
|
|
|
235
|
The `blob` table is the core-most storage of a Fossil repository |
|
236
|
database, storing all SCM-relevant data (and *only* SCM-relevant |
|
237
|
data). Each row of this table holds a single artifact or the content |
|
238
|
for a single version of a single client-side file. Slightly truncated |
|
239
|
for clarity, its schema contains the following fields: |
|
240
|
|
|
241
|
- **`uuid`**: the hash code of the blob's contents. |
|
242
|
- **`rid`**: a unique integer key for this record. This is how the |
|
243
|
blob table is mapped to other (transient) tables, but the RIDs are |
|
244
|
specific to one given copy of a repository and must not be used for |
|
245
|
cross-repository referencing. The RID is a private/internal value of |
|
246
|
no use to a user unless they're building SQL queries for use with |
|
247
|
the Fossil db schema. |
|
248
|
- **`size`**: the size, in bytes, of the blob's contents, or -1 for |
|
249
|
"phantom" blobs (those which Fossil knows should exist because it's |
|
250
|
seen them referenced somewhere, but for which it has not been given |
|
251
|
any content). |
|
252
|
- **`content`**: the blob's raw content bytes, with the caveat that |
|
253
|
Fossil is free to store it in an "alternate representation." |
|
254
|
Specifically, the `content` field often holds a zlib-compressed |
|
255
|
delta from a previous version of the blob's content (a separate |
|
256
|
entry in the `blob` table), and an auxiliary table named `delta` |
|
257
|
maps such blobs to their previous versions, such that Fossil can |
|
258
|
reconstruct the real content from them by applying the delta to its |
|
259
|
previous version (and such deltas may be chained). Thus extraction |
|
260
|
of the content from this field cannot be performed via vanilla SQL, |
|
261
|
and requires a Fossil-specific function which knows how to convert |
|
262
|
any internal representations of the content to its original form. |
|
263
|
|
|
264
|
|
|
265
|
## Sidebar: How does `blob` Distinguish Between Artifacts and Client Content? |
|
266
|
|
|
267
|
Notice that the `blob` table has no flag saying "this record is an |
|
268
|
artifact" or "this record is client data." Similarly, there is no |
|
269
|
place in the database dedicated to keeping track of which `blob` |
|
270
|
records are artifacts and which are file content. |
|
271
|
|
|
272
|
That said, (A) the type of a blob can be implied via certain table |
|
273
|
relationships and (B) the `event` table (the `/timeline`'s main data |
|
274
|
source) incidentally has a list of artifacts and their sub-types |
|
275
|
(checkin, wiki, tag, etc.). However, given that all of those |
|
276
|
relationships, including the timeline, are *transient*, how can Fossil |
|
277
|
distinguish between the two types of data? |
|
278
|
|
|
279
|
Fossil's artifact format is extremely rigid and is *strictly* enforced |
|
280
|
internally, with zero room provided for leniency. Every artifact which |
|
281
|
is internally created is re-parsed for validity before it is committed |
|
282
|
to the database, making it impossible that Fossil can inject an |
|
283
|
invalid artifact into the repository. Because of the strictness of the |
|
284
|
artifact parser, the chances that any given piece of arbitrary client |
|
285
|
data could be successfully parsed as an artifact, even if it is |
|
286
|
syntactically 99% similar to an artifact, are *effectively zero*. |
|
287
|
|
|
288
|
Thus Fossil's rule of interpreting the contents of the blob table is: |
|
289
|
if it can be parsed as an artifact, it *is* an artifact, else it is |
|
290
|
opaque client-side data. |
|
291
|
|
|
292
|
That rule is most often relevant in operations like `rebuild` and |
|
293
|
`reconstruct`, both of which necessarily have to sort out artifacts |
|
294
|
and non-artifact blobs from arbitrary collections of blobs. |
|
295
|
|
|
296
|
It is, in fact, possible to store an artifact unrelated to the current |
|
297
|
repository in that repository, and it *will be parsed and processed as |
|
298
|
an artifact* (see below), but it likely refers to other artifacts or |
|
299
|
blobs which are not part of the current repository, thereby possibly |
|
300
|
introducing "strange" data into the UI. If this happens, it's |
|
301
|
potentially slightly confusing but is functionally harmless. |
|
302
|
|
|
303
|
|
|
304
|
# Part 3: Crosslinking |
|
305
|
|
|
306
|
```pikchr center |
|
307
|
AllObjects: [ |
|
308
|
A: file "Artifacts"; |
|
309
|
down; move to A.s; move 50%; |
|
310
|
F: file "Client" "files"; |
|
311
|
right; move 1; up; move 50%; |
|
312
|
B: cylinder "blob table" |
|
313
|
right; |
|
314
|
arrow from A.e to B.w; |
|
315
|
arrow from F.e to B.w; |
|
316
|
arrow dashed from B.e; |
|
317
|
C: box rad 0.1 "Crosslink" "process" fill lightskyblue; |
|
318
|
arrow |
|
319
|
AUX: cylinder "Auxiliary" "tables" fill lightskyblue; |
|
320
|
arc -> cw dotted from AUX.s to B.s; |
|
321
|
] # end of AllObjects |
|
322
|
``` |
|
323
|
|
|
324
|
Once an artifact is stored in the `blob` table, how does one perform |
|
325
|
SQL queries against its plain-text format? In short: *One Does Not |
|
326
|
Simply Query the Artifacts*. |
|
327
|
|
|
328
|
Crosslinking, as its colloquially known, is a one-way processing step |
|
329
|
which transforms an immutable artifact's state into something |
|
330
|
database-friendly. Crosslinking happens automatically every time |
|
331
|
Fossil generates, or is given, a new artifact. Crosslinking of any |
|
332
|
given artifact may update many different auxiliary tables, *all* of |
|
333
|
which are transient in the sense that they may be destroyed and then |
|
334
|
recreated by crosslinking all artifacts from the `blob` table (which |
|
335
|
is exactly what the `rebuild` command does). The overwhelming majority |
|
336
|
of individual database records in any Fossil repository are found in |
|
337
|
these transient auxiliary tables, though the `blob` table tends to |
|
338
|
account for the overwhelming majority of a repository's disk space. |
|
339
|
|
|
340
|
This approach to mapping data from artifacts to the db gives Fossil |
|
341
|
the freedom to change its database model, effectively at will, with |
|
342
|
minimal client-side disruption (at most, a call to `rebuild`). This |
|
343
|
allows, for example, Fossil to take advantage of new improvements in |
|
344
|
sqlite without affecting compatibility with older repositories. |
|
345
|
|
|
346
|
Auxiliary tables hold data mappings such as: |
|
347
|
|
|
348
|
- Child/parent relationships of checkins. (The `plink` table.) |
|
349
|
- Records of file names and changes to files. (The `mlink` and `filename` tables.) |
|
350
|
- Timeline entries. (The `event` table.) |
|
351
|
|
|
352
|
And numerous other bits and pieces. |
|
353
|
|
|
354
|
The many auxiliary tables maintained by the app-level code reference |
|
355
|
the `blob` table via its RID field, as that's far more efficient than |
|
356
|
using hashes (`blob.uuid`) as foreign keys. The contexts of those |
|
357
|
auxiliary data unambiguously tell us whether the referenced blobs are |
|
358
|
artifacts or file content, so there is no efficiency penalty there for |
|
359
|
hosting both opaque blobs and artifacts in the `blob` table. |
|
360
|
|
|
361
|
The complete SQL schemas for the core-most auxiliary tables can be found |
|
362
|
at: |
|
363
|
|
|
364
|
[](/finfo/src/schema.c?ci=trunk) |
|
365
|
|
|
366
|
Noting, however, that all database tables are effectively internal |
|
367
|
APIs, with no API stability guarantees and subject to change at any |
|
368
|
time. Thus their structures generally should not be relied upon in |
|
369
|
client-side scripts. |
|
370
|
|
|
371
|
|
|
372
|
# Part 4: Implications and Consequences of the Model |
|
373
|
|
|
374
|
*Some* of the implications and consequences of Fossil's data model |
|
375
|
combined with the higher-level access via SQL include: |
|
376
|
|
|
377
|
- **Provable immutability of history.** Fossil offers only one option |
|
378
|
for modifying history: "shunning" is the forceful removal of an |
|
379
|
artifact from the `blob` table and the creation of a db record |
|
380
|
stating that the shunned hash may no longer be synced into this |
|
381
|
repository. Shunning effectively leaves a hole in the SCM history, |
|
382
|
and is only intended to be used for removal of illegal, dangerous, |
|
383
|
or private information which should never have been added to the |
|
384
|
repository. |
|
385
|
|
|
386
|
- **Complete separation of SCM-relevant data and app-level data |
|
387
|
structures**. This allows the application to update its structures |
|
388
|
at will without significant backwards-compatibility concerns. In |
|
389
|
Fossil's case, "data structures" primarily refers to the SQL |
|
390
|
schema. Bringing a given repository schema up to date vis a vis a |
|
391
|
given fossil binary version simply means rebuilding the repository |
|
392
|
with that fossil binary. There are exceptionally rare cases, namely |
|
393
|
the switch from SHA1 to SHA3-256 ushered in with Fossil 2.0, which |
|
394
|
can lead to true incompatibility. e.g. a Fossil 1.x client cannot |
|
395
|
use a repository database which contains SHA3 hashes, regardless of |
|
396
|
a rebuild. |
|
397
|
|
|
398
|
- **Two-way compatibility with other hypothetical clients** which also |
|
399
|
implement the same underlying data model. So far there are none, but |
|
400
|
it's conceivably possible. |
|
401
|
|
|
402
|
- **Provides a solid basis for reporting.** Fossil's real-time metrics |
|
403
|
and reporting options are arguably the most powerful and flexible |
|
404
|
yet seen in an SCM. |
|
405
|
|
|
406
|
- Very probably several more things. |
|
407
|
|