# Fossil is not Relational

***An Introduction to the Fossil Data Model***

Upon hearing that Fossil is based on SQLite, it's natural for people
unfamiliar with its internals to assume that Fossil stores its
SCM-relevant data in a database-friendly way and that the SCM history
can be modified via SQL. The truth, however, is *far stranger than
that.*

This document introduces, at a relatively high level:

1) The underlying enduring and immutable data format, which is
   independent of any specific storage engine.

2) The `blob` table: Fossil's single point of SCM-relevant data
   storage.

3) The transformation of (1) from its immutable raw form to a
   *transient* database-friendly form.

4) Some of the consequences of this model.


# Part 1: Artifacts

```pikchr center
AllObjects: [
  A: file "Artifacts" fill lightskyblue;
  down; move to A.s; move 50%;
  F: file "Client" "files";
  right; move 1; up; move 50%;
  B: cylinder "blob table"
  right;
  arrow from A.e to B.w;
  arrow from F.e to B.w;
  arrow dashed from B.e;
  C: box rad 0.1 "Crosslink" "process";
  arrow
  AUX: cylinder "Auxiliary" "tables"
  arc -> cw dotted from AUX.s to B.s;
] # end of AllObjects
```


The centerpiece of Fossil's architecture is a data format which
describes what we call "artifacts." Each artifact represents the state
of one atomic unit of SCM-relevant data, such as a single checkin, a
single wiki page edit, a single modification to a ticket, creation or
cancellation of tags, and similar SCM constructs. In the cases of
checkins and ticket updates, an artifact may record changes to
multiple files or ticket fields, respectively, but the change as a
whole is atomic. Though we often refer to both Fossil-specific SCM data
and client-side content as artifacts, this document uses the term
artifact solely for the former purpose.

From [the data format's main documentation][dataformat]:

> The global state of a fossil repository is kept simple so that it
> can endure in useful form for decades or centuries. A fossil
> repository is intended to be readable, searchable, and extensible by
> people not yet born.

[dataformat]: ./fileformat.wiki

This format has the following major properties:

- It is <u>**syntactically simple**</u>, easily and efficiently
  parsable in any programming language. It is also entirely
  human-readable.

- It is <u>**immutable**</u>. An artifact is identified by its unique
  hash value. Any modification to an artifact changes that hash,
  thereby changing its identity.

- It is <u>**not generic**</u>. It is custom-made for its purpose and
  makes no attempt at providing a generic format. It contains *only*
  what it *needs* to function, with zero bloat.

- It <u>**holds all SCM-relevant data except for client-level file
  content**</u>, the latter instead being referenced by their unique
  hash values. Storage of the client-side content is an implementation
  detail delegated to higher-level applications.

- It is <u>**auditable**</u>. By following the hash references in
  artifacts it is possible to unambiguously trace the origin of any
  modification to the SCM state. Combined with higher-level tools
  (specifically, Fossil's database), this audit trail can easily be
  traced both backwards and forwards in time, using any given version
  in the SCM history as a starting point.

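
The immutability property can be illustrated with a few lines of Python. This is only a sketch of content addressing in general, not Fossil's own code: `artifact_id` is a hypothetical helper name, and SHA3-256 is chosen here because it is the hash family Fossil 2.0 moved to (older repositories use SHA1). The artifact text is a made-up, incomplete fragment.

```python
import hashlib


def artifact_id(content: bytes) -> str:
    """Return a content-derived identifier (hypothetical helper, SHA3-256)."""
    return hashlib.sha3_256(content).hexdigest()


original = b"D 2007-07-30T13:01:08\nU drh\n"
tampered = original.replace(b"drh", b"eve")

# The same bytes always produce the same identity...
assert artifact_id(original) == artifact_id(bytes(original))
# ...and any modification produces a different one, i.e. a different artifact.
assert artifact_id(original) != artifact_id(tampered)
```
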
Notably, the artifact file format <u>does not</u>...

- Specify any particular storage mechanism for the SCM's raw bytes,
  which includes both artifacts themselves and client-side file
  content. The file format refers to all such content solely by its
  unique hash value.

- Specify any optimizations such as storing file-level changes as
  deltas between two versions of that content.

Such aspects are all considered to be implementation details of
higher-level applications (be they in the main fossil binary or a
hypothetical third-party application), and have no effect on the
underlying artifact data model. That said, in Fossil:

- All raw byte content (artifacts and client files) is stored in
  the `blob` database table.

- Fossil uses delta and zlib compression to keep the storage size of
  changes from one version of a piece of content to the next to a
  minimum.


## Sidebar: SCM-relevant vs Non-SCM-relevant State

Certain data in Fossil are "SCM-relevant" and certain data are not. In
short, SCM-relevant data are managed in a way consistent with
controlled versioning of that data. Conversely, non-SCM-relevant data
are essentially any state neither specified by nor unambiguously
referenced by the artifact file format and are therefore not
versioned.

SCM-relevant state includes:

- Any and all data stored in the bodies of artifacts. This includes,
  but is not limited to: wiki/ticket/forum content, tags, file names
  and Fossil-side permissions, the name of each user who introduces
  any given artifact into the data store, the timestamp of each such
  change, the inheritance tree of checkins, and many other pieces of
  metadata.

- Raw file content of versioned files. These data are external to
  artifacts, which refer to them by their hashes. How they are stored
  is not the concern of the data model, but (spoiler alert!) Fossil
  stores them in an SQLite database, one record per distinct hash, in
  its `blob` table (which we will cover more very soon).

Non-SCM-relevant state includes:

- Fossil's list of users and their metadata (permissions, email
  address, etc.). Artifacts themselves reference users only by their
  user names. Artifacts neither care whether, nor guarantee that, user
  "drh" in one artifact is in fact the same "drh" referenced in
  another artifact.

- All Fossil UI configuration, e.g. the site's skin, config settings,
  and project name.

- In short, any tables in a Fossil repository file except for the
  `blob` table. Most, but not all, of these tables are transient
  caches for the data specified by the artifact files (which are
  stored in the `blob` table), and can safely be destroyed and rebuilt
  from the collection of artifacts with no loss of state to the
  repository. *All* of them, except for `blob` and `delta`, can be
  destroyed with no loss of *SCM-relevant* data.

## Terminology Hair-splitting: Manifest vs. Artifact

We sometimes refer to artifacts as "manifests," which is technically a
term for artifacts which record checkins. The various other artifact
types are arguably not "manifests," but are sometimes referred to as
such because the internal APIs use that term.


## A Very Basic Example

The following artifact, truncated for brevity, represents a typical
checkin artifact (a.k.a. a manifest):

```
C Bug\sfix\sin\sthe\slocal\sdatabase\sfinder.
D 2007-07-30T13:01:08
F src/VERSION 24bbb3aad63325ff33c56d777007d7cd63dc19ea
F src/add.c 1a5dfcdbfd24c65fa04da865b2e21486d075e154
F src/blob.c 8ec1e279a6cd0cfd5f1e3f8a39f2e9a1682e0113
<SNIP>
F www/selfcheck.html 849df9860df602dc2c55163d658c6b138213122f
P 01e7596a984e2cd2bc12abc0a741415b902cbeea
R 74a0432d81b956bfc3ff5a1a2bb46eb5
U drh
Z c9dcc06ecead312b1c310711cb360bc3
```

Each line is a single data record called a "card." The first letter of
each line tells us the type of data stored on that line and the
following space-separated tokens contain the data for that
line. Tokens which themselves contain spaces (notably the checkin
comment) have those escaped as `\s`. The raw text of wiki
pages/comments, forum posts, and ticket bodies/comments is stored
directly in the corresponding artifact, but is stored in a way which
makes such escaping unnecessary.

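
To make the card layout concrete, here is a minimal Python sketch that splits artifact text into cards and decodes escaped tokens. It is not Fossil's parser: `defossilize` and `parse_cards` are hypothetical names, only the `\s` escape is taken from the text above (the handling of `\n` and `\\` is an assumption), and artifacts that embed raw wiki, forum, or ticket text are deliberately out of scope.

```python
def defossilize(token: str) -> str:
    """Decode backslash escapes in one token (escape set partly assumed)."""
    escapes = {"s": " ", "n": "\n", "\\": "\\"}
    out = []
    i = 0
    while i < len(token):
        ch = token[i]
        if ch == "\\" and i + 1 < len(token) and token[i + 1] in escapes:
            out.append(escapes[token[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_cards(text: str) -> list[tuple[str, list[str]]]:
    """Split artifact text into (card type, decoded tokens) pairs."""
    cards = []
    for line in text.splitlines():
        card_type, _, rest = line.partition(" ")
        cards.append((card_type, [defossilize(t) for t in rest.split(" ")]))
    return cards


sample = (
    "C Bug\\sfix\\sin\\sthe\\slocal\\sdatabase\\sfinder.\n"
    "D 2007-07-30T13:01:08\n"
    "F src/add.c 1a5dfcdbfd24c65fa04da865b2e21486d075e154\n"
    "U drh\n"
)
for card_type, tokens in parse_cards(sample):
    print(card_type, tokens)
```
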
The hashes seen above are a critical component of the architecture:

- The `F` (file) records refer to the content of those files by the
  hash of that content. Where that content is stored is *not* specified
  by the data model.

- The `P` (parent) line is the hash code of the parent version (itself
  an artifact).

- The `Z` line is a hash of all of the content of *this artifact*
  which precedes the `Z` line. Thus any change to the content of an
  artifact changes both the artifact's identity (its hash) and its `Z`
  value, making it impossible to inject modified artifacts into an
  existing artifact tree.

- The `R` line is yet another consistency-checking hash which we won't
  go into here except to say that it's an internal consistency
  check/line of defense against modification of file content
  referenced by the artifact.

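
The self-check provided by the `Z` line can be sketched as follows. The example assumes that the `Z` value is an MD5 digest of everything preceding the `Z` line, including the newline just before it; the 32-hex-digit value in the example above is consistent with MD5, but the file format documentation is the authority here. Both function names are hypothetical.

```python
import hashlib


def add_z_card(body: str) -> str:
    """Append a Z card computed over all preceding text (assumed to be MD5)."""
    checksum = hashlib.md5(body.encode("utf-8")).hexdigest()
    return f"{body}Z {checksum}\n"


def z_card_is_valid(artifact: str) -> bool:
    """Check that the trailing Z card matches the text that precedes it."""
    body, sep, tail = artifact.rpartition("\nZ ")
    if not sep or not tail.endswith("\n"):
        return False
    expected = hashlib.md5((body + "\n").encode("utf-8")).hexdigest()
    return tail[:-1] == expected


artifact = add_z_card("D 2007-07-30T13:01:08\nU drh\n")
assert z_card_is_valid(artifact)
# Editing any earlier card without recomputing Z is detected.
assert not z_card_is_valid(artifact.replace("U drh", "U eve"))
```
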
# Part 2: The `blob` Table

```pikchr center
AllObjects: [
  A: file "Artifacts";
  down; move to A.s; move 50%;
  F: file "Client" "files" fill lightskyblue;
  right; move 1; up; move 50%;
  B: cylinder "blob table" fill lightskyblue;
  right;
  arrow from A.e to B.w;
  arrow from F.e to B.w;
  arrow dashed from B.e;
  C: box rad 0.1 "Crosslink" "process";
  arrow
  AUX: cylinder "Auxiliary" "tables"
  arc -> cw dotted from AUX.s to B.s;
] # end of AllObjects
```


The `blob` table is the core-most storage of a Fossil repository
database, storing all SCM-relevant data (and *only* SCM-relevant
data). Each row of this table holds a single artifact or the content
for a single version of a single client-side file. Slightly truncated
for clarity, its schema contains the following fields:

- **`uuid`**: the hash code of the blob's contents.
- **`rid`**: a unique integer key for this record. This is how the
  blob table is mapped to other (transient) tables, but the RIDs are
  specific to one given copy of a repository and must not be used for
  cross-repository referencing. The RID is a private/internal value of
  no use to a user unless they're building SQL queries for use with
  the Fossil db schema.
- **`size`**: the size, in bytes, of the blob's contents, or -1 for
  "phantom" blobs (those which Fossil knows should exist because it's
  seen them referenced somewhere, but for which it has not been given
  any content).
- **`content`**: the blob's raw content bytes, with the caveat that
  Fossil is free to store it in an "alternate representation."
  Specifically, the `content` field often holds a zlib-compressed
  delta from a previous version of the blob's content (a separate
  entry in the `blob` table), and an auxiliary table named `delta`
  maps such blobs to their previous versions, such that Fossil can
  reconstruct the real content from them by applying the delta to its
  previous version (and such deltas may be chained). Thus extraction
  of the content from this field cannot be performed via vanilla SQL,
  and requires a Fossil-specific function which knows how to convert
  any internal representations of the content to its original form.


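
The fields above can be explored with a toy model. The following Python sketch builds an in-memory stand-in for `blob` and `delta` and shows two queries: finding phantoms and walking a delta chain back to a blob stored in full. The schema is heavily simplified, the hashes and content are placeholders, and the `srcid` column name is an assumption of this sketch. It does not decode deltas, which, as noted, needs Fossil-specific code.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Toy stand-ins for the real tables; column names partly assumed.
    CREATE TABLE blob(rid INTEGER PRIMARY KEY, uuid TEXT UNIQUE,
                      size INTEGER, content BLOB);
    CREATE TABLE delta(rid INTEGER PRIMARY KEY, srcid INTEGER);
""")
db.executemany(
    "INSERT INTO blob(rid, uuid, size, content) VALUES (?, ?, ?, ?)",
    [
        (1, "aaa1", 120, b"<full content>"),
        (2, "bbb2", 125, b"<delta against rid 1>"),
        (3, "ccc3", 130, b"<delta against rid 2>"),
        (4, "ddd4", -1, None),  # phantom: referenced, content not yet received
    ],
)
db.executemany("INSERT INTO delta(rid, srcid) VALUES (?, ?)", [(2, 1), (3, 2)])


def phantoms(conn: sqlite3.Connection) -> list[str]:
    """Hashes of blobs that are known about but have no content yet."""
    rows = conn.execute("SELECT uuid FROM blob WHERE size < 0 ORDER BY rid")
    return [uuid for (uuid,) in rows]


def delta_chain(conn: sqlite3.Connection, rid: int) -> list[int]:
    """RIDs to visit, from `rid` back to the blob stored in full."""
    chain = [rid]
    while True:
        row = conn.execute(
            "SELECT srcid FROM delta WHERE rid = ?", (chain[-1],)
        ).fetchone()
        if row is None:
            return chain
        chain.append(row[0])


print(phantoms(db))        # ['ddd4']
print(delta_chain(db, 3))  # [3, 2, 1]
```
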
## Sidebar: How does `blob` Distinguish Between Artifacts and Client Content?

Notice that the `blob` table has no flag saying "this record is an
artifact" or "this record is client data." Similarly, there is no
place in the database dedicated to keeping track of which `blob`
records are artifacts and which are file content.

That said, (A) the type of a blob can be implied via certain table
relationships and (B) the `event` table (the `/timeline`'s main data
source) incidentally has a list of artifacts and their sub-types
(checkin, wiki, tag, etc.). However, given that all of those
relationships, including the timeline, are *transient*, how can Fossil
distinguish between the two types of data?

Fossil's artifact format is extremely rigid and is *strictly* enforced
internally, with zero room provided for leniency. Every artifact which
is internally created is re-parsed for validity before it is committed
to the database, making it impossible that Fossil can inject an
invalid artifact into the repository. Because of the strictness of the
artifact parser, the chances that any given piece of arbitrary client
data could be successfully parsed as an artifact, even if it is
syntactically 99% similar to an artifact, are *effectively zero*.

Thus Fossil's rule of interpreting the contents of the blob table is:
if it can be parsed as an artifact, it *is* an artifact, else it is
opaque client-side data.

That rule is most often relevant in operations like `rebuild` and
`reconstruct`, both of which necessarily have to sort out artifacts
and non-artifact blobs from arbitrary collections of blobs.

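
The "if it parses, it is an artifact" rule can be sketched in Python as below. The checks are a very rough stand-in for Fossil's parser, which enforces far more than this: the rules used here (card-shaped lines, cards ordered by type letter, a final `Z` card holding an MD5 digest of the preceding text) are assumptions of the sketch, all names are hypothetical, and artifacts that embed raw wiki or forum text are not handled.

```python
import hashlib
import re

CARD = re.compile(r"^[A-Z] \S")


def parses_as_artifact(blob: bytes) -> bool:
    """Rough stand-in for a strict artifact parser (see assumptions above)."""
    try:
        text = blob.decode("utf-8")
    except UnicodeDecodeError:
        return False
    if not text.endswith("\n"):
        return False
    lines = text[:-1].split("\n")
    # Every line must look like a card: one uppercase letter, a space, data.
    if not all(CARD.match(line) for line in lines):
        return False
    # Assumed rule: cards appear in ascending order by their type letter.
    letters = [line[0] for line in lines]
    if letters != sorted(letters):
        return False
    # Assumed rule: the final card is a Z card with the MD5 of all prior text.
    if letters[-1] != "Z":
        return False
    body = text[: len(text) - len(lines[-1]) - 1]
    return lines[-1] == "Z " + hashlib.md5(body.encode("utf-8")).hexdigest()


def partition(blobs: list[bytes]) -> tuple[list[bytes], list[bytes]]:
    """Split blobs into (artifacts, opaque client content)."""
    artifacts = [b for b in blobs if parses_as_artifact(b)]
    opaque = [b for b in blobs if not parses_as_artifact(b)]
    return artifacts, opaque


body = "D 2007-07-30T13:01:08\nU drh\n"
good = (body + "Z " + hashlib.md5(body.encode()).hexdigest() + "\n").encode()
almost = good.replace(b"U drh", b"U eve")  # still card-shaped, wrong Z value
artifacts, opaque = partition([good, almost, b"int main(void){return 0;}\n"])
print(len(artifacts), len(opaque))  # 1 2
```
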
It is, in fact, possible to store an artifact unrelated to the current
repository in that repository, and it *will be parsed and processed as
an artifact* (see below), but it likely refers to other artifacts or
blobs which are not part of the current repository, thereby possibly
introducing "strange" data into the UI. If this happens, it's
potentially slightly confusing but is functionally harmless.


# Part 3: Crosslinking

```pikchr center
AllObjects: [
  A: file "Artifacts";
  down; move to A.s; move 50%;
  F: file "Client" "files";
  right; move 1; up; move 50%;
  B: cylinder "blob table"
  right;
  arrow from A.e to B.w;
  arrow from F.e to B.w;
  arrow dashed from B.e;
  C: box rad 0.1 "Crosslink" "process" fill lightskyblue;
  arrow
  AUX: cylinder "Auxiliary" "tables" fill lightskyblue;
  arc -> cw dotted from AUX.s to B.s;
] # end of AllObjects
```

Once an artifact is stored in the `blob` table, how does one perform
SQL queries against its plain-text format? In short: *One Does Not
Simply Query the Artifacts*.

Crosslinking, as it's colloquially known, is a one-way processing step
which transforms an immutable artifact's state into something
database-friendly. Crosslinking happens automatically every time
Fossil generates, or is given, a new artifact. Crosslinking of any
given artifact may update many different auxiliary tables, *all* of
which are transient in the sense that they may be destroyed and then
recreated by crosslinking all artifacts from the `blob` table (which
is exactly what the `rebuild` command does). The overwhelming majority
of individual database records in any Fossil repository are found in
these transient auxiliary tables, though the `blob` table tends to
account for the overwhelming majority of a repository's disk space.

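
The relationship between crosslinking and `rebuild` can be modelled in miniature. In the Python sketch below, `crosslink` derives rows for two toy auxiliary tables from one checkin artifact, and `rebuild` discards those tables and regenerates them from `blob` alone. Everything here is a simplification for illustration: the column sets are assumed, the hashes are placeholders, content is stored as uncompressed text, and every blob is taken to be a checkin artifact.

```python
import sqlite3

SCHEMA_AUX = """
    CREATE TABLE plink(pid INTEGER, cid INTEGER);
    CREATE TABLE event(objid INTEGER, mtime TEXT, user TEXT);
"""


def crosslink(conn: sqlite3.Connection, rid: int, text: str) -> None:
    """Derive auxiliary rows from one checkin artifact (toy version)."""
    cards = [line.split(" ") for line in text.splitlines()]
    fields = {c[0]: c[1] for c in cards if c[0] in ("D", "U")}
    conn.execute("INSERT INTO event VALUES (?, ?, ?)",
                 (rid, fields.get("D"), fields.get("U")))
    for card in cards:
        if card[0] == "P":
            # Translate the parent's hash into this repository's local RID.
            parent = conn.execute("SELECT rid FROM blob WHERE uuid = ?",
                                  (card[1],)).fetchone()
            if parent:
                conn.execute("INSERT INTO plink VALUES (?, ?)", (parent[0], rid))


def rebuild(conn: sqlite3.Connection) -> None:
    """Throw away the auxiliary tables and recreate them from `blob`."""
    conn.executescript("DROP TABLE IF EXISTS plink; DROP TABLE IF EXISTS event;")
    conn.executescript(SCHEMA_AUX)
    for rid, content in conn.execute("SELECT rid, content FROM blob").fetchall():
        crosslink(conn, rid, content)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE blob(rid INTEGER PRIMARY KEY, uuid TEXT, content TEXT)")
db.executemany("INSERT INTO blob(uuid, content) VALUES (?, ?)", [
    ("aaa1", "D 2007-07-29T10:00:00\nU drh\n"),
    ("bbb2", "D 2007-07-30T13:01:08\nP aaa1\nU drh\n"),
])
rebuild(db)
before = db.execute("SELECT * FROM plink").fetchall()
db.execute("DELETE FROM plink")  # simulate losing transient data
rebuild(db)
assert db.execute("SELECT * FROM plink").fetchall() == before == [(1, 2)]
```
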
This approach to mapping data from artifacts to the db gives Fossil
the freedom to change its database model, effectively at will, with
minimal client-side disruption (at most, a call to `rebuild`). This
allows, for example, Fossil to take advantage of new improvements in
SQLite without affecting compatibility with older repositories.

Auxiliary tables hold data mappings such as:

- Child/parent relationships of checkins. (The `plink` table.)
- Records of file names and changes to files. (The `mlink` and `filename` tables.)
- Timeline entries. (The `event` table.)

And numerous other bits and pieces.

The many auxiliary tables maintained by the app-level code reference
the `blob` table via its RID field, as that's far more efficient than
using hashes (`blob.uuid`) as foreign keys. The contexts of those
auxiliary data unambiguously tell us whether the referenced blobs are
artifacts or file content, so there is no efficiency penalty there for
hosting both opaque blobs and artifacts in the `blob` table.

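
As an illustration of RIDs acting as foreign keys, the following Python sketch joins a toy `plink` table back to `blob` so that hashes only appear at the edges of the query. The tables are reduced to the bare minimum, the `pid`/`cid` column names and the hash values are assumptions made for the example, and, as with any query against these tables, the shape of the real schema may differ.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Toy tables: real Fossil tables carry more columns than shown here.
    CREATE TABLE blob(rid INTEGER PRIMARY KEY, uuid TEXT UNIQUE);
    CREATE TABLE plink(pid INTEGER, cid INTEGER);
    INSERT INTO blob(rid, uuid) VALUES (1, 'aaa1'), (2, 'bbb2'), (3, 'ccc3');
    INSERT INTO plink(pid, cid) VALUES (1, 2), (2, 3);
""")


def children_of(conn: sqlite3.Connection, parent_hash: str) -> list[str]:
    """Hashes of the direct children of the checkin with the given hash."""
    rows = conn.execute(
        """
        SELECT child.uuid
          FROM blob AS parent
          JOIN plink ON plink.pid = parent.rid
          JOIN blob AS child ON child.rid = plink.cid
         WHERE parent.uuid = ?
         ORDER BY child.rid
        """,
        (parent_hash,),
    )
    return [uuid for (uuid,) in rows]


print(children_of(db, "aaa1"))  # ['bbb2']
```
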
The complete SQL schemas for the core-most auxiliary tables can be found
at:

[src/schema.c](/finfo/src/schema.c?ci=trunk)

Note, however, that all database tables are effectively internal
APIs, with no API stability guarantees and subject to change at any
time. Thus their structures generally should not be relied upon in
client-side scripts.


# Part 4: Implications and Consequences of the Model

*Some* of the implications and consequences of Fossil's data model
combined with the higher-level access via SQL include:

- **Provable immutability of history.** Fossil offers only one option
  for modifying history: "shunning," the forceful removal of an
  artifact from the `blob` table and the creation of a db record
  stating that the shunned hash may no longer be synced into this
  repository. Shunning effectively leaves a hole in the SCM history,
  and is only intended to be used for removal of illegal, dangerous,
  or private information which should never have been added to the
  repository.

- **Complete separation of SCM-relevant data and app-level data
  structures**. This allows the application to update its structures
  at will without significant backwards-compatibility concerns. In
  Fossil's case, "data structures" primarily refers to the SQL
  schema. Bringing a given repository schema up to date vis-à-vis a
  given fossil binary version simply means rebuilding the repository
  with that fossil binary. There are exceptionally rare cases, namely
  the switch from SHA1 to SHA3-256 ushered in with Fossil 2.0, which
  can lead to true incompatibility: e.g., a Fossil 1.x client cannot
  use a repository database which contains SHA3 hashes, regardless of
  a rebuild.

- **Two-way compatibility with other hypothetical clients** which also
  implement the same underlying data model. So far there are none, but
  it's conceivably possible.

- **Provides a solid basis for reporting.** Fossil's real-time metrics
  and reporting options are arguably the most powerful and flexible
  yet seen in an SCM.

- Very probably several more things.
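
The shunning mechanism described in the first item above can be modelled with two small functions. This Python sketch is a toy: the `shun` table name and both column sets are assumptions, the hash is a placeholder, and the real operation involves more than is shown (for example, the transient tables that referenced the blob).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Toy tables: names follow the text, columns are simplified assumptions.
    CREATE TABLE blob(rid INTEGER PRIMARY KEY, uuid TEXT UNIQUE, content BLOB);
    CREATE TABLE shun(uuid TEXT PRIMARY KEY);
""")


def receive(conn: sqlite3.Connection, uuid: str, content: bytes) -> bool:
    """Store incoming content unless its hash has been shunned."""
    if conn.execute("SELECT 1 FROM shun WHERE uuid = ?", (uuid,)).fetchone():
        return False
    conn.execute("INSERT OR IGNORE INTO blob(uuid, content) VALUES (?, ?)",
                 (uuid, content))
    return True


def shun(conn: sqlite3.Connection, uuid: str) -> None:
    """Remove a blob and remember that its hash must not come back."""
    conn.execute("DELETE FROM blob WHERE uuid = ?", (uuid,))
    conn.execute("INSERT OR IGNORE INTO shun(uuid) VALUES (?)", (uuid,))


receive(db, "aaa1", b"leaked password")
shun(db, "aaa1")
assert db.execute("SELECT count(*) FROM blob").fetchone()[0] == 0
assert receive(db, "aaa1", b"leaked password") is False  # a later sync is refused
```
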
