Fossil SCM
Updates to the technical overview document.
Commit
e255caa2c71a3b35db7163ea42bfe4ba9e34e43c
Parent
034e887c356c38e…
1 file changed
+16
-18
| --- www/tech_overview.wiki | ||
| +++ www/tech_overview.wiki | ||
| @@ -5,41 +5,38 @@ | ||
| 5 | 5 | |
| 6 | 6 | <h2>1.0 Introduction</h2> |
| 7 | 7 | |
| 8 | 8 | At its lowest level, a Fossil repository consists of an unordered set |
| 9 | 9 | of immutable "artifacts". You might think of these artifacts as "files", |
| 10 | -since in many cases the artifacts exactly correspond to source code files | |
| 11 | -that are stored in the Fossil repository. But other "control artifacts" | |
| 10 | +since in many cases the artifacts exactly that. But other "control artifacts" | |
| 12 | 11 | are also included in the mix. These control artifacts define the relationships |
| 13 | 12 | between artifacts - which files go together to form a particular |
| 14 | 13 | version of the project, who checked in that version and when, what was |
| 15 | 14 | the check-in comment, what wiki pages are included with the project, what |
| 16 | 15 | are the edit histories of each wiki page, what bug reports or tickets are |
| 17 | -included, who contributed to the evolution of each ticket, and so forth, | |
| 18 | -and so on. This low-level file format is called the "global state" of | |
| 16 | +included, who contributed to the evolution of each ticket, and so forth. | |
| 17 | +This low-level file format is called the "global state" of | |
| 19 | 18 | the repository, since this is the information that is synced to peer |
| 20 | 19 | repositories using push and pull operations. The low-level file format |
| 21 | 20 | is also called "enduring" since it is intended to last for many years. |
| 22 | 21 | The details of the low-level, enduring, global file format |
| 23 | 22 | are [./fileformat.wiki | described separately]. |
| 24 | 23 | |
| 25 | 24 | This article is about how Fossil is currently implemented. Instead of |
| 26 | 25 | dealing with vague abstractions of "enduring file formats" as the |
| 27 | -[./fileformat.wiki | that other document] does, this article provides | |
| 26 | +[./fileformat.wiki | other document] does, this article provides | |
| 28 | 27 | some detail on how Fossil actually stores information on disk. |
| 29 | 28 | |
| 30 | 29 | <h2>2.0 Three Databases</h2> |
| 31 | 30 | |
| 32 | 31 | Fossil stores state information in |
| 33 | 32 | [http://www.sqlite.org/ | SQLite] database files. |
| 34 | 33 | SQLite keeps an entire relational database, including multiple tables and |
| 35 | 34 | indices, in a single disk file. The SQLite library allows the database |
| 36 | 35 | files to be efficiently queried and updated using the industry-standard |
| 37 | -SQL language. And SQLite makes updates to these database files atomic, | |
| 38 | -even if a system crashes or power failure occurs in the middle of the | |
| 39 | -update, meaning that repository content is protected even during severe | |
| 40 | -malfunctions. | |
| 36 | +SQL language. SQLite updates are atomic, so even in the event of | |
| 37 | +a system crashes or power failure the repository content is protected. | |
| 41 | 38 | |
| 42 | 39 | Fossil uses three separate classes of SQLite databases: |
| 43 | 40 | |
| 44 | 41 | <ol> |
| 45 | 42 | <li>The configuration database |
| @@ -152,14 +149,15 @@ | ||
| 152 | 149 | The artifacts are stored as BLOBs, compressed using |
| 153 | 150 | [http://www.zlib.net/ | zlib compression] and, where applicable, |
| 154 | 151 | using [./delta_encoder_algorithm.wiki | delta compression]. |
| 155 | 152 | The combination of zlib and delta compression results in a considerable |
| 156 | 153 | space savings. For the SQLite project, at the time of this writing, |
| 157 | -the total size of all artifacts is over 1.7 GB but thanks to the | |
| 154 | +the total size of all artifacts is over 2.0 GB but thanks to the | |
| 158 | 155 | combined zlib and delta compression, that content only takes up |
| 159 | -51.4 MB of space in the repository database, for a compression ratio | |
| 160 | -of about 33:1. | |
| 156 | +32 MB of space in the repository database, for a compression ratio | |
| 157 | +of about 64:1. The average size of a content BLOB in the database | |
| 158 | +is around 500 bytes. | |
| 161 | 159 | |
| 162 | 160 | Note that the zlib and delta compression is not an inherent part of the |
| 163 | 161 | Fossil file format; it is just an optimization. |
| 164 | 162 | The enduring file format for Fossil is the unordered |
| 165 | 163 | set of artifacts. The compression techniques are just a detail of |
| @@ -185,11 +183,11 @@ | ||
| 185 | 183 | |
| 186 | 184 | <h4>2.2.2 Project Metadata</h4> |
| 187 | 185 | |
| 188 | 186 | The global project state information in the repository database is |
| 189 | 187 | supplemented by computed metadata that makes querying the project state |
| 190 | -more efficient. Metadata includes but information such as the following: | |
| 188 | +more efficient. Metadata includes information such as the following: | |
| 191 | 189 | |
| 192 | 190 | * The names for all files found in any checkin. |
| 193 | 191 | * All check-ins that modify a given file |
| 194 | 192 | * Parents and children of each checkin. |
| 195 | 193 | * Potential timeline rows. |
| @@ -200,13 +198,13 @@ | ||
| 200 | 198 | * Current content of each ticket. |
| 201 | 199 | * Cross-references between tickets, checkins, and wiki pages. |
| 202 | 200 | |
| 203 | 201 | The metadata is held in various SQL tables in the repository database. |
| 204 | 202 | The metadata is designed to facilitate queries for the various timelines and |
| 205 | -reports that Fossil generates. | |
| 203 | +reports that Fossil generates. | |
| 206 | 204 | As the functionality of Fossil evolves, |
| 207 | -the schema for the metadata can and does change from time to time. | |
| 205 | +the schema for the metadata can and does change. | |
| 208 | 206 | But schema changes do no invalidate the repository. Remember that the |
| 209 | 207 | metadata contains no new information - only information that has been |
| 210 | 208 | extracted from the canonical artifacts and saved in a more useful form. |
| 211 | 209 | Hence, when the metadata schema changes, the prior metadata can be discarded |
| 212 | 210 | and the entire metadata corpus can be recomputed from the canonical |
| @@ -273,13 +271,13 @@ | ||
| 273 | 271 | <h4>2.2.5 Shunned Artifact List</h4> |
| 274 | 272 | |
| 275 | 273 | The set of canonical artifacts for a project - the global state for the |
| 276 | 274 | project - is intended to be an append-only database. In other words, |
| 277 | 275 | new artifacts can be added but artifacts can never be removed. But |
| 278 | -it sometimes happens that inappropriate content can be mistakenly or | |
| 279 | -maliciously added to a repository. When that happens, the only way | |
| 280 | -to get rid of the content is to [./shunning.wiki | "shun"] it. | |
| 276 | +it sometimes happens that inappropriate content is mistakenly or | |
| 277 | +maliciously added to a repository. The only way to get rid of | |
| 278 | +the undesired content is to [./shunning.wiki | "shun"] it. | |
| 281 | 279 | The "shun" table in the repository database records the SHA1 hash of |
| 282 | 280 | all shunned artifacts. |
| 283 | 281 | |
| 284 | 282 | The shun table can be pushed or pulled using |
| 285 | 283 | the [/help/config | fossil config] command with the "shun" AREA argument. |
| 286 | 284 |