Fossil SCM
Continuing work on the tech_overview document. Still far from complete. This is merely an incremental check-in.
Commit
dabc1105ba85c5dd96fbaf3b2932a6d680e2fb50
Parent
af6810c589e3337…
1 file changed
+82
-9
| --- www/tech_overview.wiki | ||
| +++ www/tech_overview.wiki | ||
| @@ -4,23 +4,23 @@ | ||
| 4 | 4 | </h2> |
| 5 | 5 | |
| 6 | 6 | <h2>1.0 Introduction</h2> |
| 7 | 7 | |
| 8 | 8 | At its lowest level, a Fossil repository consists of an unordered set |
| 9 | -of immutable "artifacts". Think of these artifacts as "files", since in | |
| 10 | -many cases the artifacts do indeed exactly correspond to source code files | |
| 9 | +of immutable "artifacts". You might think of these artifacts as "files", | |
| 10 | +since in many cases the artifacts exactly correspond to source code files | |
| 11 | 11 | that are stored in the Fossil repository. But other "control artifacts" |
| 12 | 12 | are also included in the mix. These control artifacts define the relationships |
| 13 | 13 | between artifacts - which files go together to form a particular |
| 14 | 14 | version of the project, who checked in that version and when, what was |
| 15 | 15 | the check-in comment, what wiki pages are included with the project, what |
| 16 | 16 | are the edit histories of each wiki page, what bug reports or tickets are |
| 17 | 17 | included, who contributed to the evolution of each ticket, and so forth, |
| 18 | 18 | and so on. This low-level file format is called the "global state" of |
| 19 | 19 | the repository, since this is the information that is synced to peer |
| 20 | -repositories using push and pull operations. the low-level file format | |
| 21 | -is also called "enduring" since it is intended to last for generations. | |
| 20 | +repositories using push and pull operations. The low-level file format | |
| 21 | +is also called "enduring" since it is intended to last for many years. | |
| 22 | 22 | The details of the low-level, enduring, global file format |
| 23 | 23 | are [./fileformat.wiki | described separately]. |
| 24 | 24 | |
| 25 | 25 | This article is about how Fossil is currently implemented. Instead of |
| 26 | 26 | dealing with vague abstractions of "enduring file formats" as the |
| @@ -29,19 +29,19 @@ | ||
| 29 | 29 | |
| 30 | 30 | <h2>2.0 Three Databases</h2> |
| 31 | 31 | |
| 32 | 32 | Fossil stores state information in |
| 33 | 33 | [http://www.sqlite.org/ | SQLite] database files. |
| 34 | -SQLite stores an entire relational database, including multiple tables and | |
| 34 | +SQLite keeps an entire relational database, including multiple tables and | |
| 35 | 35 | indices, in a single disk file. The SQLite library allows the database |
| 36 | 36 | files to be efficiently queried and updated using the industry-standard |
| 37 | 37 | SQL language. And SQLite makes updates to these database files atomic, |
| 38 | -even in the face of system crashes and power failures, meaning that even | |
| 39 | -a power loss in the middle of a commit will not damage the Fossil repository | |
| 40 | -content. | |
| 38 | +even if a system crash or power failure occurs in the middle of the | |
| 39 | +update, meaning that repository content is protected even during severe | |
| 40 | +malfunctions. | |
| 41 | 41 | |
| 42 | -Fossil uses three separate SQLite databases: | |
| 42 | +Fossil uses three separate classes of SQLite databases: | |
| 43 | 43 | |
| 44 | 44 | <ol> |
| 45 | 45 | <li>The configuration database |
| 46 | 46 | <li>Repository databases |
| 47 | 47 | <li>Checkout databases |
| @@ -52,10 +52,21 @@ | ||
| 52 | 52 | repository database per project. The repository database is the |
| 53 | 53 | file that people are normally referring to when they say |
| 54 | 54 | "a Fossil repository". The checkout database is found in the working |
| 55 | 55 | checkout for a project and contains state information that is unique |
| 56 | 56 | to that working checkout. |
| 57 | + | |
| 58 | +Fossil does not always use all three database files. The web interface, | |
| 59 | +for example, typically only uses the repository database. And the | |
| 60 | +[/help/setting | fossil setting] command only opens the configuration database | |
| 61 | +when the --global option is used. But other commands use all three | |
| 62 | +databases at once. For example, the [/help/status | fossil status] | |
| 63 | +command will first locate the checkout database, then use the checkout | |
| 64 | +database to find the repository database, then open the configuration | |
| 65 | +database. Whenever multiple databases are used at the same time, | |
| 66 | +they are all opened on the same SQLite database connection using | |
| 67 | +SQLite's [http://www.sqlite.org/lang_attach.html | ATTACH] command. | |
| 57 | 68 | |
| 58 | 69 | The chart below provides a quick summary of how each of these |
| 59 | 70 | database files is used by Fossil, with detailed discussion following. |
| 60 | 71 | |
| 61 | 72 | <center><table border="1" width="80%" cellpadding="0"> |
| @@ -80,10 +91,11 @@ | ||
| 80 | 91 | </ul> |
| 81 | 92 | </td> |
| 82 | 93 | <td width="33%" valign="top"> |
| 83 | 94 | <h3 align="center">Checkout Database<br>"_FOSSIL_"</h3> |
| 84 | 95 | <ul> |
| 96 | +<li>The repository database used by this checkout | |
| 85 | 97 | <li>The version currently checked out |
| 86 | 98 | <li>Other versions [/help/merge | merged] in but not |
| 87 | 99 | yet [/help/commit | committed] |
| 88 | 100 | <li>Changes from the [/help/add | add], [/help/delete | delete], |
| 89 | 101 | and [/help/rename | rename] commands that have not yet been committed |
| @@ -97,10 +109,71 @@ | ||
| 97 | 109 | </table> |
| 98 | 110 | </center> |
| 99 | 111 | |
| 100 | 112 | <h3>2.1 The Configuration Database</h3> |
| 101 | 113 | |
| 114 | +The configuration database holds cross-repository preferences and a list of all | |
| 115 | +repositories for a single user. | |
| 116 | + | |
| 117 | +The [/help/setting | fossil setting] command can be used to specify various | |
| 118 | +operating parameters and preferences for Fossil repositories. Settings can | |
| 119 | +apply to a single repository, or they can apply globally to all repositories | |
| 120 | +for a user. If both a global and a repository value exist for a setting, | |
| 121 | +then the repository-specific value takes precedence. All of the settings | |
| 122 | +have reasonable defaults, and so many users will never need to change them. | |
| 123 | +But if changes to settings are desired, the configuration database provides | |
| 124 | +a way to change settings for all repositories with a single command, rather | |
| 125 | +than having to change the setting individually on each repository. | |
| 102 | 126 | |
| 127 | +The configuration database also maintains a list of repositories. This | |
| 128 | +list is used by the [/help/all | fossil all] command in order to run various | |
| 129 | +operations such as "sync" or "rebuild" on all repositories managed by a user. | |
| 103 | 130 | |
| 131 | +On Unix systems, the configuration database is named ".fossil" and is | |
| 132 | +located in the user's home directory. On Windows, the configuration | |
| 133 | +database is named "_fossil" (using an underscore as the first character | |
| 134 | +instead of a dot) and is located in the directory specified by the | |
| 135 | +LOCALAPPDATA, APPDATA, or HOMEPATH environment variables, in that order. | |
| 104 | 136 | |
| 105 | 137 | <h3>2.2 Repository Databases</h3> |
| 138 | + | |
| 139 | +The repository database is the file that is commonly referred to as | |
| 140 | +"the repository". This is because the repository database contains, | |
| 141 | +among other things, the complete revision, ticket, and wiki history for | |
| 142 | +a project. It is customary to name the repository database after the | |
| 143 | +name of the project, with a ".fossil" suffix. For example, the repository | |
| 144 | +database for the self-hosting Fossil repository is called "fossil.fossil" | |
| 145 | +and the repository database for SQLite is called "sqlite.fossil". | |
| 146 | + | |
| 147 | +<h4>2.2.1 Global Project State</h4> | |
| 148 | + | |
| 149 | +The bulk of the repository database (typically 75 to 85%) consists | |
| 150 | +of the artifacts that comprise the | |
| 151 | +[./fileformat.wiki | enduring, global, shared state] of the project. | |
| 152 | +The artifacts are stored as BLOBs, compressed using | |
| 153 | +[http://www.zlib.net/ | zlib compression] and, where applicable, | |
| 154 | +using [./delta_encoder_algorithm.wiki | delta compression]. | |
| 155 | +The combination of zlib and delta compression results in a considerable | |
| 156 | +space savings. For the SQLite project, at the time of this writing, | |
| 157 | +the total size of all artifacts is over 1.7 GB but thanks to the | |
| 158 | +combined zlib and delta compression, that content only takes up | |
| 159 | +51.4 MB of space in the repository database, for a compression ratio | |
| 160 | +of about 33 to 1. | |
| 161 | + | |
| 162 | +Note that the zlib and delta compression are not an inherent part of the | |
| 163 | +Fossil file format; they are just an optimization. | |
| 164 | +The enduring file format for Fossil is the unordered | |
| 165 | +set of artifacts and the compression techniques are just a detail of | |
| 166 | +how the current implementation of Fossil happens to store these artifacts | |
| 167 | +efficiently on disk. | |
| 168 | + | |
| 169 | +All of the original uncompressed and undeltaed artifacts can be extracted | |
| 170 | +from a Fossil repository database using | |
| 171 | +the [/help/deconstruct | fossil deconstruct] | |
| 172 | +command. Going the other way, the [/help/reconstruct | fossil reconstruct] | |
| 173 | +command will scan a directory hierarchy and add all files found to | |
| 174 | +a new repository database. The [/help/artifact | fossil artifact] command | |
| 175 | +can be used to extract individual artifacts from the repository database. | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 106 | 179 | <h3>2.3 Checkout Databases</h3> |
| 107 | 180 |
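
The database handling that this check-in describes in sections 2.0 and 2.1 (find the repository through the checkout database, ATTACH everything to one connection, and let a repository-specific setting override a global one) can be sketched in Python. This is only an illustration under stated assumptions, not Fossil's actual C implementation: the `settings(name, value)` table, the `repo` and `cfg` schema aliases, the file names, and the sample values are all hypothetical and were chosen for the example.

```python
import os
import sqlite3
import tempfile


def make_db(path, rows):
    """Create a database holding a hypothetical settings(name, value) table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE settings(name TEXT PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO settings VALUES (?, ?)", rows)
    conn.commit()
    conn.close()


def open_all(checkout_path, config_path):
    """Open the checkout database, use it to find the repository database,
    then ATTACH the repository and configuration databases to the same
    SQLite connection."""
    conn = sqlite3.connect(checkout_path)
    (repo_path,) = conn.execute(
        "SELECT value FROM settings WHERE name = 'repository'"
    ).fetchone()
    conn.execute("ATTACH DATABASE ? AS repo", (repo_path,))
    conn.execute("ATTACH DATABASE ? AS cfg", (config_path,))
    return conn


def lookup_setting(conn, name):
    """Return the repository-specific value if one exists, otherwise the
    global value from the configuration database, otherwise None."""
    row = conn.execute(
        "SELECT COALESCE("
        "  (SELECT value FROM repo.settings WHERE name = :n),"
        "  (SELECT value FROM cfg.settings WHERE name = :n))",
        {"n": name},
    ).fetchone()
    return row[0]


def demo():
    """Build three throwaway databases and resolve two sample settings."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = os.path.join(tmp, "project.fossil")
        cfg = os.path.join(tmp, "config.db")
        checkout = os.path.join(tmp, "checkout.db")
        make_db(repo, [("autosync", "off")])                  # repository value
        make_db(cfg, [("autosync", "on"), ("editor", "vi")])  # global values
        make_db(checkout, [("repository", repo)])             # points at the repo
        conn = open_all(checkout, cfg)
        try:
            return lookup_setting(conn, "autosync"), lookup_setting(conn, "editor")
        finally:
            conn.close()


if __name__ == "__main__":
    print(demo())
```

Because all three files share one connection, a single SQL statement can consult both the repository and the configuration database, which is what makes the precedence rule a one-line `COALESCE` in this sketch.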
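
The compression figures quoted in the new section 2.2.1 can be checked with a few lines of Python. The sketch assumes 1 GB = 1000 MB, and it only demonstrates a lossless zlib round trip on made-up sample data; it does not reproduce Fossil's delta encoder or the way Fossil lays out BLOBs on disk.

```python
import zlib


def compression_ratio(original_size, stored_size):
    """Ratio of uncompressed size to stored size (same units for both)."""
    return original_size / stored_size


def zlib_round_trip(data):
    """Compress with zlib and verify that decompression restores the input."""
    packed = zlib.compress(data)
    if zlib.decompress(packed) != data:
        raise ValueError("zlib round trip did not restore the original bytes")
    return packed


# Figures from the text: over 1.7 GB of artifacts stored in 51.4 MB.
# Assumption: 1 GB is taken as 1000 MB.
ratio = compression_ratio(1.7 * 1000, 51.4)

# Hypothetical, highly repetitive sample standing in for source code artifacts.
sample = b"int main(void){ return 0; }\n" * 200
packed = zlib_round_trip(sample)

if __name__ == "__main__":
    print(f"quoted ratio is about {ratio:.1f} to 1")
    print(f"sample: {len(sample)} bytes -> {len(packed)} bytes with zlib")
```

The round trip also illustrates the point the text makes about compression being an optimization only: the artifact that comes back out is byte-for-byte the artifact that went in.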