Fossil SCM

Continuing work on the tech_overview document. Still far from complete. This is merely an incremental check-in.

drh 2010-12-26 15:42 trunk
Commit dabc1105ba85c5dd96fbaf3b2932a6d680e2fb50
1 file changed +82 -9
--- www/tech_overview.wiki
+++ www/tech_overview.wiki
@@ -4,23 +4,23 @@
 </h2>
 
 <h2>1.0 Introduction</h2>
 
 At its lowest level, a Fossil repository consists of an unordered set
-of immutable "artifacts". Think of these artifacts as "files", since in
-many cases the artifacts do indeed exactly correspond to source code files
+of immutable "artifacts". You might think of these artifacts as "files",
+since in many cases the artifacts exactly correspond to source code files
 that are stored in the Fossil repository. But other "control artifacts"
 are also included in the mix. These control artifacts define the relationships
 between artifacts - which files go together to form a particular
 version of the project, who checked in that version and when, what was
 the check-in comment, what wiki pages are included with the project, what
 are the edit histories of each wiki page, what bug reports or tickets are
 included, who contributed to the evolution of each ticket, and so forth,
 and so on. This low-level file format is called the "global state" of
 the repository, since this is the information that is synced to peer
-repositories using push and pull operations. the low-level file format
-is also called "enduring" since it is intended to last for generations.
+repositories using push and pull operations. The low-level file format
+is also called "enduring" since it is intended to last for many years.
 The details of the low-level, enduring, global file format
 are [./fileformat.wiki | described separately].
 
 This article is about how Fossil is currently implemented. Instead of
 dealing with vague abstractions of "enduring file formats" as the
@@ -29,19 +29,19 @@
 
 <h2>2.0 Three Databases</h2>
 
 Fossil stores state information in
 [http://www.sqlite.org/ | SQLite] database files.
-SQLite stores an entire relational database, including multiple tables and
+SQLite keeps an entire relational database, including multiple tables and
 indices, in a single disk file. The SQLite library allows the database
 files to be efficiently queried and updated using the industry-standard
 SQL language. And SQLite makes updates to these database files atomic,
-even in the face of system crashes and power failures, meaning that even
-a power loss in the middle of a commit will not damage the Fossil repository
-content.
+even if a system crash or power failure occurs in the middle of the
+update, meaning that repository content is protected even during severe
+malfunctions.
 
-Fossil uses three separate SQLite databases:
+Fossil uses three separate classes of SQLite databases:
 
 <ol>
 <li>The configuration database
 <li>Repository databases
 <li>Checkout databases
@@ -52,10 +52,21 @@
 repository database per project. The repository database is the
 file that people are normally referring to when they say
 "a Fossil repository". The checkout database is found in the working
 checkout for a project and contains state information that is unique
 to that working checkout.
+
+Fossil does not always use all three database files. The web interface,
+for example, typically only uses the repository database. And the
+[/help/setting | fossil setting] command only opens the configuration database
+when the --global option is used. But other commands use all three
+databases at once. For example, the [/help/status | fossil status]
+command will first locate the checkout database, then use the checkout
+database to find the repository database, then open the configuration
+database. Whenever multiple databases are used at the same time,
+they are all opened on the same SQLite database connection using
+SQLite's [http://www.sqlite.org/lang_attach.html | ATTACH] command.
 
 The chart below provides a quick summary of how each of these
 database files are used by Fossil, with detailed discussion following.
 
 <center><table border="1" width="80%" cellpadding="0">
@@ -80,10 +91,11 @@
 </ul>
 </td>
 <td width="33%" valign="top">
 <h3 align="center">Checkout Database<br>"_FOSSIL_"</h3>
 <ul>
+<li>The repository database used by this checkout
 <li>The version currently checked out
 <li>Other versions [/help/merge | merged] in but not
 yet [/help/commit | committed]
 <li>Changes from the [/help/add | add], [/help/delete | delete],
 and [/help/rename | rename] commands that have not yet been committed
@@ -97,10 +109,71 @@
 </table>
 </center>
 
 <h3>2.1 The Configuration Database</h3>
 
+The configuration database holds cross-repository preferences and a list of all
+repositories for a single user.
+
+The [/help/setting | fossil setting] command can be used to specify various
+operating parameters and preferences for Fossil repositories. Settings can
+apply to a single repository, or they can apply globally to all repositories
+for a user. If both a global and a repository value exist for a setting,
+then the repository-specific value takes precedence. All of the settings
+have reasonable defaults, and so many users will never need to change them.
+But if changes to settings are desired, the configuration database provides
+a way to change settings for all repositories with a single command, rather
+than having to change the setting individually on each repository.
 
+The configuration database also maintains a list of repositories. This
+list is used by the [/help/all | fossil all] command in order to run various
+operations such as "sync" or "rebuild" on all repositories managed by a user.
 
+On Unix systems, the configuration database is named ".fossil" and is
+located in the user's home directory. On Windows, the configuration
+database is named "_fossil" (using an underscore as the first character
+instead of a dot) and is located in the directory specified by the
+LOCALAPPDATA, APPDATA, or HOMEPATH environment variables, in that order.
 
 <h3>2.2 Repository Databases</h3>
+
+The repository database is the file that is commonly referred to as
+"the repository". This is because the repository database contains,
+among other things, the complete revision, ticket, and wiki history for
+a project. It is customary to name the repository database after the
+name of the project, with a ".fossil" suffix. For example, the repository
+database for the self-hosting Fossil repository is called "fossil.fossil"
+and the repository database for SQLite is called "sqlite.fossil".
+
+<h4>2.2.1 Global Project State</h4>
+
+The bulk of the repository database (typically 75 to 85%) consists
+of the artifacts that comprise the
+[./fileformat.wiki | enduring, global, shared state] of the project.
+The artifacts are stored as BLOBs, compressed using
+[http://www.zlib.net/ | zlib compression] and, where applicable,
+using [./delta_encoder_algorithm.wiki | delta compression].
+The combination of zlib and delta compression results in a considerable
+space savings. For the SQLite project, at the time of this writing,
+the total size of all artifacts is over 1.7 GB but thanks to the
+combined zlib and delta compression, that content only takes up
+51.4 MB of space in the repository database, for a compression ratio
+of about 33 to 1.
+
+Note that the zlib and delta compression is not an inherent part of
+the Fossil file format; it is just an optimization.
+The enduring file format for Fossil is the unordered
+set of artifacts and the compression techniques are just a detail of
+how the current implementation of Fossil happens to store these artifacts
+efficiently on disk.
+
+All of the original uncompressed and undeltaed artifacts can be extracted
+from a Fossil repository database using
+the [/help/deconstruct | fossil deconstruct]
+command. Going the other way, the [/help/reconstruct | fossil reconstruct]
+command will scan a directory hierarchy and add all files found to
+a new repository database. The [/help/artifact | fossil artifact] command
+can be used to extract individual artifacts from the repository database.
+
+
+
 <h3>2.3 Checkout Databases</h3>
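
The text added to section 2.0 in this check-in says that when several databases are in use they are opened on one SQLite connection with ATTACH. A minimal Python sketch of that mechanism follows. It is not Fossil's code: the file names, the schema name `configdb`, and the tables `demo_repo` and `demo_global` are all hypothetical and exist only for illustration.

```python
import os
import sqlite3
import tempfile


def open_attached(repo_path, config_path):
    """Open repo_path as the main database and attach config_path to it."""
    conn = sqlite3.connect(repo_path)
    # ATTACH makes a second database file visible on the same connection,
    # addressed by the schema name that follows AS.
    conn.execute("ATTACH DATABASE ? AS configdb", (config_path,))
    return conn


def attach_demo():
    """Create two throwaway database files and query both through one connection."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file names; Fossil's real files and schema differ.
        repo_path = os.path.join(tmp, "project.fossil")
        config_path = os.path.join(tmp, "config.db")
        conn = open_attached(repo_path, config_path)
        try:
            conn.execute("CREATE TABLE main.demo_repo(name TEXT)")
            conn.execute("CREATE TABLE configdb.demo_global(name TEXT)")
            conn.execute("INSERT INTO main.demo_repo VALUES ('repo-row')")
            conn.execute("INSERT INTO configdb.demo_global VALUES ('config-row')")
            conn.commit()
            # A single statement can read from both files.
            rows = conn.execute(
                "SELECT name FROM main.demo_repo "
                "UNION ALL SELECT name FROM configdb.demo_global"
            ).fetchall()
            # PRAGMA database_list reports every schema on the connection.
            schemas = [row[1] for row in conn.execute("PRAGMA database_list")]
        finally:
            conn.close()
    return rows, schemas
```

Calling `attach_demo()` returns the rows read from both files together with the schema names on the connection, which include `main` and `configdb`.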
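
The new section 2.1 text states that a repository-specific setting takes precedence over a global one, and that every setting has a default. The lookup order can be sketched as below. This is an illustration only: Fossil keeps these values in its SQLite databases, whereas the sketch uses plain dictionaries, and the function name `effective_setting` and the sample values are assumptions.

```python
from collections import ChainMap


def effective_setting(name, repo_settings, global_settings, defaults):
    """Return the value for name, preferring repository over global over default."""
    # ChainMap searches its mappings from left to right, so the first
    # mapping that contains the key supplies the value.
    layered = ChainMap(repo_settings, global_settings, defaults)
    return layered[name]


# Hypothetical sample values, used only to show the lookup order.
defaults = {"autosync": "on", "editor": ""}
global_settings = {"editor": "vim"}
repo_settings = {"autosync": "off"}

autosync = effective_setting("autosync", repo_settings, global_settings, defaults)
editor = effective_setting("editor", repo_settings, global_settings, defaults)
```

Here `autosync` resolves to the repository value and `editor` falls through to the global value, matching the precedence rule described in the diff.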
