Fossil SCM

Update to the [/doc/trunk/www/stats.wiki | Stats] document.

drh 2012-02-25 15:31 trunk
Commit affb0019c9068467a6fe7cfbc76d0ca233721be3
1 file changed +92 -98
--- www/stats.wiki
+++ www/stats.wiki
@@ -2,13 +2,14 @@
 <h1 align="center">Performance Statistics</h1>
 
 The questions will inevitably arise: How does Fossil perform?
 Does it use a lot of disk space or bandwidth? Is it scalable?
 
-In an attempt to answers these questions, this report looks at five
+In an attempt to answers these questions, this report looks at several
 projects that use fossil for configuration management and examines how
 well they are working. The following table is a summary of the results.
+(Last updated on 2012-02-26.)
 Explanation and analysis follows the table.
 
 <table border=1>
 <tr>
 <th>Project</th>
@@ -21,122 +22,113 @@
 <th>Compression Ratio</th>
 <th>Clone Bandwidth</th>
 </tr>
 
 <tr align="center">
-<td>SQLite
-<td>28643
-<td>6755
-<td>3373&nbsp;days<br>9.24&nbsp;yrs
-<td>2.00
-<td>1.27&nbsp;GB
-<td>35.4&nbsp;MB
-<td>35:1
-<td>982&nbsp;KB&nbsp;up<br>12.4&nbsp;MB&nbsp;down
-</tr>
-
-<tr align="center">
-<td>Fossil
-<td>4981
-<td>1272
-<td>764&nbsp;days<br>2.1&nbsp;yrs
-<td>1.66
-<td>144&nbsp;MB
-<td>8.74&nbsp;MB
-<td>16:1
-<td>128&nbsp;KB&nbsp;up<br>4.49&nbsp;MB&nbsp;down
-</tr>
-
-<tr align="center">
-<td>SLT
-<td>2062
-<td>67
-<td>266&nbsp;days
-<td>0.25
-<td>1.76&nbsp;GB
-<td>147&nbsp;MB
-<td>11:1
-<td>1.1&nbsp;MB&nbsp;up<br>141&nbsp;MB&nbsp;down
-</tr>
-
-<tr align="center">
-<td>TH3
-<td>1999
-<td>429
-<td>331&nbsp;days
-<td>1.30
-<td>70.5&nbsp;MB
-<td>6.3&nbsp;MB
-<td>11:1
-<td>55&nbsp;KB&nbsp;up<br>4.66&nbsp;MB&nbsp;down
-</tr>
-
-<tr align="center">
-<td>SQLite Docs
-<td>1787
-<td>444
-<td>650&nbsp;days<br>1.78&nbsp;yrs
-<td>0.68
-<td>43&nbsp;MB
-<td>4.9&nbsp;MB
-<td>8:1
-<td>46&nbsp;KB&nbsp;up<br>3.35&nbsp;MB&nbsp;down
+<td>[http://www.sqlite.org/src/timeline | SQLite]
+<td>41113
+<td>9943
+<td>4290&nbsp;days<br>11.75&nbsp;yrs
+<td>2.32
+<td>2.09&nbsp;GB
+<td>33.2&nbsp;MB
+<td>63:1
+<td>23.2&nbsp;MB
+</tr>
+
+<tr align="center">
+<td>[http://core.tcl.tk/tcl/timeline | TCL]
+<td>74806
+<td>13541
+<td>5085&nbsp;days<br>13.92&nbsp;yrs
+<td>2.66
+<td>5.2&nbsp;GB
+<td>86&nbsp;MB
+<td>60:1
+<td>67.0&nbsp;MB
+</tr>
+
+<tr align="center">
+<td>[/timeline | Fossil]
+<td>15561
+<td>3764
+<td>1681&nbsp;days<br>4.6&nbsp;yrs
+<td>2.24
+<td>721&nbsp;MB
+<td>18.8&nbsp;MB
+<td>38:1
+<td>12.0&nbsp;MB
+</tr>
+
+<tr align="center">
+<td>[http://www.sqlite.org/slt/timeline | SLT]
+<td>2174
+<td>100
+<td>1183&nbsp;days<br>3.24&nbsp;yrs
+<td>0.08
+<td>1.94&nbsp;GB
+<td>143&nbsp;MB
+<td>12:1
+<td>141&nbsp;MB
+</tr>
+
+<tr align="center">
+<td>[http://www.sqlite.org/th3.html | TH3]
+<td>5624
+<td>1472
+<td>1248&nbsp;days<br>3.42&nbsp;yrs
+<td>1.78
+<td>252&nbsp;MB
+<td>12.5&nbsp;MB
+<td>20:1
+<td>12.2&nbsp;MB
+</tr>
+
+<tr align="center">
+<td>[http://www.sqlite.org/docsrc/timeline | SQLite Docs]
+<td>3664
+<td>1003
+<td>1567&nbsp;days<br>4.29&nbsp;yrs
+<td>0.64
+<td>108&nbsp;MB
+<td>6.6&nbsp;MB
+<td>16:1
+<td>5.71&nbsp;MB
 </tr>
 
 </table>
 
-<h2>The Five Projects</h2>
-
-The five projects listed above were chosen because they have been in
-existance for a long time (relative to the age of fossil) or because
-they have larges amounts of content. The most important project using
-fossil is SQLite. Fossil itself
-is built on top of SQLite and so obviously SQLite has to predate fossil.
-SQLite was originally versioned using CVS, but recently the entire 9-year
-and 320-MB CVS history of SQLite was converted over to Fossil. This is
-an important datapoint because it demonstrates fossil's ability to manage
-a significant and long-running project.
-The next-longest running fossil project is fossil itself, at 2.1 years.
-The documentation for SQLite
-(identified above as "SQLite Docs") was split off of the main SQLite
-source tree and into its own fossil repository about 1.75 years ago.
-The "SQL Logic Test" or "SLT" project is a massive
-collection of SQL statements and their output used to compare the
-processing of SQLite against MySQL, PostgreSQL, Microsoft SQL Server,
-and Oracle.
-Finally "TH3" is a proprietary set of test cases for SQLite used to give
-100% branch test coverage of SQLite on embedded platforms. All projects
-except for TH3 are open-source.
-
 <h2>Measured Attributes</h2>
 
-In fossil, every version of every file, every wiki page, every change to
+In Fossil, every version of every file, every wiki page, every change to
 every ticket, and every check-in is a separate "artifact". One way to
-think of a fossil project is as a bag of artifacts. Of course, there is
-a lot more than this going on in fossil. Many of the artifacts have meaning
+think of a Fossil project is as a bag of artifacts. Of course, there is
+a lot more than this going on in Fossil. Many of the artifacts have meaning
 and are related to other artifacts. But at a low level (for example when
 synchronizing two instances of the same project) the only thing that matters
 is the unordered collection of artifacts. In fact, one of the key
-characteristics of fossil is that the entire project history can be
+characteristics of Fossil is that the entire project history can be
 reconstructed simply by scanning the artifacts in an arbitrary order.
 
 The number of check-ins is the number of times that the "commit" command
 has been run. A single check-in might change a 3 or 4 files, or it might
-change several dozen different files. Regardless of the number of files
+change dozens or hundreds of files. Regardless of the number of files
 changed, it still only counts as one check-in.
 
 The "Uncompressed Size" is the total size of all the artifacts within
-the fossil repository assuming they were all uncompressed and stored
+the repository assuming they were all uncompressed and stored
 separately on the disk. Fossil makes use of delta compression between related
 versions of the same file, and then uses zlib compression on the resulting
 deltas. The total resulting repository size is shown after the uncompressed
-size.
+size. For this chart, "fossil rebuild --compress" was run on each repository
+prior to measuring its compressed size. Repository sizes would typically
+be 20% larger without that rebuild.
 
 On the right end of the table, we show the "Clone Bandwidth". This is the
-total number of bytes sent from client to server ("uplink") and from server
-back to client ("downlink") in order to clone a repository. These byte counts
-include HTTP protocol overhead.
+total number of bytes sent from server back to the client. The number of
+bytes sent from client to server is neglible in comparison.
+These byte counts include HTTP protocol overhead.
 
 In the table and throughout this article,
 "GB" means gigabytes (10<sup><small>9</small></sup> bytes)
 not <a href="http://en.wikipedia.org/wiki/Gibibyte">gibibytes</a>
 (2<sup><small>30</small></sup> bytes). Similarly, "MB" and "KB"
@@ -144,22 +136,24 @@
 
 <h2>Analysis And Supplimental Data</h2>
 
 Perhaps the two most interesting datapoints in the above table are SQLite
 and SLT. SQLite is a long-running project with long revision chains.
-Some of the files in SQLite have been edited close to a thousand times.
+Some of the files in SQLite have been edited over a thousand times.
 Each of these edits is stored as a delta, and hence the SQLite project
-gets excellent 35:1 compression. SLT, on the other hand, consists of
+gets excellent 63:1 compression. SLT, on the other hand, consists of
 many large (megabyte-sized) SQL scripts that have one or maybe two
-versions. There is very little delta compression occurring and so the
+edits each. There is very little delta compression occurring and so the
 overall repository compression ratio is much lower. Note also that
 quite a bit more bandwidth is required to clone SLT than SQLite.
 
 For the first nine years of its development, SQLite was versioned by CVS.
 The resulting CVS repository measured over 320MB in size. So, the
-developers were
-pleasently surprised to see that this entire project could be cloned in
-fossil using only about 13MB of network traffic. The "sync" protocol
+developers were surprised to see that this entire project could be cloned in
+fossil using only about 23.2MB of network traffic. (This 23.2MB includes
+all the changes to SQLite that have been made since the conversion from
+CVS. Of those changes are omitted, the clone bandwidth drops to 13MB.)
+The "sync" protocol
 used by fossil has turned out to be surprisingly efficient. A typical
 check-in on SQLite might use 3 or 4KB of network bandwidth total. Hardly
 worth measuring. The sync protocol is efficient enough that, once cloned,
-fossil could easily be used over a dial-up connection.
+Fossil could easily be used over a dial-up connection.
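
The compression ratios in the updated table rows can be cross-checked against the size cells. The following is a minimal sketch in Python, not part of the commit. It assumes that the sixth and seventh cells of each row are the uncompressed and repository sizes (their column headers fall outside the hunk) and that the listed ratio is their quotient rounded to the nearest integer. SLT is left out because its rounded sizes do not reproduce the listed 12:1 under these assumptions.

```python
# Decimal units, as the stats document states: 1 GB = 10**9 bytes, 1 MB = 10**6 bytes.
MB = 10**6
GB = 10**9

# (uncompressed size, repository size) copied from the "+" rows of the diff.
# Which cell is which is an assumption; the column headers are not in the hunk.
sizes = {
    "SQLite": (2.09 * GB, 33.2 * MB),
    "TCL": (5.2 * GB, 86 * MB),
    "Fossil": (721 * MB, 18.8 * MB),
    "TH3": (252 * MB, 12.5 * MB),
    "SQLite Docs": (108 * MB, 6.6 * MB),
}


def compression_ratio(uncompressed, compressed):
    """Return how many times larger the uncompressed artifacts are."""
    return uncompressed / compressed


# Round to the nearest integer to compare with the "N:1" cells.
ratios = {name: round(compression_ratio(u, c)) for name, (u, c) in sizes.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}:1")
```

For these five rows the rounded quotients agree with the ratios in the "+" lines (63, 60, 38, 20 and 16).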
