Fossil SCM

Documentation updates

drh 2007-07-24 12:52 UTC trunk
Commit b807acf62eecfbf884fa10244ea59d78f39ac0a9
+169 -85
--- www/fileformat.html
+++ www/fileformat.html
@@ -1,95 +1,179 @@
<html>
<head>
-<title>Fossil File Formats</title>
+<title>Fossil File Format</title>
</head>
<body bgcolor="white">
<h1 align="center">
Fossil File Formats
</h1>

<p>
The global state of a fossil repository is determined by an unordered
-set of content files. Each of these files has a format which is defined
-by this document.
-</p>
-
-<h2>1.0 General Formatting Rules</h2>
-
-<p>
-Fossil content files consist of a header, a blank line, and optional
-content.
-</p>
-
-<p>
-The header is divided into "properties" by newline ('\n', 0x0a)
-characters. Each header property is divided into tokens by space (' ', 0x20)
-characters. The first token of each property is the property name.
-Subsequent tokens (if any) are arguments to the property.
-</p>
-
-<p>
-The blank line that separates the header from the content can be
-thought of as a property line that contains no tokens. Everything
-that follows the newline character that terminates the blank line
-is content. The blank line is always present but the content is
-optional.
-</p>
-
-<p>
-All tokens in a property line are encoded to escape special characters.
-The encoding is as follows:
-</p>
-
-<blockquote>
-<table border="1">
-<tr><th>Input Character</th><th>Encoded As</th></tr>
-<tr><td align="center"> space (0x20) </td><td align="center"> \s </td></tr>
-<tr><td align="center"> newline (0x0A) </td><td align="center"> \n </td></tr>
-<tr><td align="center"> carriage return (0x0D) </td><td align="center"> \r </td></tr>
-<tr><td align="center"> tab (0x09) </td><td align="center"> \t </td></tr>
-<tr><td align="center"> vertical tab (0x0B) </td><td align="center"> \v </td></tr>
-<tr><td align="center"> formfeed (0x0C) </td><td align="center"> \f </td></tr>
-<tr><td align="center"> nul (0x00) </td><td align="center"> \0 </td></tr>
-<tr><td align="center"> backslash (0x5C) </td><td align="center"> \\ </td></tr>
-</table>
-</blockquote>
-
-<p>
-Characters other than the ones shown in the table above are passed through
-the encoder without change.
-</p>
-
-<p>
-All properties names are unpunctuated lower-case ASCII strings.
-The properties appear in the header in sorted order (using
-memcpy() as the comparision function) except for the "signature"
-property which always occurs first.
-</p>
-
-<h2>2.0 Common Properties</h2>
-
-<p>
-Every content file has a "time" property. The argument to the
-time property is an integer which is the number of seconds since
-1970 UTC when the content file was created. For example:
-</p>
-
-<blockquote>
-time 1181404746
-</blockquote>
-
-<p>
-Every content file has a "type" property. The argument to the
-type property defines the purpose of the content file. The
-argument can be strings like "version", "folder", "file", or "user".
-</p>
-
-<p>
-The first property of a content file is the digital signature. The
-name of the signature property is "signature". There are two arguments.
-The first argument is the SHA256 hash of the content file that defines
-the user who signed this file. User records themselves are self-signed
-and so the first argument is simply "*" for user records. The second
-argument is the digital signature of an SHA256 hash of the entire
-file (header and content) except for the signature line itself.
-</p>
+set of files. Some files, namely those that represent wiki pages, trouble
+tickets, and the special "manifest" file, have a specific and well-defined
+format. Other files are just uninterpreted content. Files can be text or
+binary.
+</p>
+
+<p>
+Each file in the repository is named by its SHA1 hash.
+Some files have a particular format which qualifies them
+as "manifests". A manifest assigns filenames to a subset
+of the files in the repository, in order to provide a
+snapshot of the state of the project at a point in time.
+Each manifest file corresponds to a version or baseline
+of the project.
+</p>
+
+<h2>1.0 The Manifest File</h2>
+
+<p>
+Any file in the repository that follows the syntactic rules
+of a manifest is a manifest. Note that a manifest can
+be both a real manifest and also a content file, though this
+is rare.
+</p>
+
+<p>
+A manifest is a line-oriented text file. Newline characters
+(ASCII 0x0a) separate lines. Each line begins with a
+single-character "line type". Zero or more arguments may follow
+the line type. All arguments are separated from each other
+and from the line-type character by a single space
+character. There is no surplus whitespace between arguments
+and no leading or trailing whitespace except for the newline
+character that acts as the line separator.
+</p>
+
+<p>
+All lines of the manifest occur in strict sorted lexicographical order.
+No line may be duplicated.
+The entire manifest file may be PGP clear-signed, but otherwise it
+may contain no additional text or data beyond what is described here.
+</p>
+
+<p>
+Allowed lines in the manifest are as follows:
+</p>
+
+<blockquote>
+<b>C</b> <i>checkin-comment</i><br>
+<b>D</b> <i>time-and-date-stamp</i><br>
+<b>F</b> <i>filename</i> <i>SHA1-hash</i><br>
+<b>P</b> <i>SHA1-hash</i>+<br>
+<b>R</b> <i>repository-checksum</i><br>
+<b>U</b> <i>user-login</i><br>
+<b>Z</b> <i>manifest-checksum</i>
+</blockquote>
+
+<p>
+A manifest must have exactly one C-line. The sole argument to
+the C-line is a check-in comment that describes the baseline that
+the manifest defines. The check-in comment is text. The following
+escape sequences are applied to the text:
+A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73). A
+newline (ASCII 0x0a) is "\n" (ASCII 0x5C, 0x6E). A backslash
+(ASCII 0x5C) is represented as two backslashes "\\". Apart from
+space and newline, no other whitespace characters are allowed in
+the check-in comment. Nor are any unprintable characters allowed
+in the comment.
+</p>
+
+<p>
+A manifest must have exactly one D-line. The sole argument to
+the D-line is a date-time stamp in the ISO8601 format. The
+date and time should be in coordinated universal time (UTC).
+The format is:
+</p>
+
+<blockquote>
+<i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i>
+</blockquote>
+
+<p>
+A manifest has zero or more F-lines. Each F-line defines a file
+(other than the manifest itself) which is part of the baseline that
+the manifest defines. There are two arguments. The first argument
+is the pathname of the file in the baseline relative to the root
+of the project file hierarchy. No ".." or "." directories are allowed
+within the filename. Space characters are escaped as in C-line
+comment text. Backslash characters and newlines are not allowed
+within filenames. The directory separator character is a forward
+slash (ASCII 0x2F). The second argument to the F-line is the
+full 40-character hexadecimal SHA1 hash of the file content.
+Upper-case letters ABCDEF are used for the higher digits of the
+hexadecimal.
+</p>
+
+<p>
+A manifest has zero or one P-lines. Most manifests have one P-line.
+The P-line has a varying number of arguments that
+define other manifests from which the current manifest
+is derived. Each argument is a 40-character uppercase
+hexadecimal SHA1 of the predecessor manifest. All arguments
+to the P-line must be unique to that line.
+The first predecessor is the manifest's direct ancestor.
+Other arguments define manifests with which the first was
+merged to yield the current manifest. Most manifests have
+a P-line with a single argument. The first manifest in the
+project has no ancestors and thus has no P-line.
+</p>
+
+<p>
+A manifest may optionally have a single R-line. The R-line has
+a single argument which is the MD5 checksum of all files in
+the baseline except the manifest itself. The checksum is expressed
+as 32 characters of uppercase hexadecimal. The checksum is
+computed as follows: For each file in the baseline (except for
+the manifest itself) in strict sorted lexicographical order,
+take the pathname of the file relative to the root of the
+repository, append a single space (ASCII 0x20), the
+size of the file in ASCII decimal, a single newline
+character (ASCII 0x0A), and the complete text of the file.
+Compute the MD5 checksum of the result.
+</p>
+
+<p>
+Each manifest has a single U-line. The argument to the U-line is
+the login of the user who created the manifest. The login name
+is encoded using the same character escapes as are used for the
+check-in comment argument to the C-line.
+</p>
+
+<p>
+A manifest has an optional Z-line as its last line. The argument
+to the Z-line is a 32-character uppercase hexadecimal MD5 hash
+of all prior lines of the manifest up to and including the newline
+character that immediately precedes the "Z". The Z-line is just
+a sanity check to prove that the manifest is well-formed and
+consistent.
+</p>
+
+<h2>2.0 Trouble Tickets</h2>
+
+<p>
+Each trouble ticket is a file in the repository and appears in
+a manifest for every baseline in which the ticket exists.
+Trouble tickets occur in a specific subdirectory of the file
+hierarchy. The name of the subdirectory that contains tickets
+is part of the local state of each repository. The filename
+of each trouble ticket has a ".tkt" suffix. The trouble ticket
+has a particular file format defined below.
+</p>
+
+<i>To be continued...</i>
+
+<h2>3.0 Wiki Pages</h2>
+
+<p>
+Each wiki page is a file in the repository and appears in
+a manifest for every baseline in which that wiki page exists.
+Wiki pages occur in a specific subdirectory of the file
+hierarchy. The name of the subdirectory that contains wiki pages
+is part of the local state of each repository. The filename
+of each wiki page has a ".wiki" suffix. The base name of
+the file is the name of the wiki page. The wiki pages
+have a particular file format defined below.
+</p>
+
+<i>To be continued...</i>
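The line syntax and escape rules that the new fileformat.html describes can be sketched in a few lines of Python. This is only an illustration of the rules as written in the diff, not Fossil's own parser: the function names `encode_token`, `decode_token` and `parse_manifest` are hypothetical, PGP clear-signing is not handled, and "strict sorted lexicographical order" is assumed to mean plain code-point ordering of ASCII lines.

```python
# Escape sequences named in the C-line description: "\s", "\n" and "\\".
_ESCAPES = {"s": " ", "n": "\n", "\\": "\\"}


def encode_token(text: str) -> str:
    """Escape backslash, space and newline so the text fits in one argument."""
    # Backslashes must be doubled first so later replacements stay unambiguous.
    return text.replace("\\", "\\\\").replace(" ", "\\s").replace("\n", "\\n")


def decode_token(token: str) -> str:
    """Reverse encode_token(); reject escapes the document does not define."""
    out = []
    i = 0
    while i < len(token):
        ch = token[i]
        if ch == "\\":
            if i + 1 >= len(token) or token[i + 1] not in _ESCAPES:
                raise ValueError(f"bad escape at offset {i}")
            out.append(_ESCAPES[token[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_manifest(text: str) -> list[tuple[str, list[str]]]:
    """Split a manifest into (line-type, arguments) pairs, checking the syntax."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # the final newline terminates the last line
    # Assumption: "strict sorted" means ascending order with no duplicates.
    if lines != sorted(lines) or len(set(lines)) != len(lines):
        raise ValueError("lines must be sorted and unique")
    cards = []
    for line in lines:
        kind, *args = line.split(" ")
        # An empty argument would mean surplus, leading or trailing whitespace.
        if len(kind) != 1 or kind not in "CDFPRUZ" or "" in args:
            raise ValueError(f"malformed line: {line!r}")
        cards.append((kind, args))
    return cards
```

Calling `decode_token` on the argument of a C-line or U-line returned by `parse_manifest` recovers the original comment or login text.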
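The R-line and Z-line checksums described in fileformat.html are both plain MD5 digests, so they can be approximated with Python's standard `hashlib` module. The sketch below follows the wording of the diff only. The helper names are hypothetical, pathnames are assumed to be hashed unescaped and sorted by code point, and the Z-line is assumed to be the final line of an unsigned manifest.

```python
import hashlib


def repository_checksum(files: dict[str, bytes]) -> str:
    """R-line value: MD5 over 'pathname SP size NL content' for each file."""
    md5 = hashlib.md5()
    for path in sorted(files):  # assumption: plain code-point ordering
        content = files[path]
        md5.update(path.encode("utf-8"))
        md5.update(b" %d\n" % len(content))  # size in ASCII decimal
        md5.update(content)
    return md5.hexdigest().upper()  # 32 uppercase hexadecimal characters


def manifest_checksum(body: bytes) -> str:
    """Z-line value for the manifest text that precedes the Z-line."""
    return hashlib.md5(body).hexdigest().upper()


def verify_z_line(manifest: bytes) -> bool:
    """Return True if the manifest ends in a Z-line that matches its body."""
    idx = manifest.rfind(b"\nZ ")
    if idx < 0:
        return False  # no Z-line present (it is optional)
    body = manifest[: idx + 1]  # up to and including the newline before "Z"
    claimed = manifest[idx + 3 :].rstrip(b"\n")
    return claimed == manifest_checksum(body).encode("ascii")
```

A manifest without a Z-line is still legal under the format, so a `False` result from `verify_z_line` means only "not verified", and a caller would need to distinguish the missing case from a mismatch.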
+12 -15
--- www/index.html
+++ www/index.html
@@ -7,25 +7,26 @@

<p>
This is a preliminary homepage for a new software configuration
management system called "Fossil".
The code is currently under development, and has been for about
-a year. Nothing is available for download or inspection
-as of this writing (2007-06-09).
+two years. (We have iterated the design multiple times.)
+Nothing is available for download or inspection
+as of this writing (2007-07-24).
But the system is self-hosting now.
Hopefully something will be available soon.
</p>

-<p>Distinctive features of Fossil:</p>
+<p>Design Goals For Fossil:</p>

<ul>
<li>Supports disconnected, distributed development (like
<a href="http://kerneltrap.org/node/4982">git</a>,
<a href="http://www.venge.net/monotone/">monotone</a>,
<a href="http://www.selenic.com/mercurial/wiki/index.cgi">mercurial</a>, or
<a href="http://www.bitkeeper.com/">bitkeeper</a>)
-or tightly coupled client/server operation (like
+or client/server operation (like
<a href="http://www.nongnu.org/cvs/">CVS</a> or
<a href="http://subversion.tigris.org/">subversion</a>)
or both at the same time</li>
<li>Integrated bug tracking and wiki, along the lines of
<a href="http://www.cvstrac.org/">CVSTrac</a> and
@@ -38,29 +39,25 @@
trivial to install</li>
<li>Server runs as <a href="http://www.w3.org/CGI/">CGI</a>, using
<a href="http://en.wikipedia.org/wiki/inetd">inetd</a> or
<a href="http://www.xinetd.org/">xinetd</a> or using its own built-in,
standalone web server.</li>
-<li>The entire project contained in single disk file (which also
+<li>An entire project contained in a single disk file (which also
happens to be an <a href="http://www.sqlite.org/">SQLite</a> database.)</li>
-<li>Self sign-up (at the administrators discretion) including the
-ability to support secure anonymous check-ins (also optional).</li>
-<li>Digital signatures on all files, versions,
-<a href="http://wiki.org/wiki.cgi?WhatIsWiki">wiki</a> pages,
-trouble tickets, etc. Everything is digitally signed.</li>
<li>Trivial to setup and administer</li>
<li>Files and versions identified by their
-<a href="http://en.wikipedia.org/wiki/SHA-1">SHA-256</a> signature expressed
-in <a href="base32.html">base-32 notation</a>.
+<a href="http://en.wikipedia.org/wiki/SHA-1">SHA1</a> signature.
Any unique prefix is sufficient to identify a file
or version - usually the first 4 or 5 characters suffice.</li>
+<li>The file format is trivial and requires nothing more complex
+than a text editor and the "sha1sum" command-line utility to decode.</li>
<li>Automatic <a href="selfcheck.html">self-check</a>
on repository changes makes it exceedingly
unlikely that data will ever be lost because of a software bug.</li>
</ul>

-<p>Goals of fossil:</p>
+<p>Objectives Of Fossil:</p>

<ul>
<li>Fossil should be ridiculously easy to install and operate.</li>
<li>With fossil, it should be possible (and easy) to set up a project
on an inexpensive shared-hosting ISP
@@ -78,13 +75,13 @@

<p>Links:</p>

<ul>
<li><a href="pop.html">Principals Of Operation</a></li>
-<li>The <a href="base32.html">base-32 encoding</a> mechanism used
-by Fossil.</li>
+<li>The <a href="selfcheck.html">automatic self-check</a> mechanism
+helps ensure project integrity.</li>
<li>The <a href="fileformat.html">file format</a> used by every content
file stored in the repository.</li>
</ul>

</body>
</html>
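The index.html text above says that files and versions are identified by their SHA1 signature and that any unique prefix is enough to name one. A minimal sketch of that lookup, assuming the name is the hexadecimal SHA1 of the raw file bytes and that prefixes are matched case-insensitively (both function names are hypothetical, not part of Fossil):

```python
import hashlib


def artifact_name(content: bytes) -> str:
    """Name a file by the SHA1 of its bytes (lowercase, like sha1sum output)."""
    return hashlib.sha1(content).hexdigest()


def resolve_prefix(prefix: str, names: list[str]) -> str:
    """Return the single name that starts with prefix, or raise LookupError."""
    wanted = prefix.lower()
    matches = [n for n in names if n.lower().startswith(wanted)]
    if len(matches) != 1:
        raise LookupError(f"{prefix!r} matches {len(matches)} names")
    return matches[0]
```

With only a handful of files a 4- or 5-character prefix will usually be unique, but the function reports ambiguity instead of guessing when it is not.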
+23 -38
--- www/pop.html
+++ www/pop.html
@@ -27,18 +27,18 @@
for each repository is private to that repository.
The global state represents the content of the project.
The local state identifies the authorized users and
access policies for a particular repository.</p></li>

-<li><p>The global state of a repository is an mostly unordered
+<li><p>The global state of a repository is an unordered
collection of files. Each file is named by
-its SHA256 hash. The name is encoded as a 52-digit
-base-32 number. In many contexts, the name can be
+its SHA1 hash encoded in hexadecimal.
+In many contexts, the name can be
abbreviated to a unique prefix. A five- or six-character
prefix usually suffices to uniquely identify a file.</p></li>

-<li><p>Because files are named by their SHA256 hash, all files
+<li><p>Because files are named by their SHA1 hash, all files
are immutable. Any change to the content of a file also
changes the hash that forms the file's name, thus
creating a new file. Both the old original version of the
file and the new change are preserved under different names.</p></li>

@@ -45,52 +45,37 @@
4545
<li><p>It is theoretically possible for two files with different
4646
content to share the same hash. But finding two such
4747
files is so incredibly difficult and unlikely that we
4848
consider it to be an impossibility.</p></li>
4949
50
-<li><p>The files that comprise the global state of a repository
51
-consist of a header followed by optional content. Every
52
-file contains an RSA signature in the header. And every
53
-file contains a "file type" designator in the header.
54
-Additional information is also found in the header depending
55
-on the file type.</p></li>
56
-
57
-<li><p>The file that comprise the global state of a repository
50
+<li><p>The signature of a file is the SHA1 hash of the
51
+file itself, exactly as it appears on disk. No prefix
52
+or meta-information about the file is added before computing
53
+the hash. So you can
54
+always find the SHA1 signature of a file by using the
55
+"sha1sum" command-line utility.</p></li>
56
+
57
+<li><p>The files that comprise the global state of a repository
5858
are the complete global state of that repository. The SQLite
5959
database that holds the repository contains additional information
6060
about linkages between files, but all of that added information
61
-can be discarded and reconstructed by scanning the content
61
+can be discarded and reconstructed by rescanning the content
6262
files.</p></li>
6363
6464
<li><p>Two repositories for the same project can synchronize
6565
their global states simply by sharing files. The local
6666
state of repositories is not normally synchronized or
6767
shared.</p></li>
6868
69
-<li><p>The name of a file is its SHA256 hash in a base-32
70
-encoding. The digits of the base-32 encode are as
71
-follows:
72
-
73
-<blockquote><b>
74
- 0123456789abcdefghjkmnpqrstuvwxy
75
-</b></blockquote>
76
-
77
-<p>The letters "o", "i", and "l" are omitted from the
78
-encoding character set to avoid confusion with the
79
-digits "0" and "1". On input, upper and lower case
80
-letters are treated the same, the letter "o" is
81
-interpreted as a zero ("0") and the letters "i" and
82
-"l" are interpreted as a one ("1"). The full name of
83
-a file is 52 characters long. The first 4 bits of the
84
-SHA256 has are repeated onto the end of the hash so that
85
-the last digit in the base-32 encoding will contain a
86
-full 5 bits.
87
-For convenience, files
88
-may often be abbreviated to a unique prefix and the
89
-repository will automatically expand the name to
90
-its full 52 characters. In practice, 5 or 6
91
-characters are usually sufficient to give a unique
92
-name prefix to files even in the largest of projects.</p></li>
93
-</ul>
69
+<li><p>Every repository has a special file at the top-level
70
+named "manifest" which is an index of all other files in
71
+the system. The manifest is automatically created and
72
+maintained by the system.</p></li>
73
+
74
+<li><p>The <a href="fileformat.html">file format</a>
75
+is very simple so that with access
76
+to the original content files, one can easily reconstruct
77
+the content of a baseline without the need for any
78
+special tools or software.</p></li>
9479
9580
</body>
9681
</html>
9782
--- www/pop.html
+++ www/pop.html
@@ -27,18 +27,18 @@
27 for each repository is private to that repository.
28 The global state represents the content of the project.
29 The local state identifies the authorized users and
30 access policies for a particular repository.</p></li>
31
32 <li><p>The global state of a repository is an mostly unordered
33 collection of files. Each file is named by
34 its SHA256 hash. The name is encoded as a 52-digit
35 base-32 number. In many contexts, the name can be
36 abbreviated to a unique prefix. A five- or six-character
37 prefix usually suffices to uniquely identify a file.</p></li>
38
39 <li><p>Because files are named by their SHA256 hash, all files
40 are immutable. Any change to the content of a file also
41 changes the hash that forms the files name, thus
42 creating a new file. Both the old original version of the
43 file and the new change are preserved under different names.</p></li>
44
@@ -45,52 +45,37 @@
45 <li><p>It is theoretically possible for two files with different
46 content to share the same hash. But finding two such
47 files is so incredibly difficult and unlikely that we
48 consider it to be an impossibility.</p></li>
49
50 <li><p>The files that comprise the global state of a repository
51 consist of a header followed by optional content. Every
52 file contains an RSA signature in the header. And every
53 file contains a "file type" designator in the header.
54 Additional information is also found in the header depending
55 on the file type.</p></li>
56
57 <li><p>The file that comprise the global state of a repository
58 are the complete global state of that repository. The SQLite
59 database that holds the repository contains additional information
60 about linkages between files, but all of that added information
61 can be discarded and reconstructed by scanning the content
62 files.</p></li>
63
64 <li><p>Two repositories for the same project can synchronize
65 their global states simply by sharing files. The local
66 state of repositories is not normally synchronized or
67 shared.</p></li>
68
69 <li><p>The name of a file is its SHA256 hash in a base-32
70 encoding. The digits of the base-32 encode are as
71 follows:
72
73 <blockquote><b>
74 0123456789abcdefghjkmnpqrstuvwxy
75 </b></blockquote>
76
77 <p>The letters "o", "i", and "l" are omitted from the
78 encoding character set to avoid confusion with the
79 digits "0" and "1". On input, upper and lower case
80 letters are treated the same, the letter "o" is
81 interpreted as a zero ("0") and the letters "i" and
82 "l" are interpreted as a one ("1"). The full name of
83 a file is 52 characters long. The first 4 bits of the
84 SHA256 has are repeated onto the end of the hash so that
85 the last digit in the base-32 encoding will contain a
86 full 5 bits.
87 For convenience, files
88 may often be abbreviated to a unique prefix and the
89 repository will automatically expand the name to
90 its full 52 characters. In practice, 5 or 6
91 characters are usually sufficient to give a unique
92 name prefix to files even in the largest of projects.</p></li>
93 </ul>
94
95 </body>
96 </html>
97
--- www/pop.html
+++ www/pop.html
@@ -27,18 +27,18 @@
27 for each repository is private to that repository.
28 The global state represents the content of the project.
29 The local state identifies the authorized users and
30 access policies for a particular repository.</p></li>
31
32 <li><p>The global state of a repository is an unordered
33 collection of files. Each file is named by
34 its SHA1 hash encoded in hexadecimal.
35 In many contexts, the name can be
36 abbreviated to a unique prefix. A five- or six-character
37 prefix usually suffices to uniquely identify a file.</p></li>
38
39 <li><p>Because files are named by their SHA1 hash, all files
40 are immutable. Any change to the content of a file also
41 changes the hash that forms the files name, thus
42 creating a new file. Both the old original version of the
43 file and the new change are preserved under different names.</p></li>
44
@@ -45,52 +45,37 @@
45 <li><p>It is theoretically possible for two files with different
46 content to share the same hash. But finding two such
47 files is so incredibly difficult and unlikely that we
48 consider it to be an impossibility.</p></li>
49
50 <li><p>The signature of a file is the SHA1 hash of the
51 file itself, exactly as it appears on disk. No prefix
52 or meta-information about the file is added before computing
53 the hash. So you can
54 always find the SHA1 signature of a file by using the
55 "sha1sum" command-line utility.</p></li>
56
57 <li><p>The files that comprise the global state of a repository
58 are the complete global state of that repository. The SQLite
59 database that holds the repository contains additional information
60 about linkages between files, but all of that added information
61 can be discarded and reconstructed by rescanning the content
62 files.</p></li>
63
64 <li><p>Two repositories for the same project can synchronize
65 their global states simply by sharing files. The local
66 state of repositories is not normally synchronized or
67 shared.</p></li>
68
69 <li><p>Every repository has a special file at the top-level
70 named "manifest" which is an index of all other files in
71 the system. The manifest is automatically created and
72 maintained by the system.</p></li>
73
74 <li><p>The <a href="fileformat.html">file format</a>
75 is very simple so that with access
76 to the original content files, one can easily reconstruct
77 the content of a baseline without the need for any
78 special tools or software.</p></li>
79
80 </body>
81 </html>
82
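
Note on the naming scheme described in the new pop.html text: the name of a file is the SHA1 hash of its bytes exactly as they appear on disk, in hexadecimal, and a unique prefix can stand in for the full name. A minimal Python sketch of that idea follows. The function names `artifact_name` and `resolve_prefix` are hypothetical and are not taken from the fossil source; the in-memory list of names stands in for whatever index the repository actually keeps.

```python
import hashlib


def artifact_name(content: bytes) -> str:
    """Return the 40-character hexadecimal SHA1 of the raw bytes.

    No prefix or meta-information is added before hashing, so the result
    is the same digest that the "sha1sum" utility prints for the file.
    """
    return hashlib.sha1(content).hexdigest()


def resolve_prefix(prefix: str, names) -> str:
    """Expand an abbreviated name to the full name (hypothetical helper).

    Raises LookupError when the prefix matches no name or more than one,
    since only a unique prefix identifies a file.
    """
    wanted = prefix.lower()
    matches = [name for name in names if name.startswith(wanted)]
    if len(matches) != 1:
        raise LookupError(f"prefix {prefix!r} matches {len(matches)} names")
    return matches[0]


# Two tiny "files": changing the content changes the name.
names = [artifact_name(b"abc"), artifact_name(b"")]
full = resolve_prefix(names[0][:6], names)
```

Because the hash covers only the file bytes, any edit to the content yields a different name, which is why the text can treat stored files as immutable.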
+17 -38
--- www/selfcheck.html
+++ www/selfcheck.html
@@ -17,12 +17,12 @@
1717
1818
<h2>Atomic Check-ins With Rollback</h2>
1919
2020
<p>
2121
The fossil repository is an
22
-<a href="http://www.sqlite.org/">SQLite</a> database file. SQLite
23
-is very mature and stable and has been in wide-spread use for many
22
+<a href="http://www.sqlite.org/">SQLite version 3</a> database file.
23
+SQLite is very mature and stable and has been in wide-spread use for many
2424
years, so we have little worries that it might cause repository
2525
corruption. SQLite
2626
databases do not corrupt even if a program or system crash or power
2727
failure occurs in the middle of the update. If some kind of crash
2828
does occur in the middle of a change, then all the changes are rolled
@@ -56,14 +56,14 @@
5656
of every content file that it changes just prior to transaction
5757
commit. So during the course of check-in, many different files
5858
in the repository might be modified. Some files are simply
5959
compressed. Other files are delta encoded and then compressed.
6060
While all this is going on, fossil makes a record of every file
61
-that is encoded and the MD5 hash of the original content of that
61
+that is encoded and the SHA1 hash of the original content of that
6262
file. Then just before transaction commit, fossil re-extracts
6363
the original content of all files that were written, computes
64
-the MD5 checksum again, and verifies that the checksums match.
64
+the SHA1 checksum again, and verifies that the checksums match.
6565
If anything does not match up, an error
6666
message is printed and the transaction rolls back.
6767
</p>
6868
6969
<p>
@@ -72,40 +72,19 @@
7272
Hence bugs in fossil are unlikely to corrupt the repository in
7373
a way that prevents us from extracting historical versions of
7474
files.
7575
</p>
7676
77
-<h2>Checksums on all files and versions</h2>
78
-
79
-<p>
80
-Repository records of type "file" (records that hold the content
81
-of project files) contain a "cksum" property which records the
82
-MD5 checksum of the content of that file. So if something goes
83
-wrong in the file extraction process we will at least know about
84
-it. This checksum is in addition to the digital signature that
85
-is over the entire header and content of the record.
86
-</p>
87
-
88
-<p>
89
-Repository records of type "version" contain a "cksum"
90
-property that holds the MD5 checksum of the concatenation of
91
-every file in the entire project. During a check-in, after
92
-fossil has inserted all changes into the repository, it goes
93
-back and rereads every file out of the repository and recomputes
94
-this global checksum based on the respository content. It then
95
-computes an MD5 checksum over the files on disk. If these two
96
-checksums do not match, the check-in files and rolls back.
97
-Thus if a check-in transaction is successful, we have high
98
-confidence that the content in the repository exactly matches
99
-the content on disk.
100
-</p>
101
-
102
-<p>
103
-Every project files is verified by three separate checksums.
104
-There is an SHA256 checksum used as part of the digital signature
105
-on the file. There is an MD5 checksum on the content of each
106
-individual file. And there is a global MD5 checksum over the
107
-entire project source tree. If any of these cross-checks do not
108
-match then the operation fails and an error is displayed. Taken
109
-together, these cross-checks give us high confidence that the
110
-files you checked out are identical to the files you checked in.
77
+<h2>Checksum Over All Files In A Baseline</h2>
78
+
79
+<p>
80
+Manifest files that define a baseline have two fields (the
81
+R-line and Z-line) that record MD5 hashs of the manifest itself
82
+and of all other files in the manifest. Prior to any check-in
83
+commit, these checksums are verified to ensure that the baseline
84
+checked in agrees exactly with what is on disk. Similarly,
85
+the repository checksum is verified after a checkout to make
86
+sure that the entire repository was checked out correctly.
87
+Note that these added checks use a different hash (MD5 instead
88
+of SHA1) in order to avoid common-mode failures in the hash
89
+algorithm implementation.
11190
</p>
11291
--- www/selfcheck.html
+++ www/selfcheck.html
@@ -17,12 +17,12 @@
17
18 <h2>Atomic Check-ins With Rollback</h2>
19
20 <p>
21 The fossil repository is an
22 <a href="http://www.sqlite.org/">SQLite</a> database file. SQLite
23 is very mature and stable and has been in wide-spread use for many
24 years, so we have little worries that it might cause repository
25 corruption. SQLite
26 databases do not corrupt even if a program or system crash or power
27 failure occurs in the middle of the update. If some kind of crash
28 does occur in the middle of a change, then all the changes are rolled
@@ -56,14 +56,14 @@
56 of every content file that it changes just prior to transaction
57 commit. So during the course of check-in, many different files
58 in the repository might be modified. Some files are simply
59 compressed. Other files are delta encoded and then compressed.
60 While all this is going on, fossil makes a record of every file
61 that is encoded and the MD5 hash of the original content of that
62 file. Then just before transaction commit, fossil re-extracts
63 the original content of all files that were written, computes
64 the MD5 checksum again, and verifies that the checksums match.
65 If anything does not match up, an error
66 message is printed and the transaction rolls back.
67 </p>
68
69 <p>
@@ -72,40 +72,19 @@
72 Hence bugs in fossil are unlikely to corrupt the repository in
73 a way that prevents us from extracting historical versions of
74 files.
75 </p>
76
77 <h2>Checksums on all files and versions</h2>
78
79 <p>
80 Repository records of type "file" (records that hold the content
81 of project files) contain a "cksum" property which records the
82 MD5 checksum of the content of that file. So if something goes
83 wrong in the file extraction process we will at least know about
84 it. This checksum is in addition to the digital signature that
85 is over the entire header and content of the record.
86 </p>
87
88 <p>
89 Repository records of type "version" contain a "cksum"
90 property that holds the MD5 checksum of the concatenation of
91 every file in the entire project. During a check-in, after
92 fossil has inserted all changes into the repository, it goes
93 back and rereads every file out of the repository and recomputes
94 this global checksum based on the respository content. It then
95 computes an MD5 checksum over the files on disk. If these two
96 checksums do not match, the check-in files and rolls back.
97 Thus if a check-in transaction is successful, we have high
98 confidence that the content in the repository exactly matches
99 the content on disk.
100 </p>
101
102 <p>
103 Every project files is verified by three separate checksums.
104 There is an SHA256 checksum used as part of the digital signature
105 on the file. There is an MD5 checksum on the content of each
106 individual file. And there is a global MD5 checksum over the
107 entire project source tree. If any of these cross-checks do not
108 match then the operation fails and an error is displayed. Taken
109 together, these cross-checks give us high confidence that the
110 files you checked out are identical to the files you checked in.
111 </p>
112
--- www/selfcheck.html
+++ www/selfcheck.html
@@ -17,12 +17,12 @@
17
18 <h2>Atomic Check-ins With Rollback</h2>
19
20 <p>
21 The fossil repository is an
22 <a href="http://www.sqlite.org/">SQLite version 3</a> database file.
23 SQLite is very mature and stable and has been in wide-spread use for many
24 years, so we have little worries that it might cause repository
25 corruption. SQLite
26 databases do not corrupt even if a program or system crash or power
27 failure occurs in the middle of the update. If some kind of crash
28 does occur in the middle of a change, then all the changes are rolled
@@ -56,14 +56,14 @@
56 of every content file that it changes just prior to transaction
57 commit. So during the course of check-in, many different files
58 in the repository might be modified. Some files are simply
59 compressed. Other files are delta encoded and then compressed.
60 While all this is going on, fossil makes a record of every file
61 that is encoded and the SHA1 hash of the original content of that
62 file. Then just before transaction commit, fossil re-extracts
63 the original content of all files that were written, computes
64 the SHA1 checksum again, and verifies that the checksums match.
65 If anything does not match up, an error
66 message is printed and the transaction rolls back.
67 </p>
68
69 <p>
@@ -72,40 +72,19 @@
72 Hence bugs in fossil are unlikely to corrupt the repository in
73 a way that prevents us from extracting historical versions of
74 files.
75 </p>
76
77 <h2>Checksum Over All Files In A Baseline</h2>
78
79 <p>
80 Manifest files that define a baseline have two fields (the
81 R-line and Z-line) that record MD5 hashs of the manifest itself
82 and of all other files in the manifest. Prior to any check-in
83 commit, these checksums are verified to ensure that the baseline
84 checked in agrees exactly with what is on disk. Similarly,
85 the repository checksum is verified after a checkout to make
86 sure that the entire repository was checked out correctly.
87 Note that these added checks use a different hash (MD5 instead
88 of SHA1) in order to avoid common-mode failures in the hash
89 algorithm implementation.
90 </p>
91
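
Note on the self-check described in the new selfcheck.html text: the SHA1 of the original content is recorded when a file is encoded, and just before commit the content is re-extracted and hashed again. The Python sketch below illustrates that round-trip check only. It assumes `zlib` compression as a stand-in for fossil's real compression and delta encoding, and the names `encode` and `verify_before_commit` are hypothetical.

```python
import hashlib
import zlib


def encode(data: bytes):
    """Record the SHA1 of the original content alongside its encoded form.

    zlib is an assumption here; it stands in for the real encoding step.
    """
    return hashlib.sha1(data).hexdigest(), zlib.compress(data)


def verify_before_commit(records) -> None:
    """Re-extract every encoded item and compare its SHA1 to the record.

    Raises RuntimeError on the first mismatch, which is the point where
    the transaction would roll back instead of committing.
    """
    for expected, blob in records:
        actual = hashlib.sha1(zlib.decompress(blob)).hexdigest()
        if actual != expected:
            raise RuntimeError(f"self-check failed for {expected}")


records = [encode(b"first file\n"), encode(b"second file\n")]
verify_before_commit(records)  # passes silently when everything matches
```

The baseline-level R-line and Z-line checks in the text use MD5 instead, so that a fault in one hash implementation is less likely to hide an error in both checks.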
