Fossil SCM

Describe an enhancement to manifest artifacts that allows for an hierarchical description of the structure of a check-in. It is hoped that this new format will work more efficiently for large repositories, and make clone and pull from Git much easier and faster. This check-in is a documentation change only. The new hierarchical manifest type has not yet been implemented in code.

drh 2015-12-22 07:18 UTC trunk
Commit 7576a0f1b979662ea6ecec4ab3cd009a9bd5fbe2
1 file changed +204 -97
--- www/fileformat.wiki
+++ www/fileformat.wiki
@@ -9,11 +9,11 @@
searchable, and extensible by people not yet born.

The global state of a fossil repository is an unordered
set of <i>artifacts</i>.
An artifact might be a source code file, the text of a wiki page,
-part of a trouble ticket, or one of several special control artifacts
+part of a trouble ticket, or one of several special artifacts
used to show the relationships between other artifacts within the
project. Each artifact is normally represented on disk as a separate
file. Artifacts can be text or binary.

In addition to the global state,
@@ -33,26 +33,11 @@
No prefixes or meta information is added to an artifact before
its hash is computed. The name of an artifact in the repository
is exactly the same SHA1 hash that is computed by sha1sum
on the file as it exists in your source tree.</p>

-Some artifacts have a particular format which gives them special
-meaning to fossil. Fossil recognizes:
-
-<ul>
-<li> [#manifest | Manifests] </li>
-<li> [#cluster | Clusters] </li>
-<li> [#ctrl | Control Artifacts] </li>
-<li> [#wikichng | Wiki Pages] </li>
-<li> [#tktchng | Ticket Changes] </li>
-<li> [#attachment | Attachments] </li>
-<li> [#event | TechNotes] </li>
-</ul>
-
-These seven artifact types are described in the following sections.
-
-In the current implementation (as of 2009-01-25) the artifacts that
+In the current implementation (as of 2015-12-23) the artifacts that
make up a fossil repository are stored as delta- and zlib-compressed
blobs in an <a href="http://www.sqlite.org/">SQLite</a> database. This
is an implementation detail and might change in a future release. For
the purpose of this article "file format" means the format of the artifacts,
not how the artifacts are stored on disk. It is the artifact format that
@@ -60,42 +45,69 @@
disk, though stable, is not intended to live as long as the
artifact format.

All of the artifacts can be extracted from a Fossil repository using
the "fossil deconstruct" command.
+
+<h2>1.0 Special Artifacts</h2>
+
+Some artifacts have a particular format which gives them special
+meaning to fossil. Fossil recognizes:
+
+<ul>
+<li> [#manifest | Manifests] </li>
+<li> [#directory | Directories] </li>
+<li> [#cluster | Clusters] </li>
+<li> [#ctrl | Tags] </li>
+<li> [#wikichng | Wiki Pages] </li>
+<li> [#tktchng | Ticket Changes] </li>
+<li> [#attachment | Attachments] </li>
+<li> [#event | TechNotes] </li>
+</ul>
+
+Any artifact that is not one of the above eight special artifacts is a
+"content" artifact. Every distinct version of every file under
+management is a content artifact, as are attachments to wiki pages
+and tickets.
+
+Any artifact that follows the appropriate syntactic rules is a special
+artifact. It is possible for the same artifact to be used as both
+a special artifact and a content artifact, though this is rare and
+probably undesirable. (Future versions of Fossil might restrict attempts
+to check in special artifacts as content files.)
+To prevent accidental occurrences of the same artifact being used as both
+a special artifact and a content artifact, the syntactic rules for
+special artifacts are very strict.
+
+All special artifacts are pure UTF8 text. Newline characters
+(ASCII 0x0a) separate the artifact into "cards".
+Each card begins with a single
+character "card type". Zero or more arguments may follow
+the card type. All arguments are separated from each other
+and from the card-type character by a single space
+character (ASCII 0x20). There is no surplus white space between arguments
+and no leading or trailing whitespace except for the newline
+character that acts as the card separator.
+
+All cards of a special artifact occur in strict sorted lexicographical order.
+No card may be duplicated.
+Some special artifacts (example: [#manifest|manifests])
+may be PGP clear-signed, but otherwise special artifacts
+may contain no additional text or data.
+

<a name="manifest"></a>
-<h2>1.0 The Manifest</h2>
+<h2>1.1 The Manifest Artifact</h2>

A manifest defines a check-in or version of the project
source tree. The manifest contains a list of artifacts for
each file in the project and the corresponding filenames, as
well as information such as parent check-ins, the name of the
programmer who created the check-in, the date and time when
the check-in was created, and any check-in comments associated
with the check-in.

-Any artifact in the repository that follows the syntactic rules
-of a manifest is a manifest. Note that a manifest can
-be both a real manifest and also a content file, though this
-is rare.
-
-A manifest is a text file. Newline characters
-(ASCII 0x0a) separate the file into "cards".
-Each card begins with a single
-character "card type". Zero or more arguments may follow
-the card type. All arguments are separated from each other
-and from the card-type character by a single space
-character. There is no surplus white space between arguments
-and no leading or trailing whitespace except for the newline
-character that acts as the card separator.
-
-All cards of the manifest occur in strict sorted lexicographical order.
-No card may be duplicated.
-The entire manifest may be PGP clear-signed, but otherwise it
-may contain no additional text or data beyond what is described here.
-
Allowed cards in the manifest are as follows:

<blockquote>
<b>B</b> <i>baseline-manifest</i><br>
<b>C</b> <i>checkin-comment</i><br>
@@ -126,47 +138,73 @@
newline (ASCII 0x0a) is "\n" (ASCII 0x5C, 0x6E). A backslash
(ASCII 0x5C) is represented as two backslashes "\\". Apart from
space and newline, no other whitespace characters are allowed in
the check-in comment. Nor are any unprintable characters allowed
in the comment.
+
+A manifest has zero or one N-cards. The N-card specifies the mimetype for the
+text in the comment of the C-card. If the N-card is omitted, a default mimetype
+is used.

A manifest must have exactly one D-card. The sole argument to
the D-card is a date-time stamp in the ISO8601 format. The
date and time should be in coordinated universal time (UTC).
-The format one of:
+The format must be one of:

<blockquote>
<i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i><br>
<i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i><b>.</b><i>SSS</i>
</blockquote>

-A manifest has zero or more F-cards. Each F-card identifies a file
+A manifest has zero or more F-cards. Each F-card identifies a file or
+subdirectory
that is part of the check-in. There are one, two, three, or four
-arguments. The first argument is the pathname of the file in the
+arguments. The first argument is the pathname of the file or
+subdirectory in the
check-in relative to the root of the project file hierarchy. No ".."
or "." directories are allowed within the filename. Space characters
are escaped as in C-card comment text. Backslash characters and
newlines are not allowed within filenames. The directory separator
character is a forward slash (ASCII 0x2F). The second argument to the
F-card is the full 40-character lower-case hexadecimal SHA1 hash of
-the content artifact. The second argument is required for baseline
+the content artifact, or of the [#directory|directory artifact] if
+the "d" permission is present. The second argument is required for baseline
manifests but is optional for delta manifests. When the second
argument to the F-card is omitted, it means that the file has been
deleted relative to the baseline (files removed in baseline manifests
versions are <em>not</em> added as F-cards). The optional 3rd argument
defines any special access permissions associated with the file. This
can be defined as "x" to mean that the file is executable or "l"
-(small letter ell) to mean a symlink. All files are always readable
-and writable. This can be expressed by "w" permission if desired but
-is optional. The file format might be extended with new permission
+(small letter ell) to mean a symlink or "d" to mean the entry describes
+a subdirectory rather than a file. All files and subdirectories
+are always readable and writable. This can be expressed by "w"
+permission if desired but the "w" permission is optional and is ignored
+by Fossil. The file format might be extended with new permission
letters in the future. The optional 4th argument is the name of the
same file as it existed in the parent check-in. If the name of the
file is unchanged from its parent, then the 4th argument is omitted.

-A manifest has zero or one N-cards. The N-card specifies the mimetype for the
-text in the comment of the C-card. If the N-card is omitted, a default mimetype
-is used.
+Manifests may be either flat or hierarchical. A flat manifest lists
+all files in the check-in, including all files in subdirectories. A
+flat manifest may not include F-cards with the "d" permission. An
+hierarchical manifest only lists the files or subdirectories at the
+top-level of the check-in. An hierarchical manifest may not include
+any F-card entries that have a directory separator character ("/").
+An hierarchical manifest may not be a delta-manifest (it may not have
+a B-card) nor may it be used as a baseline-manifest by some other
+delta-manifest. Hierarchical manifests
+are only recognized by Fossil versions 1.35 and later. Repositories
+that contain hierarchical manifests will cause problems for earlier
+versions of Fossil.
+
+When an F-card refers to a subdirectory (that is to say, when the
+F-card is part of an hierarchical manifest and contains the "d"
+permission) then the referenced directory artifact must be a
+[#directory|well-formed directory artifact] that contains a
+G-card that exactly matches the name of the subdirectory as assigned
+by the F-card. If these conditions are not met, then the artifact is
+not a valid manifest.

A manifest has zero or one P-cards. Most manifests have one P-card.
The P-card has a varying number of arguments that
define other manifests from which the current manifest
is derived. Each argument is a 40-character lowercase
@@ -234,34 +272,48 @@
a sanity check to prove that the manifest is well-formed and
consistent.

A sample manifest from Fossil itself can be seen
[/artifact/28987096ac | here].
+
+<a name="directory"></a>
+<h3>1.2 Directory Artifacts</h3>
+
+A directory artifact describes the files and subdirectories within a
+single directory of an hierarchical manifest. Directory artifacts
+are only recognized by Fossil version 1.35 and later (circa 2015-12-23).
+
+Directory artifacts contain zero or more F-cards and exactly one Z-card,
+in the same format as a manifest. A directory artifact also contains
+exactly one G-card with a single argument that is the pathname
+of the directory relative to the root of the repository.
+The format of the directory name in a G-card is
+the same as the format of a filename in an F-card.
+
+The F-cards in a directory artifact may not contain directory separator
+characters. The content of subdirectories must be expressed using
+additional directory artifacts referenced by F-cards with the "d"
+permission. All F-cards in a directory artifact must contain at least
+two arguments.
+
+When an F-card X of directory artifact Y refers to
+subdirectory Z (that is to say, when F-card X contains
+the "d" permission and the second argument on X is the SHA1
+hash of directory artifact Z) then the G-card of Z must
+be the concatenation of the G-card on artifact Y, the
+directory separator character "/" and the first argument to
+the F-card X. Otherwise, the artifact Y is not a valid
+directory artifact.

<a name="cluster"></a>
-<h2>2.0 Clusters</h2>
+<h3>1.3 Cluster Artifacts</h3>

A cluster is an artifact that declares the existence of other artifacts.
Clusters are used during repository synchronization to help
reduce network traffic. As such, clusters are an optimization and
may be removed from a repository without loss or damage to the
-underlying project code.
-
-Clusters follow a syntax that is very similar to manifests.
-A Cluster is a line-oriented text file. Newline characters
-(ASCII 0x0a) separate the artifact into cards. Each card begins with a single
-character "card type". Zero or more arguments may follow
-the card type. All arguments are separated from each other
-and from the card-type character by a single space
-character. There is no surplus white space between arguments
-and no leading or trailing whitespace except for the newline
-character that acts as the card separator.
-All cards of a cluster occur in strict sorted lexicographical order.
-No card may be duplicated.
-The cluster may not contain additional text or data beyond
-what is described here.
-Unlike manifests, clusters are never PGP signed.
+underlying project code. Clusters may not be PGP clearsigned.

Allowed cards in the cluster are as follows:

<blockquote>
<b>M</b> <i>artifact-id</i><br />
@@ -277,35 +329,32 @@

An example cluster from Fossil can be seen
[/artifact/d03dbdd73a2a8 | here].

<a name="ctrl"></a>
-<h2>3.0 Control Artifacts</h2>
-
-Control artifacts are used to assign properties to other artifacts
-within the repository. The basic format of a control artifact is
-the same as a manifest or cluster. A control artifact is a text
-file divided into cards by newline characters. Each card has a
-single-character card type followed by arguments. Spaces separate
-the card type and the arguments. No surplus whitespace is allowed.
-All cards must occur in strict lexicographical order.
-
-Allowed cards in a control artifact are as follows:
+<h3>1.4 Tag Artifacts</h3>
+
+Tag artifacts are used to assign properties to other artifacts
+within the repository. Tag artifacts were called "control artifacts"
+in an earlier version of this document. Though their name has changed
+in the documentation, their function has not.
+
+Allowed cards in a tag artifact are as follows:

<blockquote>
<b>D</b> <i>time-and-date-stamp</i><br />
<b>T</b> (<b>+</b>|<b>-</b>|<b>*</b>)<i>tag-name</i> <i>artifact-id</i> ?<i>value</i>?<br />
<b>U</b> <i>user-name</i><br />
<b>Z</b> <i>checksum</i><br />
</blockquote>

-A control artifact must have one D card, one U card, one Z card and
+A tag artifact must have one D card, one U card, one Z card and
one or more T cards. No other cards or other text is
-allowed in a control artifact. Control artifacts might be PGP
+allowed in a tag artifact. Tag artifacts might be PGP
clearsigned.

-The D card and the Z card of a control artifact are the same
+The D card and the Z card of a tag artifact are the same
as in a manifest.

The T card represents a [./branching.wiki#tags | tag or property]
that is applied to
some other artifact. The T card has two or three values. The
@@ -333,20 +382,18 @@
belongs to. Symbolic tags begin with the "sym-" prefix.

The U card is the name of the user that created the control
artifact. The Z card is the usual required artifact checksum.

-An example control artifacts can be seen [/info/9d302ccda8 | here].
+An example tag artifact can be seen [/info/9d302ccda8 | here].


<a name="wikichng"></a>
-<h2>4.0 Wiki Pages</h2>
+<h3>1.5 Wiki Pages</h3>

-A wiki page is an artifact with a format similar to manifests,
-clusters, and control artifacts. The artifact is divided into
-cards by newline characters. The format of each card is as in
-manifests, clusters, and control artifacts. Wiki artifacts accept
+A wiki artifact defines a single version of a single wiki
+page. Wiki artifacts accept
the following card types:

<blockquote>
<b>D</b> <i>time-and-date-stamp</i><br />
<b>L</b> <i>wiki-title</i><br />
@@ -374,11 +421,11 @@

An example wiki artifact can be seen
[/artifact?name=7b2f5fd0e0&txt=1 | here].

<a name="tktchng"></a>
-<h2>5.0 Ticket Changes</h2>
+<h3>1.6 Ticket Changes</h3>

A ticket-change artifact represents a change to a trouble ticket.
The following cards are allowed on a ticket change artifact:

<blockquote>
@@ -420,11 +467,11 @@

An example ticket-change artifact can be seen
[/artifact/91f1ec6af053 | here].

<a name="attachment"></a>
-<h2>6.0 Attachments</h2>
+<h3>1.7 Attachments</h3>

An attachment artifact associates some other artifact that is the
attachment (the source artifact) with a ticket or wiki page or
technical note to which
the attachment is connected (the target artifact).
@@ -462,11 +509,11 @@
The Z card is the usual checksum over the rest of the attachment artifact.
The Z card is required.


<a name="event"></a>
-<h2>7.0 Technical Notes</h2>
+<h3>1.8 Technical Notes</h3>

A technical note or "technote" artifact (formerly known as an "event" artifact)
associates a timeline comment and a page of text
(similar to a wiki page) with a point in time. Technotes can be used
to record project milestones, release notes, blog entries, process
@@ -509,11 +556,11 @@
the other.

A technote might contain one or more T-cards used to set
[./branching.wiki#tags | tags or properties]
on the technote. The format of the T-card is the same as
-described in [#ctrl | Control Artifacts] section above, except that the
+described in the [#ctrl | Tag Artifacts] section above, except that the
second argument is the single character "<b>*</b>" instead of an
artifact ID and the name is always prefaced by "<b>+</b>".
The <b>*</b> in place of the artifact ID indicates that
the tag or property applies to the current artifact. It is not
possible to encode the current artifact ID as part of an artifact,
@@ -531,11 +578,11 @@

The Z card is the required checksum over the rest of the artifact.


<a name="summary"></a>
-<h2>8.0 Card Summary</h2>
+<h2>2.0 Card Summary</h2>

The following table summarizes the various kinds of cards that appear
on Fossil artifacts. A blank entry means that combination of card and
artifact is not legal. A number or range of numbers indicates the number
of times a card may (or must) appear in the corresponding artifact type.
@@ -543,23 +590,25 @@
or more such cards are required.

<table border=1 width="100%">
<tr>
<th rowspan=2 valign=bottom>Card Format</th>
-<th colspan=7>Used By</th>
+<th colspan=8>Used By</th>
</tr>
<tr>
<th>Manifest</th>
+<th>Directory</th>
<th>Cluster</th>
-<th>Control</th>
+<th>Tag</th>
<th>Wiki</th>
<th>Ticket</th>
<th>Attachment</th>
<th>Technote</th>
</tr>
<tr>
<td><b>A</b> <i>filename</i> <i>target</i> ?<i>source</i>?</td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
@@ -573,15 +622,18 @@
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
+<td>&nbsp;</td>
</tr>
-<tr><td>&nbsp;</td><td colspan='7'>* = Required for delta manifests</td></tr>
+<tr><td>&nbsp;</td><td colspan='8'>* = Required for delta manifests,
+Disallowed for hierarchical manifests.</td></tr>
<tr>
<td><b>C</b> <i>comment-text</i></td>
<td align=center><b>1</b></td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>0-1</b></td>
@@ -588,10 +640,11 @@
<td align=center><b>0-1</b></td>
</tr>
<tr>
<td><b>D</b> <i>date-time-stamp</i></td>
<td align=center><b>1</b></td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
@@ -602,25 +655,39 @@
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1</b></td>
</tr>
<tr>
<td><b>F</b> <i>filename</i> ?<i>uuid</i>? ?<i>permissions</i>? ?<i>oldname</i>?</td>
+<td align=center><b>0+</b></td>
<td align=center><b>0+</b></td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
+<tr>
+<td><b>G</b> <i>filename</i></td>
+<td>&nbsp;</td>
+<td align=center><b>1</b></td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+</tr>
<tr>
<td><b>J</b> <i>name</i> ?<i>value</i>?</td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1+</b></td>
@@ -630,17 +697,19 @@
<tr>
<td><b>K</b> <i>ticket-uuid</i></td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1</b></td>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<td><b>L</b> <i>wiki-title</i></td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1</b></td>
<td>&nbsp;</td>
@@ -647,10 +716,11 @@
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr>
<td><b>M</b> <i>uuid</i></td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1+</b></td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
@@ -659,19 +729,21 @@
</tr>
<tr>
<td><b>N</b> <i>mimetype</i></td>
<td align=center><b>0-1</b></td>
<td>&nbsp;</td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>0-1</b></td>
<td>&nbsp;</td>
<td align=center><b>0-1</b></td>
<td align=center><b>0-1</b></td>
</tr>
<tr>
<td><b>P</b> <i>uuid ...</i></td>
<td align=center><b>0-1</b></td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>0-1</b></td>
<td>&nbsp;</td>
<td>&nbsp;</td>
@@ -684,10 +756,11 @@
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
+<td>&nbsp;</td>
</tr>
<tr>
<td><b>R</b> <i>md5sum</i></td>
<td align=center><b>0-1</b></td>
<td>&nbsp;</td>
@@ -694,13 +767,15 @@
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
+<td>&nbsp;</td>
<tr>
<td><b>T</b> (<b>+</b>|<b>*</b>|<b>-</b>)<i>tagname</i> <i>uuid</i> ?<i>value</i>?</td>
<td align=center><b>0+</b></td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1+</b></td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
@@ -707,19 +782,21 @@
<td align=center><b>0+</b></td>
</tr>
<tr>
<td><b>U</b> <i>username</i></td>
<td align=center><b>1</b></td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
<td align=center><b>0-1</b></td>
<td align=center><b>0-1</b></td>
</tr>
<tr>
<td><b>W</b> <i>size</i></td>
+<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
<td align=center><b>1</b></td>
<td>&nbsp;</td>
@@ -733,21 +810,22 @@
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
<td align=center><b>1</b></td>
+<td align=center><b>1</b></td>
</tr>
</table>


<a name="addenda"></a>
-<h2>9.0 Addenda</h2>
+<h2>3.0 Addenda</h2>

-This section contains additional information which may be useful when
-implementing algorithms described above.
+This section contains additional information about the low-level artifact
+formats of Fossil.

-<h3>R Card Hash Calculation</h3>
+<h3>3.1 R-Card Hash Calculation</h3>

Given a manifest file named <tt>MF</tt>, the following Bash shell code
demonstrates how to compute the value of the R card in that manifest.
This example uses manifest [28987096ac]. Lines starting with <tt>#</tt> are
shell input and other lines are output. This demonstration assumes that the
@@ -781,5 +859,34 @@
<tt>stat</tt> calls will fail to find such files (which are output in encoded
form here). That approach also won't work for delta manifests. Calculating
the R-card for delta manifests requires traversing both the delta and its baseline in
lexical order of the files, preferring the delta's copy if both contain
a given file.
+
+<h3>3.2 Different Kinds Of Manifest Artifacts</h3>
+
+The original (1.0) version of Fossil only supported flat baseline
+manifests. That means that all the files of a check-in had to be
+listed in every manifest. Because manifests are delta-encoded, there
+is not a storage space issue. Fossil was originally designed
+specifically to support the SQLite project, and as SQLite has fewer
+than 2000 files in any given version, a flat baseline manifest design
+worked well there and was simple to implement.
+
+However, some projects (ex: NetBSD) contain a huge number of files in
+every version, and even though the manifests compressed well using
+delta-compression, many CPU cycles had to be spent to decompress those
+manifests. To help make Fossil more efficient for large projects like
+NetBSD, the concept of a delta-manifest was added. This helped a lot
+but was not a perfect solution.
+
+Later, the concept of an hierarchical manifest was added. By breaking
+up each manifest into many separate subdirectories it is hoped that
+the processing of projects with many files can be better optimized.
+The hierarchical manifest design also more closely resembles the low-level
+file format used by Git, thus making pull and clone from Git repositories
+easier.
+
+In retrospect, it would have been better if Fossil had only
+hierarchical manifests. But as there are many legacy repositories
+that use flat manifests and delta manifests, all three forms must
+be supported moving forward.
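
The card syntax and the hierarchical-manifest rules described in this check-in can be sketched in Python. This is an illustration only, not Fossil's implementation (the check-in comment notes that the feature has not yet been implemented in code). The function names, the sample artifacts, the placeholder hashes, and the rule that a manifest counts as hierarchical when any F-card carries the "d" permission are all assumptions made for the example. PGP clear-signing and Z-card checksum verification are left out.

```python
"""Sketch of the card syntax and hierarchical-manifest rules (not Fossil code)."""

FAKE_HASH = "a" * 40  # placeholder, not the SHA1 of any real artifact

# Hypothetical top-level manifest; the Z-card value is a dummy, not a real checksum.
MANIFEST = (
    "C demo\n"
    "D 2015-12-22T07:18:00\n"
    f"F README {FAKE_HASH}\n"
    f"F src {FAKE_HASH} d\n"
    "U alice\n"
    f"Z {'0' * 32}\n"
)

# Hypothetical directory artifact describing the "src" subdirectory.
DIRECTORY = f"F main.c {FAKE_HASH}\nG src\nZ {'0' * 32}\n"


def parse_cards(text):
    """Split a special artifact into (card_type, args) pairs.

    Assumes every card, including the last, ends with a newline.
    """
    if not text.endswith("\n"):
        raise ValueError("last card is not terminated by a newline")
    cards = []
    previous = None
    for line in text[:-1].split("\n"):
        fields = line.split(" ")
        # An empty field means a leading, trailing, or doubled space.
        if len(fields[0]) != 1 or "" in fields:
            raise ValueError(f"malformed card: {line!r}")
        # Strictly ascending order also rules out duplicate cards.
        if previous is not None and line <= previous:
            raise ValueError(f"card out of order or duplicated: {line!r}")
        previous = line
        cards.append((fields[0], fields[1:]))
    return cards


def manifest_kind(cards):
    """Return "hierarchical" or "flat" and enforce the hierarchical rules.

    Assumption: any F-card with the "d" permission makes the manifest
    hierarchical.
    """
    f_cards = [args for kind, args in cards if kind == "F" and args]
    if not any(len(a) >= 3 and "d" in a[2] for a in f_cards):
        return "flat"
    if any(kind == "B" for kind, _ in cards):
        raise ValueError("hierarchical manifest may not have a B-card")
    if any("/" in a[0] for a in f_cards):
        raise ValueError("hierarchical F-card names may not contain '/'")
    return "hierarchical"


def expected_g_card(parent_path, name):
    """Path that the referenced directory artifact's G-card must carry.

    parent_path is None when the F-card sits in the top-level manifest,
    otherwise it is the G-card argument of the enclosing directory artifact.
    """
    return name if parent_path is None else parent_path + "/" + name


def check_subdirectory(parent_path, f_name, child_cards):
    """Verify a child directory artifact's G-card against its referencing F-card."""
    g_args = [args for kind, args in child_cards if kind == "G"]
    if len(g_args) != 1 or len(g_args[0]) != 1:
        raise ValueError("directory artifact needs exactly one one-argument G-card")
    if g_args[0][0] != expected_g_card(parent_path, f_name):
        raise ValueError("G-card does not match the referencing F-card")


if __name__ == "__main__":
    print(manifest_kind(parse_cards(MANIFEST)))
    check_subdirectory(None, "src", parse_cards(DIRECTORY))
```

Comparing whole card lines keeps the ordering check to a single comparison. A real implementation would also need to verify the Z-card and resolve each "d" F-card hash to the stored directory artifact before calling a check like `check_subdirectory`.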
--- www/fileformat.wiki
+++ www/fileformat.wiki
@@ -9,11 +9,11 @@
9 searchable, and extensible by people not yet born.
10
11 The global state of a fossil repository is an unordered
12 set of <i>artifacts</i>.
13 An artifact might be a source code file, the text of a wiki page,
14 part of a trouble ticket, or one of several special control artifacts
15 used to show the relationships between other artifacts within the
16 project. Each artifact is normally represented on disk as a separate
17 file. Artifacts can be text or binary.
18
19 In addition to the global state,
@@ -33,26 +33,11 @@
33 No prefixes or meta information is added to an artifact before
34 its hash is computed. The name of an artifact in the repository
35 is exactly the same SHA1 hash that is computed by sha1sum
36 on the file as it exists in your source tree.</p>
37
38 Some artifacts have a particular format which gives them special
39 meaning to fossil. Fossil recognizes:
40
41 <ul>
42 <li> [#manifest | Manifests] </li>
43 <li> [#cluster | Clusters] </li>
44 <li> [#ctrl | Control Artifacts] </li>
45 <li> [#wikichng | Wiki Pages] </li>
46 <li> [#tktchng | Ticket Changes] </li>
47 <li> [#attachment | Attachments] </li>
48 <li> [#event | TechNotes] </li>
49 </ul>
50
51 These seven artifact types are described in the following sections.
52
53 In the current implementation (as of 2009-01-25) the artifacts that
54 make up a fossil repository are stored as delta- and zlib-compressed
55 blobs in an <a href="http://www.sqlite.org/">SQLite</a> database. This
56 is an implementation detail and might change in a future release. For
57 the purpose of this article "file format" means the format of the artifacts,
58 not how the artifacts are stored on disk. It is the artifact format that
@@ -60,42 +45,69 @@
60 disk, though stable, is not intended to live as long as the
61 artifact format.
62
63 All of the artifacts can be extracted from a Fossil repository using
64 the "fossil deconstruct" command.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
65
66 <a name="manifest"></a>
67 <h2>1.0 The Manifest</h2>
68
69 A manifest defines a check-in or version of the project
70 source tree. The manifest contains a list of artifacts for
71 each file in the project and the corresponding filenames, as
72 well as information such as parent check-ins, the name of the
73 programmer who created the check-in, the date and time when
74 the check-in was created, and any check-in comments associated
75 with the check-in.
76
77 Any artifact in the repository that follows the syntactic rules
78 of a manifest is a manifest. Note that a manifest can
79 be both a real manifest and also a content file, though this
80 is rare.
81
82 A manifest is a text file. Newline characters
83 (ASCII 0x0a) separate the file into "cards".
84 Each card begins with a single
85 character "card type". Zero or more arguments may follow
86 the card type. All arguments are separated from each other
87 and from the card-type character by a single space
88 character. There is no surplus white space between arguments
89 and no leading or trailing whitespace except for the newline
90 character that acts as the card separator.
91
92 All cards of the manifest occur in strict sorted lexicographical order.
93 No card may be duplicated.
94 The entire manifest may be PGP clear-signed, but otherwise it
95 may contain no additional text or data beyond what is described here.
96
97 Allowed cards in the manifest are as follows:
98
99 <blockquote>
100 <b>B</b> <i>baseline-manifest</i><br>
101 <b>C</b> <i>checkin-comment</i><br>
@@ -126,47 +138,73 @@
126 newline (ASCII 0x0a) is "\n" (ASCII 0x5C, 0x6E). A backslash
127 (ASCII 0x5C) is represented as two backslashes "\\". Apart from
128 space and newline, no other whitespace characters are allowed in
129 the check-in comment. Nor are any unprintable characters allowed
130 in the comment.
131
132 A manifest must have exactly one D-card. The sole argument to
133 the D-card is a date-time stamp in the ISO8601 format. The
134 date and time should be in coordinated universal time (UTC).
135 The format one of:
136
137 <blockquote>
138 <i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i><br>
139 <i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i><b>.</b><i>SSS</i>
140 </blockquote>
141
142 A manifest has zero or more F-cards. Each F-card identifies a file
 
143 that is part of the check-in. There are one, two, three, or four
144 arguments. The first argument is the pathname of the file in the
 
145 check-in relative to the root of the project file hierarchy. No ".."
146 or "." directories are allowed within the filename. Space characters
147 are escaped as in C-card comment text. Backslash characters and
148 newlines are not allowed within filenames. The directory separator
149 character is a forward slash (ASCII 0x2F). The second argument to the
150 F-card is the full 40-character lower-case hexadecimal SHA1 hash of
151 the content artifact. The second argument is required for baseline
 
152 manifests but is optional for delta manifests. When the second
153 argument to the F-card is omitted, it means that the file has been
154 deleted relative to the baseline (files removed in baseline manifests
155 versions are <em>not</em> added as F-cards). The optional 3rd argument
156 defines any special access permissions associated with the file. This
157 can be defined as "x" to mean that the file is executable or "l"
158 (small letter ell) to mean a symlink. All files are always readable
159 and writable. This can be expressed by "w" permission if desired but
160 is optional. The file format might be extended with new permission
 
 
161 letters in the future. The optional 4th argument is the name of the
162 same file as it existed in the parent check-in. If the name of the
163 file is unchanged from its parent, then the 4th argument is omitted.
164
165 A manifest has zero or one N-cards. The N-card specifies the mimetype for the
166 text in the comment of the C-card. If the N-card is omitted, a default mimetype
167 is used.
168
169 A manifest has zero or one P-cards. Most manifests have one P-card.
170 The P-card has a varying number of arguments that
171 defines other manifests from which the current manifest
172 is derived. Each argument is a 40-character lowercase
@@ -234,34 +272,48 @@
234 a sanity check to prove that the manifest is well-formed and
235 consistent.
236
237 A sample manifest from Fossil itself can be seen
238 [/artifact/28987096ac | here].
239
240 <a name="cluster"></a>
241 <h2>2.0 Clusters</h2>
242
243 A cluster is an artifact that declares the existence of other artifacts.
244 Clusters are used during repository synchronization to help
245 reduce network traffic. As such, clusters are an optimization and
246 may be removed from a repository without loss or damage to the
247 underlying project code.
248
249 Clusters follow a syntax that is very similar to manifests.
250 A Cluster is a line-oriented text file. Newline characters
251 (ASCII 0x0a) separate the artifact into cards. Each card begins with a single
252 character "card type". Zero or more arguments may follow
253 the card type. All arguments are separated from each other
254 and from the card-type character by a single space
255 character. There is no surplus white space between arguments
256 and no leading or trailing whitespace except for the newline
257 character that acts as the card separator.
258 All cards of a cluster occur in strict sorted lexicographical order.
259 No card may be duplicated.
260 The cluster may not contain additional text or data beyond
261 what is described here.
262 Unlike manifests, clusters are never PGP signed.
263
264 Allowed cards in the cluster are as follows:
265
266 <blockquote>
267 <b>M</b> <i>artifact-id</i><br />
@@ -277,35 +329,32 @@
277
278 An example cluster from Fossil can be seen
279 [/artifact/d03dbdd73a2a8 | here].
280
281 <a name="ctrl"></a>
282 <h2>3.0 Control Artifacts</h2>
283
284 Control artifacts are used to assign properties to other artifacts
285 within the repository. The basic format of a control artifact is
286 the same as a manifest or cluster. A control artifact is a text
287 file divided into cards by newline characters. Each card has a
288 single-character card type followed by arguments. Spaces separate
289 the card type and the arguments. No surplus whitespace is allowed.
290 All cards must occur in strict lexicographical order.
291
292 Allowed cards in a control artifact are as follows:
293
294 <blockquote>
295 <b>D</b> <i>time-and-date-stamp</i><br />
296 <b>T</b> (<b>+</b>|<b>-</b>|<b>*</b>)<i>tag-name</i> <i>artifact-id</i> ?<i>value</i>?<br />
297 <b>U</b> <i>user-name</i><br />
298 <b>Z</b> <i>checksum</i><br />
299 </blockquote>
300
301 A control artifact must have one D card, one U card, one Z card and
302 one or more T cards. No other cards or other text is
303 allowed in a control artifact. Control artifacts might be PGP
304 clearsigned.
305
306 The D card and the Z card of a control artifact are the same
307 as in a manifest.
308
309 The T card represents a [./branching.wiki#tags | tag or property]
310 that is applied to
311 some other artifact. The T card has two or three values. The
@@ -333,20 +382,18 @@
333 belongs to. Symbolic tags begin with the "sym-" prefix.
334
335 The U card is the name of the user that created the control
336 artifact. The Z card is the usual required artifact checksum.
337
338 An example control artifact can be seen [/info/9d302ccda8 | here].
339
340
341 <a name="wikichng"></a>
342 <h2>4.0 Wiki Pages</h2>
343
344 A wiki page is an artifact with a format similar to manifests,
345 clusters, and control artifacts. The artifact is divided into
346 cards by newline characters. The format of each card is as in
347 manifests, clusters, and control artifacts. Wiki artifacts accept
348 the following card types:
349
350 <blockquote>
351 <b>D</b> <i>time-and-date-stamp</i><br />
352 <b>L</b> <i>wiki-title</i><br />
@@ -374,11 +421,11 @@
374
375 An example wiki artifact can be seen
376 [/artifact?name=7b2f5fd0e0&txt=1 | here].
377
378 <a name="tktchng"></a>
379 <h2>5.0 Ticket Changes</h2>
380
381 A ticket-change artifact represents a change to a trouble ticket.
382 The following cards are allowed on a ticket change artifact:
383
384 <blockquote>
@@ -420,11 +467,11 @@
420
421 An example ticket-change artifact can be seen
422 [/artifact/91f1ec6af053 | here].
423
424 <a name="attachment"></a>
425 <h2>6.0 Attachments</h2>
426
427 An attachment artifact associates some other artifact that is the
428 attachment (the source artifact) with a ticket or wiki page or
429 technical note to which
430 the attachment is connected (the target artifact).
@@ -462,11 +509,11 @@
462 The Z card is the usual checksum over the rest of the attachment artifact.
463 The Z card is required.
464
465
466 <a name="event"></a>
467 <h2>7.0 Technical Notes</h2>
468
469 A technical note or "technote" artifact (formerly known as an "event" artifact)
470 associates a timeline comment and a page of text
471 (similar to a wiki page) with a point in time. Technotes can be used
472 to record project milestones, release notes, blog entries, process
@@ -509,11 +556,11 @@
509 the other.
510
511 A technote might contain one or more T-cards used to set
512 [./branching.wiki#tags | tags or properties]
513 on the technote. The format of the T-card is the same as
514 described in [#ctrl | Control Artifacts] section above, except that the
515 second argument is the single character "<b>*</b>" instead of an
516 artifact ID and the name is always prefaced by "<b>+</b>".
517 The <b>*</b> in place of the artifact ID indicates that
518 the tag or property applies to the current artifact. It is not
519 possible to encode the current artifact ID as part of an artifact,
@@ -531,11 +578,11 @@
531
532 The Z card is the required checksum over the rest of the artifact.
533
534
535 <a name="summary"></a>
536 <h2>8.0 Card Summary</h2>
537
538 The following table summarizes the various kinds of cards that appear
539 on Fossil artifacts. A blank entry means that combination of card and
540 artifact is not legal. A number or range of numbers indicates the number
541 of times a card may (or must) appear in the corresponding artifact type.
@@ -543,23 +590,25 @@
543 or more such cards are required.
544
545 <table border=1 width="100%">
546 <tr>
547 <th rowspan=2 valign=bottom>Card Format</th>
548 <th colspan=7>Used By</th>
549 </tr>
550 <tr>
551 <th>Manifest</th>
 
552 <th>Cluster</th>
553 <th>Control</th>
554 <th>Wiki</th>
555 <th>Ticket</th>
556 <th>Attachment</th>
557 <th>Technote</th>
558 </tr>
559 <tr>
560 <td><b>A</b> <i>filename</i> <i>target</i> ?<i>source</i>?</td>
 
561 <td>&nbsp;</td>
562 <td>&nbsp;</td>
563 <td>&nbsp;</td>
564 <td>&nbsp;</td>
565 <td>&nbsp;</td>
@@ -573,15 +622,18 @@
573 <td>&nbsp;</td>
574 <td>&nbsp;</td>
575 <td>&nbsp;</td>
576 <td>&nbsp;</td>
577 <td>&nbsp;</td>
 
578 </tr>
579 <tr><td>&nbsp;</td><td colspan='7'>* = Required for delta manifests</td></tr>
 
580 <tr>
581 <td><b>C</b> <i>comment-text</i></td>
582 <td align=center><b>1</b></td>
 
583 <td>&nbsp;</td>
584 <td>&nbsp;</td>
585 <td>&nbsp;</td>
586 <td>&nbsp;</td>
587 <td align=center><b>0-1</b></td>
@@ -588,10 +640,11 @@
588 <td align=center><b>0-1</b></td>
589 </tr>
590 <tr>
591 <td><b>D</b> <i>date-time-stamp</i></td>
592 <td align=center><b>1</b></td>
 
593 <td>&nbsp;</td>
594 <td align=center><b>1</b></td>
595 <td align=center><b>1</b></td>
596 <td align=center><b>1</b></td>
597 <td align=center><b>1</b></td>
@@ -602,25 +655,39 @@
602 <td>&nbsp;</td>
603 <td>&nbsp;</td>
604 <td>&nbsp;</td>
605 <td>&nbsp;</td>
606 <td>&nbsp;</td>
 
607 <td>&nbsp;</td>
608 <td align=center><b>1</b></td>
609 </tr>
610 <tr>
611 <td><b>F</b> <i>filename</i> ?<i>uuid</i>? ?<i>permissions</i>? ?<i>oldname</i>?</td>
 
612 <td align=center><b>0+</b></td>
613 <td>&nbsp;</td>
614 <td>&nbsp;</td>
615 <td>&nbsp;</td>
616 <td>&nbsp;</td>
617 <td>&nbsp;</td>
618 <td>&nbsp;</td>
619 </tr>
620 <tr>
621 <td><b>J</b> <i>name</i> ?<i>value</i>?</td>
 
622 <td>&nbsp;</td>
623 <td>&nbsp;</td>
624 <td>&nbsp;</td>
625 <td>&nbsp;</td>
626 <td align=center><b>1+</b></td>
@@ -630,17 +697,19 @@
630 <tr>
631 <td><b>K</b> <i>ticket-uuid</i></td>
632 <td>&nbsp;</td>
633 <td>&nbsp;</td>
634 <td>&nbsp;</td>
 
635 <td>&nbsp;</td>
636 <td align=center><b>1</b></td>
637 <td>&nbsp;</td>
638 <td>&nbsp;</td>
639 </tr>
640 <tr>
641 <td><b>L</b> <i>wiki-title</i></td>
 
642 <td>&nbsp;</td>
643 <td>&nbsp;</td>
644 <td>&nbsp;</td>
645 <td align=center><b>1</b></td>
646 <td>&nbsp;</td>
@@ -647,10 +716,11 @@
647 <td>&nbsp;</td>
648 <td>&nbsp;</td>
649 </tr>
650 <tr>
651 <td><b>M</b> <i>uuid</i></td>
 
652 <td>&nbsp;</td>
653 <td align=center><b>1+</b></td>
654 <td>&nbsp;</td>
655 <td>&nbsp;</td>
656 <td>&nbsp;</td>
@@ -659,19 +729,21 @@
659 </tr>
660 <tr>
661 <td><b>N</b> <i>mimetype</i></td>
662 <td align=center><b>0-1</b></td>
663 <td>&nbsp;</td>
 
664 <td>&nbsp;</td>
665 <td align=center><b>0-1</b></td>
666 <td>&nbsp;</td>
667 <td align=center><b>0-1</b></td>
668 <td align=center><b>0-1</b></td>
669 </tr>
670 <tr>
671 <td><b>P</b> <i>uuid ...</i></td>
672 <td align=center><b>0-1</b></td>
 
673 <td>&nbsp;</td>
674 <td>&nbsp;</td>
675 <td align=center><b>0-1</b></td>
676 <td>&nbsp;</td>
677 <td>&nbsp;</td>
@@ -684,10 +756,11 @@
684 <td>&nbsp;</td>
685 <td>&nbsp;</td>
686 <td>&nbsp;</td>
687 <td>&nbsp;</td>
688 <td>&nbsp;</td>
 
689 </tr>
690 <tr>
691 <td><b>R</b> <i>md5sum</i></td>
692 <td align=center><b>0-1</b></td>
693 <td>&nbsp;</td>
@@ -694,13 +767,15 @@
694 <td>&nbsp;</td>
695 <td>&nbsp;</td>
696 <td>&nbsp;</td>
697 <td>&nbsp;</td>
698 <td>&nbsp;</td>
 
699 <tr>
700 <td><b>T</b> (<b>+</b>|<b>*</b>|<b>-</b>)<i>tagname</i> <i>uuid</i> ?<i>value</i>?</td>
701 <td align=center><b>0+</b></td>
 
702 <td>&nbsp;</td>
703 <td align=center><b>1+</b></td>
704 <td>&nbsp;</td>
705 <td>&nbsp;</td>
706 <td>&nbsp;</td>
@@ -707,19 +782,21 @@
707 <td align=center><b>0+</b></td>
708 </tr>
709 <tr>
710 <td><b>U</b> <i>username</i></td>
711 <td align=center><b>1</b></td>
 
712 <td>&nbsp;</td>
713 <td align=center><b>1</b></td>
714 <td align=center><b>1</b></td>
715 <td align=center><b>1</b></td>
716 <td align=center><b>0-1</b></td>
717 <td align=center><b>0-1</b></td>
718 </tr>
719 <tr>
720 <td><b>W</b> <i>size</i></td>
 
721 <td>&nbsp;</td>
722 <td>&nbsp;</td>
723 <td>&nbsp;</td>
724 <td align=center><b>1</b></td>
725 <td>&nbsp;</td>
@@ -733,21 +810,22 @@
733 <td align=center><b>1</b></td>
734 <td align=center><b>1</b></td>
735 <td align=center><b>1</b></td>
736 <td align=center><b>1</b></td>
737 <td align=center><b>1</b></td>
 
738 </tr>
739 </table>
740
741
742 <a name="addenda"></a>
743 <h2>9.0 Addenda</h2>
744
745 This section contains additional information which may be useful when
746 implementing algorithms described above.
747
748 <h3>R Card Hash Calculation</h3>
749
750 Given a manifest file named <tt>MF</tt>, the following Bash shell code
751 demonstrates how to compute the value of the R card in that manifest.
752 This example uses manifest [28987096ac]. Lines starting with <tt>#</tt> are
753 shell input and other lines are output. This demonstration assumes that the
@@ -781,5 +859,34 @@
781 <tt>stat</tt> calls will fail to find such files (which are output in encoded
782 form here). That approach also won't work for delta manifests. Calculating
783 the R-card for delta manifests requires traversing both the delta and its baseline in
784 lexical order of the files, preferring the delta's copy if both contain
785 a given file.
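
The delta-manifest traversal described above can be sketched in Python. This is only an illustration under stated assumptions: the function name <tt>effective_files</tt> and the dictionary representation of F-cards are hypothetical, and the sketch stops at producing the merged file list in sorted order. It does not compute the R-card value itself.

```python
def effective_files(baseline, delta):
    """Merge F-card maps (filename -> hash) of a baseline and a delta manifest.

    A delta entry whose hash is None stands for an F-card with no second
    argument, meaning the file was deleted relative to the baseline.
    Returns (filename, hash) pairs sorted by filename.
    """
    merged = dict(baseline)
    for name, artifact_hash in delta.items():
        if artifact_hash is None:
            merged.pop(name, None)  # deleted relative to the baseline
        else:
            merged[name] = artifact_hash  # the delta's copy wins
    return sorted(merged.items())
```

Python's default string ordering is used here as a stand-in for "lexical order"; whether that matches Fossil's own comparison for non-ASCII filenames is an assumption.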
786
--- www/fileformat.wiki
+++ www/fileformat.wiki
@@ -9,11 +9,11 @@
9 searchable, and extensible by people not yet born.
10
11 The global state of a fossil repository is an unordered
12 set of <i>artifacts</i>.
13 An artifact might be a source code file, the text of a wiki page,
14 part of a trouble ticket, or one of several special artifacts
15 used to show the relationships between other artifacts within the
16 project. Each artifact is normally represented on disk as a separate
17 file. Artifacts can be text or binary.
18
19 In addition to the global state,
@@ -33,26 +33,11 @@
33 No prefixes or meta information is added to an artifact before
34 its hash is computed. The name of an artifact in the repository
35 is exactly the same SHA1 hash that is computed by sha1sum
36 on the file as it exists in your source tree.</p>
37
38 In the current implementation (as of 2015-12-23) the artifacts that
39 make up a fossil repository are stored as delta- and zlib-compressed
40 blobs in an <a href="http://www.sqlite.org/">SQLite</a> database. This
41 is an implementation detail and might change in a future release. For
42 the purpose of this article "file format" means the format of the artifacts,
43 not how the artifacts are stored on disk. It is the artifact format that
@@ -60,42 +45,69 @@
45 disk, though stable, is not intended to live as long as the
46 artifact format.
47
48 All of the artifacts can be extracted from a Fossil repository using
49 the "fossil deconstruct" command.
50
51 <h2>1.0 Special Artifacts</h2>
52
53 Some artifacts have a particular format which gives them special
54 meaning to fossil. Fossil recognizes:
55
56 <ul>
57 <li> [#manifest | Manifests] </li>
58 <li> [#directory | Directories] </li>
59 <li> [#cluster | Clusters] </li>
60 <li> [#ctrl | Tags] </li>
61 <li> [#wikichng | Wiki Pages] </li>
62 <li> [#tktchng | Ticket Changes] </li>
63 <li> [#attachment | Attachments] </li>
64 <li> [#event | TechNotes] </li>
65 </ul>
66
67 Any artifact that is not one of the above eight special artifacts is a
68 "content" artifact. Every distinct version of every file under
69 management is a content artifact, as are attachments to wiki pages
70 and tickets.
71
72 Any artifact that follows the appropriate syntactic rules is a special
73 artifact. It is possible for the same artifact to be used as both
74 a special artifact and a content artifact, though this is rare and
75 probably undesirable. (Future versions of Fossil might restrict attempts
76 to check-in special artifacts as content files.)
77 To prevent accidental occurrences of the same artifact being used as both
78 a special artifact and a content artifact, the syntactic rules for
79 special artifacts are very strict.
80
81 All special artifacts are pure UTF8 text. Newline characters
82 (ASCII 0x0a) separate the artifact into "cards".
83 Each card begins with a single
84 character "card type". Zero or more arguments may follow
85 the card type. All arguments are separated from each other
86 and from the card-type character by a single space
87 character (ASCII 0x20). There is no surplus white space between arguments
88 and no leading or trailing whitespace except for the newline
89 character that acts as the card separator.
90
91 All cards of a special artifact occur in strict sorted lexicographical order.
92 No card may be duplicated.
93 Some special artifacts (example: [#manifest|manifests])
94 may be PGP clear-signed, but otherwise special artifacts
95 may contain no additional text or data.
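
The card syntax above can be illustrated with a small Python sketch. It is not Fossil's parser. The function name <tt>parse_cards</tt> is hypothetical, the input is assumed to have no PGP clear-signature wrapper, and "strict sorted lexicographical order" is interpreted as a comparison of whole card lines, which is an assumption.

```python
def parse_cards(text):
    """Split a special artifact into a list of (card_type, args) tuples.

    Assumptions (not from the specification): the input has no PGP
    clear-signature wrapper, every card ends with a newline, and ordering
    is checked by comparing whole card lines as strings.
    """
    if not text.endswith("\n"):
        raise ValueError("every card must end with a newline")
    lines = text[:-1].split("\n")
    cards = []
    previous = None
    for line in lines:
        fields = line.split(" ")
        # An empty field means a leading, trailing, or doubled space.
        if "" in fields or len(fields[0]) != 1:
            raise ValueError(f"malformed card: {line!r}")
        # Strict ordering also rules out duplicate cards.
        if previous is not None and line <= previous:
            raise ValueError(f"card out of order or duplicated: {line!r}")
        cards.append((fields[0], fields[1:]))
        previous = line
    return cards
```

For example, the two-card text "D 2015-12-22T07:18:00" followed by "U drh" parses into a D card and a U card, while the same cards in the opposite order are rejected. Artifacts that carry free-form text would need extra handling that this sketch omits.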
96
97
98 <a name="manifest"></a>
99 <h2>1.1 The Manifest Artifact</h2>
100
101 A manifest defines a check-in or version of the project
102 source tree. The manifest contains a list of artifacts for
103 each file in the project and the corresponding filenames, as
104 well as information such as parent check-ins, the name of the
105 programmer who created the check-in, the date and time when
106 the check-in was created, and any check-in comments associated
107 with the check-in.
108
109 Allowed cards in the manifest are as follows:
110
111 <blockquote>
112 <b>B</b> <i>baseline-manifest</i><br>
113 <b>C</b> <i>checkin-comment</i><br>
@@ -126,47 +138,73 @@
138 newline (ASCII 0x0a) is "\n" (ASCII 0x5C, 0x6E). A backslash
139 (ASCII 0x5C) is represented as two backslashes "\\". Apart from
140 space and newline, no other whitespace characters are allowed in
141 the check-in comment. Nor are any unprintable characters allowed
142 in the comment.
143
144 A manifest has zero or one N-cards. The N-card specifies the mimetype for the
145 text in the comment of the C-card. If the N-card is omitted, a default mimetype
146 is used.
147
148 A manifest must have exactly one D-card. The sole argument to
149 the D-card is a date-time stamp in the ISO8601 format. The
150 date and time should be in coordinated universal time (UTC).
151 The format must be one of:
152
153 <blockquote>
154 <i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i><br>
155 <i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i><b>.</b><i>SSS</i>
156 </blockquote>
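
As a rough illustration, the two permitted D-card layouts can be checked with a regular expression before conversion. The helper name <tt>parse_d_card_stamp</tt> is hypothetical, and returning a naive <tt>datetime</tt> that the caller treats as UTC is a choice made for this sketch, not something the format prescribes.

```python
import re
from datetime import datetime

# Exactly the two layouts listed above: whole seconds, or seconds
# followed by a three-digit fraction.
_STAMP = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]{3})?")


def parse_d_card_stamp(stamp):
    """Return a naive datetime (to be read as UTC) for a D-card argument."""
    if not _STAMP.fullmatch(stamp):
        raise ValueError(f"not a valid D-card date-time stamp: {stamp!r}")
    layout = "%Y-%m-%dT%H:%M:%S.%f" if "." in stamp else "%Y-%m-%dT%H:%M:%S"
    # strptime also rejects impossible values such as month 13.
    return datetime.strptime(stamp, layout)
```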
157
158 A manifest has zero or more F-cards. Each F-card identifies a file or
159 subdirectory
160 that is part of the check-in. There are one, two, three, or four
161 arguments. The first argument is the pathname of the file or
162 subdirectory in the
163 check-in relative to the root of the project file hierarchy. No ".."
164 or "." directories are allowed within the filename. Space characters
165 are escaped as in C-card comment text. Backslash characters and
166 newlines are not allowed within filenames. The directory separator
167 character is a forward slash (ASCII 0x2F). The second argument to the
168 F-card is the full 40-character lower-case hexadecimal SHA1 hash of
169 the content artifact, or of the [#directory|directory artifact] if
170 the "d" permission is present. The second argument is required for baseline
171 manifests but is optional for delta manifests. When the second
172 argument to the F-card is omitted, it means that the file has been
173 deleted relative to the baseline (files removed in baseline manifests
174 versions are <em>not</em> added as F-cards). The optional 3rd argument
175 defines any special access permissions associated with the file. This
176 can be defined as "x" to mean that the file is executable or "l"
177 (small letter ell) to mean a symlink or "d" to mean the entry describes
178 a subdirectory rather than a file. All files and subdirectories
179 are always readable and writable. This can be expressed by "w"
180 permission if desired but the "w" permission is optional and is ignored
181 by Fossil. The file format might be extended with new permission
182 letters in the future. The optional 4th argument is the name of the
183 same file as it existed in the parent check-in. If the name of the
184 file is unchanged from its parent, then the 4th argument is omitted.
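
The F-card rules above can be sketched as a small Python parser. The names <tt>FCard</tt>, <tt>decode_filename</tt>, and <tt>parse_f_card</tt> are hypothetical, and the sketch assumes that a space in a filename is written as "\s", as in C-card comment text. It checks syntax only and does not verify that the hash names an existing artifact.

```python
import re
from typing import NamedTuple, Optional


class FCard(NamedTuple):
    name: str
    artifact_hash: Optional[str]  # None: deleted relative to the baseline
    permissions: str
    old_name: Optional[str]


def decode_filename(token):
    """Undo the backslash-s space escape and reject "." and ".." components."""
    if re.search(r"\\(?!s)", token):
        raise ValueError("backslash is only allowed as part of the space escape")
    name = token.replace("\\s", " ")
    if any(part in (".", "..") for part in name.split("/")):
        raise ValueError(f"'.' and '..' are not allowed: {name!r}")
    return name


def parse_f_card(line):
    """Parse one F-card line (without its trailing newline)."""
    fields = line.split(" ")
    if fields[0] != "F" or not 2 <= len(fields) <= 5:
        raise ValueError(f"not an F-card: {line!r}")
    args = fields[1:]
    artifact_hash = args[1] if len(args) > 1 else None
    if artifact_hash is not None and not re.fullmatch(r"[0-9a-f]{40}", artifact_hash):
        raise ValueError("hash must be 40 lower-case hexadecimal digits")
    return FCard(
        name=decode_filename(args[0]),
        artifact_hash=artifact_hash,
        permissions=args[2] if len(args) > 2 else "",
        old_name=decode_filename(args[3]) if len(args) > 3 else None,
    )
```

An F-card with only a filename yields <tt>artifact_hash</tt> of <tt>None</tt>, which corresponds to a deletion in a delta manifest.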
185
186 Manifests may be either flat or hierarchical. A flat manifest lists
187 all files in the check-in, including all files in subdirectories. A
188 flat manifest may not include F-cards with the "d" permission. An
189 hierarchical manifest only lists the files or subdirectories at the
190 top-level of the check-in. An hierarchical manifest may not include
191 F-card entries that contain a directory separator character ("/").
192 An hierarchical manifest may not be a delta-manifest (it may not have
193 a B-card) nor may it be used as a baseline-manifest by some other
194 delta-manifest. Hierarchical manifests
195 are only recognized by Fossil versions 1.35 and later. Repositories
196 that contain hierarchical manifests will cause problems for earlier
197 versions of Fossil.
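
The distinction between flat and hierarchical manifests can be expressed as a short check. The sketch below is only an illustration: the function name <tt>manifest_kind</tt> and the (name, permissions) pair representation are hypothetical. A manifest with neither "d" entries nor directory separators satisfies both descriptions, and the sketch reports it as flat, which is an assumption.

```python
def manifest_kind(f_cards, has_b_card=False):
    """Classify a manifest as "flat" or "hierarchical".

    f_cards is a sequence of (name, permissions) pairs taken from the
    manifest's F-cards; has_b_card tells whether a B-card is present.
    """
    f_cards = list(f_cards)
    has_dir_entry = any("d" in perms for _, perms in f_cards)
    has_separator = any("/" in name for name, _ in f_cards)
    if not has_dir_entry:
        # No "d" permission anywhere: treated as flat (assumption for
        # manifests that list only top-level files).
        return "flat"
    if has_separator:
        raise ValueError("hierarchical manifest with '/' in an F-card name")
    if has_b_card:
        raise ValueError("hierarchical manifest may not be a delta manifest")
    return "hierarchical"
```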
198
199 When an F-card refers to a subdirectory (that is to say, when the
200 F-card is part of an hierarchical manifest and contains the "d"
201 permission) then the referenced directory artifact must be a
202 [#directory|well-formed directory artifact] that contains a
203 G-card that exactly matches the name of the subdirectory as assigned
204 by the F-card. If these conditions are not met, then the artifact is
205 not a valid manifest.
206
207 A manifest has zero or one P-cards. Most manifests have one P-card.
208 The P-card has a varying number of arguments that
209 defines other manifests from which the current manifest
210 is derived. Each argument is a 40-character lowercase
@@ -234,34 +272,48 @@
272 a sanity check to prove that the manifest is well-formed and
273 consistent.
274
275 A sample manifest from Fossil itself can be seen
276 [/artifact/28987096ac | here].
277
278 <a name="directory"></a>
279 <h3>1.2 Directory Artifacts</h3>
280
281 A directory artifact describes the files and subdirectories within a
282 single directory of an hierarchical manifest. Directory artifacts
283 are only recognized by Fossil version 1.35 and later (circa 2015-12-23).
284
285 Directory artifacts contain zero or more F-cards and exactly one Z-card,
286 in the same format as a manifest. A directory artifact also contains
287 exactly one G-card with a single argument that is the pathname
288 of the directory relative to the root of the repository.
289 The format of the directory name in a G-card is
290 the same as the format of a filename in an F-card.
291
292 The F-cards in a directory artifact may not contain directory separator
293 characters. The content of subdirectories must be expressed using
294 additional directory artifacts referenced by F-cards with the "d"
295 permission. All F-cards in a directory artifact must contain at least
296 two arguments.
297
298 When an F-card X of directory artifact Y refers to
299 subdirectory Z (that is to say, when F-card X contains
300 the "d" permission and the second argument on X is the SHA1
301 hash of directory artifact Z) then the G-card of Z must
302 be the concatenation of the G-card on artifact Y, the
303 directory separator character "/" and the first argument to
304 the F-card X. Otherwise, the artifact Y is not a valid
305 directory artifact.
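
The G-card consistency rule can be sketched as a recursive check over directory artifacts. Everything about the representation here is hypothetical: <tt>artifacts</tt> is an in-memory map from artifact hash to a dict holding the G-card path and the F-cards, and the short hash strings in the usage note are placeholders, not real SHA1 values.

```python
def check_directory_tree(artifacts, dir_hash):
    """Return True if every "d" F-card under dir_hash names a consistent child.

    artifacts maps an artifact hash to {"G": path, "F": [(name, hash, perms)]}.
    """
    node = artifacts[dir_hash]
    for name, child_hash, permissions in node["F"]:
        if "/" in name:
            return False  # F-cards in a directory artifact hold bare names
        if "d" not in permissions:
            continue  # ordinary file, nothing further to check here
        child = artifacts.get(child_hash)
        # The child's G-card must be parent G-card + "/" + F-card name.
        if child is None or child["G"] != node["G"] + "/" + name:
            return False
        if not check_directory_tree(artifacts, child_hash):
            return False
    return True
```

With a directory "src" whose F-card "util" carries the "d" permission, the check passes only if the referenced artifact has the G-card "src/util".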
306
307 <a name="cluster"></a>
308 <h3>1.3 Clusters Artifacts</h3>
309
310 A cluster is an artifact that declares the existence of other artifacts.
311 Clusters are used during repository synchronization to help
312 reduce network traffic. As such, clusters are an optimization and
313 may be removed from a repository without loss or damage to the
314 underlying project code. Clusters may not be PGP clearsigned.
315
316 Allowed cards in the cluster are as follows:
317
318 <blockquote>
319 <b>M</b> <i>artifact-id</i><br />
@@ -277,35 +329,32 @@
329
330 An example cluster from Fossil can be seen
331 [/artifact/d03dbdd73a2a8 | here].
332
333 <a name="ctrl"></a>
334 <h3>1.4 Tag Artifacts</h3>
335
336 Tag artifacts are used to assign properties to other artifacts
337 within the repository. Tag artifacts were called "control artifacts"
338 in an earlier version of this document. Though their name has changed
339 in the documentation, their function has not.
340
341 Allowed cards in a tag artifact are as follows:
342
343 <blockquote>
344 <b>D</b> <i>time-and-date-stamp</i><br />
345 <b>T</b> (<b>+</b>|<b>-</b>|<b>*</b>)<i>tag-name</i> <i>artifact-id</i> ?<i>value</i>?<br />
346 <b>U</b> <i>user-name</i><br />
347 <b>Z</b> <i>checksum</i><br />
348 </blockquote>
349
350 A tag artifact must have one D card, one U card, one Z card and
351 one or more T cards. No other cards or other text is
352 allowed in a tag artifact. Tag artifacts might be PGP
353 clearsigned.
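
The card-count rule for tag artifacts can be written down as a short check. This is a sketch only: the function name <tt>is_well_formed_tag_artifact</tt> is hypothetical, and it looks only at how often each card type occurs, not at card contents or ordering.

```python
from collections import Counter


def is_well_formed_tag_artifact(card_types):
    """Check the card counts of a tag artifact.

    card_types is a sequence of single-character card types, one per card,
    for example ["D", "T", "T", "U", "Z"].
    """
    counts = Counter(card_types)
    return (
        counts["D"] == 1
        and counts["U"] == 1
        and counts["Z"] == 1
        and counts["T"] >= 1
        and set(counts) <= set("DTUZ")  # no other card types allowed
    )
```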
354
355 The D card and the Z card of a tag artifact are the same
356 as in a manifest.
357
358 The T card represents a [./branching.wiki#tags | tag or property]
359 that is applied to
360 some other artifact. The T card has two or three values. The
@@ -333,20 +382,18 @@
382 belongs to. Symbolic tags begin with the "sym-" prefix.
383
384 The U card is the name of the user that created the control
385 artifact. The Z card is the usual required artifact checksum.
386
387 An example tag artifact can be seen [/info/9d302ccda8 | here].
388
389
390 <a name="wikichng"></a>
391 <h3>1.5 Wiki Pages</h3>
392
393 A wiki artifact defines a single version of a single wiki
394 page. Wiki artifacts accept
 
 
395 the following card types:
396
397 <blockquote>
398 <b>D</b> <i>time-and-date-stamp</i><br />
399 <b>L</b> <i>wiki-title</i><br />
@@ -374,11 +421,11 @@
421
422 An example wiki artifact can be seen
423 [/artifact?name=7b2f5fd0e0&txt=1 | here].
424
425 <a name="tktchng"></a>
426 <h3>1.6 Ticket Changes</h3>
427
428 A ticket-change artifact represents a change to a trouble ticket.
429 The following cards are allowed on a ticket change artifact:
430
431 <blockquote>
@@ -420,11 +467,11 @@
467
468 An example ticket-change artifact can be seen
469 [/artifact/91f1ec6af053 | here].
470
471 <a name="attachment"></a>
472 <h3>1.7 Attachments</h3>
473
474 An attachment artifact associates some other artifact that is the
475 attachment (the source artifact) with a ticket or wiki page or
476 technical note to which
477 the attachment is connected (the target artifact).
@@ -462,11 +509,11 @@
509 The Z card is the usual checksum over the rest of the attachment artifact.
510 The Z card is required.
511
512
513 <a name="event"></a>
514 <h3>1.8 Technical Notes</h3>
515
516 A technical note or "technote" artifact (formerly known as an "event" artifact)
517 associates a timeline comment and a page of text
518 (similar to a wiki page) with a point in time. Technotes can be used
519 to record project milestones, release notes, blog entries, process
@@ -509,11 +556,11 @@
556 the other.
557
558 A technote might contain one or more T-cards used to set
559 [./branching.wiki#tags | tags or properties]
560 on the technote. The format of the T-card is the same as
561 described in [#ctrl | Tag Artifacts] section above, except that the
562 second argument is the single character "<b>*</b>" instead of an
563 artifact ID and the name is always prefaced by "<b>+</b>".
564 The <b>*</b> in place of the artifact ID indicates that
565 the tag or property applies to the current artifact. It is not
566 possible to encode the current artifact ID as part of an artifact,
@@ -531,11 +578,11 @@
578
579 The Z card is the required checksum over the rest of the artifact.
580
581
582 <a name="summary"></a>
583 <h2>2.0 Card Summary</h2>
584
585 The following table summarizes the various kinds of cards that appear
586 on Fossil artifacts. A blank entry means that combination of card and
587 artifact is not legal. A number or range of numbers indicates the number
588 of times a card may (or must) appear in the corresponding artifact type.
@@ -543,23 +590,25 @@
590 or more such cards are required.
591
592 <table border=1 width="100%">
593 <tr>
594 <th rowspan=2 valign=bottom>Card Format</th>
595 <th colspan=8>Used By</th>
596 </tr>
597 <tr>
598 <th>Manifest</th>
599 <th>Directory</th>
600 <th>Cluster</th>
601 <th>Tag</th>
602 <th>Wiki</th>
603 <th>Ticket</th>
604 <th>Attachment</th>
605 <th>Technote</th>
606 </tr>
607 <tr>
608 <td><b>A</b> <i>filename</i> <i>target</i> ?<i>source</i>?</td>
609 <td>&nbsp;</td>
610 <td>&nbsp;</td>
611 <td>&nbsp;</td>
612 <td>&nbsp;</td>
613 <td>&nbsp;</td>
614 <td>&nbsp;</td>
@@ -573,15 +622,18 @@
622 <td>&nbsp;</td>
623 <td>&nbsp;</td>
624 <td>&nbsp;</td>
625 <td>&nbsp;</td>
626 <td>&nbsp;</td>
627 <td>&nbsp;</td>
628 </tr>
629 <tr><td>&nbsp;</td><td colspan='8'>* = Required for delta manifests;
630 disallowed for hierarchical manifests.</td></tr>
631 <tr>
632 <td><b>C</b> <i>comment-text</i></td>
633 <td align=center><b>1</b></td>
634 <td>&nbsp;</td>
635 <td>&nbsp;</td>
636 <td>&nbsp;</td>
637 <td>&nbsp;</td>
638 <td>&nbsp;</td>
639 <td align=center><b>0-1</b></td>
@@ -588,10 +640,11 @@
640 <td align=center><b>0-1</b></td>
641 </tr>
642 <tr>
643 <td><b>D</b> <i>date-time-stamp</i></td>
644 <td align=center><b>1</b></td>
645 <td>&nbsp;</td>
646 <td>&nbsp;</td>
647 <td align=center><b>1</b></td>
648 <td align=center><b>1</b></td>
649 <td align=center><b>1</b></td>
650 <td align=center><b>1</b></td>
@@ -602,25 +655,39 @@
655 <td>&nbsp;</td>
656 <td>&nbsp;</td>
657 <td>&nbsp;</td>
658 <td>&nbsp;</td>
659 <td>&nbsp;</td>
660 <td>&nbsp;</td>
661 <td>&nbsp;</td>
662 <td align=center><b>1</b></td>
663 </tr>
664 <tr>
665 <td><b>F</b> <i>filename</i> ?<i>uuid</i>? ?<i>permissions</i>? ?<i>oldname</i>?</td>
666 <td align=center><b>0+</b></td>
667 <td align=center><b>0+</b></td>
668 <td>&nbsp;</td>
669 <td>&nbsp;</td>
670 <td>&nbsp;</td>
671 <td>&nbsp;</td>
672 <td>&nbsp;</td>
673 <td>&nbsp;</td>
674 </tr>
675 <tr>
676 <td><b>G</b> <i>filename</i></td>
677 <td>&nbsp;</td>
678 <td align=center><b>1</b></td>
679 <td>&nbsp;</td>
680 <td>&nbsp;</td>
681 <td>&nbsp;</td>
682 <td>&nbsp;</td>
683 <td>&nbsp;</td>
684 <td>&nbsp;</td>
685 </tr>
686 <tr>
687 <td><b>J</b> <i>name</i> ?<i>value</i>?</td>
688 <td>&nbsp;</td>
689 <td>&nbsp;</td>
690 <td>&nbsp;</td>
691 <td>&nbsp;</td>
692 <td>&nbsp;</td>
693 <td align=center><b>1+</b></td>
@@ -630,17 +697,19 @@
697 <tr>
698 <td><b>K</b> <i>ticket-uuid</i></td>
699 <td>&nbsp;</td>
700 <td>&nbsp;</td>
701 <td>&nbsp;</td>
702 <td>&nbsp;</td>
703 <td>&nbsp;</td>
704 <td align=center><b>1</b></td>
705 <td>&nbsp;</td>
706 <td>&nbsp;</td>
707 </tr>
708 <tr>
709 <td><b>L</b> <i>wiki-title</i></td>
710 <td>&nbsp;</td>
711 <td>&nbsp;</td>
712 <td>&nbsp;</td>
713 <td>&nbsp;</td>
714 <td align=center><b>1</b></td>
715 <td>&nbsp;</td>
@@ -647,10 +716,11 @@
716 <td>&nbsp;</td>
717 <td>&nbsp;</td>
718 </tr>
719 <tr>
720 <td><b>M</b> <i>uuid</i></td>
721 <td>&nbsp;</td>
722 <td>&nbsp;</td>
723 <td align=center><b>1+</b></td>
724 <td>&nbsp;</td>
725 <td>&nbsp;</td>
726 <td>&nbsp;</td>
@@ -659,19 +729,21 @@
729 </tr>
730 <tr>
731 <td><b>N</b> <i>mimetype</i></td>
732 <td align=center><b>0-1</b></td>
733 <td>&nbsp;</td>
734 <td>&nbsp;</td>
735 <td>&nbsp;</td>
736 <td align=center><b>0-1</b></td>
737 <td>&nbsp;</td>
738 <td align=center><b>0-1</b></td>
739 <td align=center><b>0-1</b></td>
740 </tr>
741 <tr>
742 <td><b>P</b> <i>uuid ...</i></td>
743 <td align=center><b>0-1</b></td>
744 <td>&nbsp;</td>
745 <td>&nbsp;</td>
746 <td>&nbsp;</td>
747 <td align=center><b>0-1</b></td>
748 <td>&nbsp;</td>
749 <td>&nbsp;</td>
@@ -684,10 +756,11 @@
756 <td>&nbsp;</td>
757 <td>&nbsp;</td>
758 <td>&nbsp;</td>
759 <td>&nbsp;</td>
760 <td>&nbsp;</td>
761 <td>&nbsp;</td>
762 </tr>
763 <tr>
764 <td><b>R</b> <i>md5sum</i></td>
765 <td align=center><b>0-1</b></td>
766 <td>&nbsp;</td>
@@ -694,13 +767,15 @@
767 <td>&nbsp;</td>
768 <td>&nbsp;</td>
769 <td>&nbsp;</td>
770 <td>&nbsp;</td>
771 <td>&nbsp;</td>
772 <td>&nbsp;</td>
773 <tr>
774 <td><b>T</b> (<b>+</b>|<b>*</b>|<b>-</b>)<i>tagname</i> <i>uuid</i> ?<i>value</i>?</td>
775 <td align=center><b>0+</b></td>
776 <td>&nbsp;</td>
777 <td>&nbsp;</td>
778 <td align=center><b>1+</b></td>
779 <td>&nbsp;</td>
780 <td>&nbsp;</td>
781 <td>&nbsp;</td>
@@ -707,19 +782,21 @@
782 <td align=center><b>0+</b></td>
783 </tr>
784 <tr>
785 <td><b>U</b> <i>username</i></td>
786 <td align=center><b>1</b></td>
787 <td>&nbsp;</td>
788 <td>&nbsp;</td>
789 <td align=center><b>1</b></td>
790 <td align=center><b>1</b></td>
791 <td align=center><b>1</b></td>
792 <td align=center><b>0-1</b></td>
793 <td align=center><b>0-1</b></td>
794 </tr>
795 <tr>
796 <td><b>W</b> <i>size</i></td>
797 <td>&nbsp;</td>
798 <td>&nbsp;</td>
799 <td>&nbsp;</td>
800 <td>&nbsp;</td>
801 <td align=center><b>1</b></td>
802 <td>&nbsp;</td>
@@ -733,21 +810,22 @@
810 <td align=center><b>1</b></td>
811 <td align=center><b>1</b></td>
812 <td align=center><b>1</b></td>
813 <td align=center><b>1</b></td>
814 <td align=center><b>1</b></td>
815 <td align=center><b>1</b></td>
816 </tr>
817 </table>
818
819
820 <a name="addenda"></a>
821 <h2>3.0 Addenda</h2>
822
823 This section contains additional information about the low-level artifact
824 formats of Fossil.
825
826 <h3>3.1 R-Card Hash Calculation</h3>
827
828 Given a manifest file named <tt>MF</tt>, the following Bash shell code
829 demonstrates how to compute the value of the R card in that manifest.
830 This example uses manifest [28987096ac]. Lines starting with <tt>#</tt> are
831 shell input and other lines are output. This demonstration assumes that the
@@ -781,5 +859,34 @@
859 <tt>stat</tt> calls will fail to find such files (which are output in encoded
860 form here). That approach also won't work for delta manifests. Calculating
861 the R-card for delta manifests requires traversing both the delta and its baseline in
862 lexical order of the files, preferring the delta's copy if both contain
863 a given file.
864
865 <h3>3.2 Different Kinds Of Manifest Artifacts</h3>
866
867 The original (1.0) version of Fossil only supported flat baseline
868 manifests. That means that all the files of a check-in had to be
869 listed in every manifest. Because manifests are delta-encoded, this
870 does not waste storage space. Fossil was originally designed
871 specifically to support the SQLite project, and as SQLite has fewer
872 than 2000 files in any given version, a flat baseline manifest design
873 worked well there and was simple to implement.
874
875 However, some projects (e.g., NetBSD) contain a huge number of files in
876 every version, and even though the manifests compressed well using
877 delta-compression, many CPU cycles had to be spent to decompress those
878 manifests. To help make Fossil more efficient for large projects like
879 NetBSD, the concept of a delta-manifest was added. This helped a lot
880 but was not a perfect solution.
881
882 Later, the concept of an hierarchical manifest was added. By breaking
883 up each manifest into many separate subdirectories it is hoped that
884 the processing of projects with many files can be better optimized.
885 The hierarchical manifest design also more closely resembles the low-level
886 file format used by Git, thus making pull and clone from Git repositories
887 easier.
888
889 In retrospect, it would have been better if Fossil had only
890 hierarchical manifests. But as there are many legacy repositories
891 that use flat manifests and delta manifests, all three forms must
892 be supported moving forward.
893
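The traversal described at the end of section 3.1 (walk the delta and its baseline in lexical order of filenames, preferring the delta's copy) can be sketched roughly as follows. This is only an illustration, not Fossil's actual code: the function name `effective_files` and the dictionary representation of F-cards are hypothetical, and the sketch assumes that an F-card without a hash in a delta manifest marks a file removed relative to the baseline.

```python
def effective_files(baseline, delta):
    """Merge the F-card entries of a baseline and a delta manifest.

    Both arguments map filename -> artifact hash (hypothetical
    representation). In `delta`, a value of None stands for an F-card
    with no hash, assumed here to mean the file was removed.
    """
    merged = dict(baseline)
    merged.update(delta)  # the delta's copy wins when both list a file
    return [
        (name, merged[name])
        for name in sorted(merged)       # lexical order of the filenames
        if merged[name] is not None      # skip files the delta removed
    ]


if __name__ == "__main__":
    baseline = {"a.c": "h1", "b.c": "h2", "z.txt": "h3"}
    delta = {"b.c": "h9", "m.c": "h4", "z.txt": None}
    for name, artifact_hash in effective_files(baseline, delta):
        print(name, artifact_hash)
```

The resulting list is the file set over which an R-card checksum for a delta manifest would then be computed. Python's `sorted` compares strings by code point, which matches byte-wise ordering for UTF-8 names; other encodings are not handled in this sketch.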
