| | @@ -1,95 +1,179 @@ |
| 1 | 1 | <html> |
| 2 | 2 | <head> |
| 3 | | -<title>Fossil File Formats</title> |
| 3 | +<title>Fossil File Format</title> |
| 4 | 4 | </head> |
| 5 | 5 | <body bgcolor="white"> |
| 6 | 6 | <h1 align="center"> |
| 7 | 7 | Fossil File Formats |
| 8 | 8 | </h1> |
| 9 | 9 | |
| 10 | 10 | <p> |
| 11 | 11 | The global state of a fossil repository is determined by an unordered |
| 12 | | -set of content files. Each of these files has a format which is defined |
| 13 | | -by this document. |
| 14 | | -</p> |
| 15 | | - |
| 16 | | -<h2>1.0 General Formatting Rules</h2> |
| 17 | | - |
| 18 | | -<p> |
| 19 | | -Fossil content files consist of a header, a blank line, and optional |
| 20 | | -content. |
| 21 | | -</p> |
| 22 | | - |
| 23 | | -<p> |
| 24 | | -The header is divided into "properties" by newline ('\n', 0x0a) |
| 25 | | -characters. Each header property is divided into tokens by space (' ', 0x20) |
| 26 | | -characters. The first token of each property is the property name. |
| 27 | | -Subsequent tokens (if any) are arguments to the property. |
| 28 | | -</p> |
| 29 | | - |
| 30 | | -<p> |
| 31 | | -The blank line that separates the header from the content can be |
| 32 | | -thought of as a property line that contains no tokens. Everything |
| 33 | | -that follows the newline character that terminates the blank line |
| 34 | | -is content. The blank line is always present but the content is |
| 35 | | -optional. |
| 36 | | -</p> |
| 37 | | - |
| 38 | | -<p> |
| 39 | | -All tokens in a property line are encoded to escape special characters. |
| 40 | | -The encoding is as follows: |
| 41 | | -</p> |
| 42 | | - |
| 43 | | -<blockquote> |
| 44 | | -<table border="1"> |
| 45 | | -<tr><th>Input Character</th><th>Encoded As</th></tr> |
| 46 | | -<tr><td align="center"> space (0x20) </td><td align="center"> \s </td></tr> |
| 47 | | -<tr><td align="center"> newline (0x0A) </td><td align="center"> \n </td></tr> |
| 48 | | -<tr><td align="center"> carriage return (0x0D) </td><td align="center"> \r </td></tr> |
| 49 | | -<tr><td align="center"> tab (0x09) </td><td align="center"> \t </td></tr> |
| 50 | | -<tr><td align="center"> vertical tab (0x0B) </td><td align="center"> \v </td></tr> |
| 51 | | -<tr><td align="center"> formfeed (0x0C) </td><td align="center"> \f </td></tr> |
| 52 | | -<tr><td align="center"> nul (0x00) </td><td align="center"> \0 </td></tr> |
| 53 | | -<tr><td align="center"> backslash (0x5C) </td><td align="center"> \\ </td></tr> |
| 54 | | -</table> |
| 55 | | -</blockquote> |
| 56 | | - |
| 57 | | -<p> |
| 58 | | -Characters other than the ones shown in the table above are passed through |
| 59 | | -the encoder without change. |
| 60 | | -</p> |
| 61 | | - |
| 62 | | -<p> |
| 63 | | -All properties names are unpunctuated lower-case ASCII strings. |
| 64 | | -The properties appear in the header in sorted order (using |
| 65 | | -memcpy() as the comparision function) except for the "signature" |
| 66 | | -property which always occurs first. |
| 67 | | -</p> |
| 68 | | - |
| 69 | | -<h2>2.0 Common Properties</h2> |
| 70 | | - |
| 71 | | -<p> |
| 72 | | -Every content file has a "time" property. The argument to the |
| 73 | | -time property is an integer which is the number of seconds since |
| 74 | | -1970 UTC when the content file was created. For example: |
| 75 | | -</p> |
| 76 | | - |
| 77 | | -<blockquote> |
| 78 | | -time 1181404746 |
| 79 | | -</blockquote> |
| 80 | | - |
| 81 | | -<p> |
| 82 | | -Every content file has a "type" property. The argument to the |
| 83 | | -type property defines the purpose of the content file. The |
| 84 | | -argument can be strings like "version", "folder", "file", or "user". |
| 85 | | -</p> |
| 86 | | - |
| 87 | | -<p> |
| 88 | | -The first property of a content file is the digital signature. The |
| 89 | | -name of the signature property is "signature". There are two arguments. |
| 90 | | -The first argument is the SHA256 hash of the content file that defines |
| 91 | | -the user who signed this file. User records themselves are self-signed |
| 92 | | -and so the first argument is simply "*" for user records. The second |
| 93 | | -argument is the digital signature of an SHA256 hash of the entire |
| 94 | | -file (header and content) except for the signature line itself. |
| 95 | | -</p> |
| 12 | +set of files. Some files, such as those that represent wiki pages, |
| 13 | +trouble tickets, and the special "manifest" file, have a specific and |
| 14 | +well-defined format. Other files simply hold the content of project |
| 15 | +files. Files can be text or binary. |
| 16 | +</p> |
| 17 | + |
| 18 | +<p> |
| 19 | +Each file in the repository is named by its SHA1 hash. |
| 20 | +Some files have a particular format which qualifies them |
| 21 | +as "manifests". A manifest assigns filenames to a subset |
| 22 | +of the files in the repository, in order to provide a |
| 23 | +snapshot of the state of the project at a point in time. |
| 24 | +Each manifest file corresponds to a version or baseline |
| 25 | +of the project. |
| 26 | +</p> |
| 27 | + |
| 28 | +<h2>1.0 The Manifest File</h2> |
| 29 | + |
| 30 | +<p> |
| 31 | +Any file in the repository that follows the syntactic rules |
| 32 | +of a manifest is a manifest. Note that a single file can |
| 33 | +be both a manifest and an ordinary content file, though this |
| 34 | +is rare. |
| 35 | +</p> |
| 36 | + |
| 37 | +<p> |
| 38 | +A manifest is a line-oriented text file. Newline characters |
| 39 | +(ASCII 0x0a) separate lines. Each line begins with a single |
| 40 | +character "line type". Zero or more arguments may follow |
| 41 | +the line type. All arguments are separated from each other |
| 42 | +and from the line-type character by a single space |
| 43 | +character. There is no surplus white space between arguments |
| 44 | +and no leading or trailing whitespace except for the newline |
| 45 | +character that acts as the line separator. |
| 46 | +</p> |
| 47 | + |
| 48 | +<p> |
| 49 | +All lines of the manifest occur in strict sorted lexicographical order. |
| 50 | +No line may be duplicated. |
| 51 | +The entire manifest file may be PGP clear-signed, but otherwise it |
| 52 | +may contain no additional text or data beyond what is described here. |
| 53 | +</p> |
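<p>
The ordering rule above can be checked mechanically. The following is a
minimal sketch, not Fossil's own code: the function name
<code>lines_strictly_sorted</code> is hypothetical, and it assumes that
"lexicographical" means byte-wise comparison and that a PGP clear-signature
wrapper, if any, has already been removed.
</p>

```python
def lines_strictly_sorted(manifest: bytes) -> bool:
    """Return True if every line sorts strictly after the one before it.

    Strict ordering also rules out duplicate lines, since two equal
    lines can never satisfy a < b.
    """
    lines = manifest.split(b"\n")
    # Assumption: a trailing newline leaves one empty element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    # Python compares bytes objects byte by byte, like memcmp().
    return all(a < b for a, b in zip(lines, lines[1:]))
```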
| 54 | + |
| 55 | +<p> |
| 56 | +Allowed lines in the manifest are as follows: |
| 57 | +</p> |
| 58 | + |
| 59 | +<blockquote> |
| 60 | +<b>C</b> <i>checkin-comment</i><br> |
| 61 | +<b>D</b> <i>time-and-date-stamp</i><br> |
| 62 | +<b>F</b> <i>filename</i> <i>SHA1-hash</i><br> |
| 63 | +<b>P</b> <i>SHA1-hash</i>+<br> |
| 64 | +<b>R</b> <i>repository-checksum</i><br> |
| 65 | +<b>U</b> <i>user-login</i><br> |
| 66 | +<b>Z</b> <i>manifest-checksum</i> |
| 67 | +</blockquote> |
| 68 | + |
| 69 | +<p> |
| 70 | +A manifest must have exactly one C-line. The sole argument to |
| 71 | +the C-line is a check-in comment that describes the baseline that |
| 72 | +the manifest defines. The check-in comment is text. The following |
| 73 | +escape sequences are applied to the text: |
| 74 | +A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73). A |
| 75 | +newline (ASCII 0x0a) is "\n" (ASCII 0x5C, 0x6E). A backslash |
| 76 | +(ASCII 0x5C) is represented as two backslashes "\\". Apart from |
| 77 | +space and newline, no other whitespace characters are allowed in |
| 78 | +the check-in comment. Nor are any unprintable characters allowed |
| 79 | +in the comment. |
| 80 | +</p> |
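<p>
As an illustration of the three escape sequences, here is a hedged sketch
of an encoder and decoder. The names <code>escape_comment</code> and
<code>unescape_comment</code> are hypothetical, and the use of Python's
<code>str.isprintable()</code> to decide what counts as "unprintable" is
an assumption; the text above does not define that term precisely.
</p>

```python
def escape_comment(text: str) -> str:
    """Encode a check-in comment for use as the C-line argument."""
    out = []
    for ch in text:
        if ch == "\\":
            out.append("\\\\")   # backslash -> two backslashes
        elif ch == " ":
            out.append("\\s")    # space -> \s
        elif ch == "\n":
            out.append("\\n")    # newline -> \n
        elif not ch.isprintable():
            # Assumption: tabs, carriage returns, and control characters
            # are rejected rather than escaped.
            raise ValueError(f"character {ch!r} is not allowed in a comment")
        else:
            out.append(ch)
    return "".join(out)


def unescape_comment(arg: str) -> str:
    """Decode a C-line argument back into the original comment text."""
    decoded = {"s": " ", "n": "\n", "\\": "\\"}
    out = []
    i = 0
    while i < len(arg):
        if arg[i] == "\\":
            if i + 1 >= len(arg) or arg[i + 1] not in decoded:
                raise ValueError("invalid escape sequence")
            out.append(decoded[arg[i + 1]])
            i += 2
        else:
            out.append(arg[i])
            i += 1
    return "".join(out)
```

<p>
Because every space is replaced, the encoded comment never contains a
literal space and so remains a single argument on the C-line.
</p>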
| 81 | + |
| 82 | +<p> |
| 83 | +A manifest must have exactly one D-line. The sole argument to |
| 84 | +the D-line is a date-time stamp in the ISO8601 format. The |
| 85 | +date and time should be in coordinated universal time (UTC). |
| 86 | +The format is: |
| 87 | +</p> |
| 88 | + |
| 89 | +<blockquote> |
| 90 | +<i>YYYY</i><b>-</b><i>MM</i><b>-</b><i>DD</i><b>T</b><i>HH</i><b>:</b><i>MM</i><b>:</b><i>SS</i> |
| 91 | +</blockquote> |
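<p>
A short sketch of producing and reading a D-line in this format follows.
The helper names are hypothetical, and the 19-character length check is
an added assumption intended to reject stamps that are not zero-padded.
</p>

```python
from datetime import datetime, timezone

D_LINE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def format_d_line(moment: datetime) -> str:
    """Render a timezone-aware datetime as a D-line in UTC."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return "D " + moment.astimezone(timezone.utc).strftime(D_LINE_FORMAT)


def parse_d_line(line: str) -> datetime:
    """Parse a D-line and return a timezone-aware UTC datetime."""
    if not line.startswith("D "):
        raise ValueError("not a D-line")
    stamp = line[2:]
    # YYYY-MM-DDTHH:MM:SS is always exactly 19 characters long.
    if len(stamp) != 19:
        raise ValueError("malformed date-time stamp")
    return datetime.strptime(stamp, D_LINE_FORMAT).replace(tzinfo=timezone.utc)
```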
| 92 | + |
| 93 | +<p> |
| 94 | +A manifest has zero or more F-lines. Each F-line defines a file |
| 95 | +(other than the manifest itself) which is part of the baseline that |
| 96 | +the manifest defines. There are two arguments. The first argument |
| 97 | +is the pathname of the file in the baseline relative to the root |
| 98 | +of the project file hierarchy. No ".." or "." directories are allowed |
| 99 | +within the filename. Space characters are escaped as in C-line |
| 100 | +comment text. Backslash characters and newlines are not allowed |
| 101 | +within filenames. The directory separator character is a forward |
| 102 | +slash (ASCII 0x2F). The second argument to the F-line is the |
| 103 | +full 40-character hexadecimal SHA1 hash of the file content. |
| 104 | +Upper-case letters ABCDEF are used for the higher digits of the |
| 105 | +hexadecimal. |
| 106 | +</p> |
| 107 | + |
| 108 | +<p> |
| 109 | +A manifest has zero or one P-lines. Most manifests have one P-line. |
| 110 | +The P-line has a varying number of arguments that |
| 111 | +define other manifests from which the current manifest |
| 112 | +is derived. Each argument is a 40-character uppercase |
| 113 | +hexadecimal SHA1 hash of a predecessor manifest. All arguments |
| 114 | +to the P-line must be unique to that line. |
| 115 | +The first predecessor is the manifest's direct ancestor. |
| 116 | +Other arguments define manifests with which the first was |
| 117 | +merged to yield the current manifest. Most manifests have |
| 118 | +a P-line with a single argument. The first manifest in the |
| 119 | +project has no ancestors and thus has no P-line. |
| 120 | +</p> |
| 121 | + |
| 122 | +<p> |
| 123 | +A manifest may optionally have a single R-line. The R-line has |
| 124 | +a single argument which is the MD5 checksum of all files in |
| 125 | +the baseline except the manifest itself. The checksum is expressed |
| 126 | +as 32 characters of uppercase hexadecimal. The checksum is |
| 127 | +computed as follows: For each file in the baseline (except for |
| 128 | +the manifest itself) in strict sorted lexicographical order, |
| 129 | +take the pathname of the file relative to the root of the |
| 130 | +repository, append a single space (ASCII 0x20), the |
| 131 | +size of the file in ASCII decimal, a single newline |
| 132 | +character (ASCII 0x0A), and the complete text of the file. |
| 133 | +Compute the MD5 checksum of the result. |
| 134 | +</p> |
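<p>
The computation described above might be sketched as follows. This is an
illustration under stated assumptions rather than the reference
implementation: the function name <code>repository_checksum</code> is
hypothetical, pathnames are assumed to be UTF-8 and sorted byte-wise, and
pathnames are fed to the hash unescaped.
</p>

```python
import hashlib
from typing import Dict


def repository_checksum(files: Dict[str, bytes]) -> str:
    """Compute an R-line argument from a mapping of pathname -> content."""
    md5 = hashlib.md5()
    # Assumption: "lexicographical" means byte-wise order of the pathnames.
    for path in sorted(files, key=lambda p: p.encode("utf-8")):
        content = files[path]
        md5.update(path.encode("utf-8"))               # pathname
        md5.update(b" ")                               # single space
        md5.update(str(len(content)).encode("ascii"))  # size in decimal
        md5.update(b"\n")                              # single newline
        md5.update(content)                            # complete file text
    # The R-line uses 32 characters of uppercase hexadecimal.
    return md5.hexdigest().upper()
```

<p>
Feeding the pieces to one running MD5 context is equivalent to hashing the
concatenation of all records, and avoids holding every file in memory at once.
</p>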
| 135 | + |
| 136 | +<p> |
| 137 | +Each manifest has a single U-line. The argument to the U-line is |
| 138 | +the login of the user who created the manifest. The login name |
| 139 | +is encoded using the same character escapes as are used for the |
| 140 | +check-in comment argument to the C-line. |
| 141 | +</p> |
| 142 | + |
| 143 | +<p> |
| 144 | +A manifest has an optional Z-line as its last line. The argument |
| 145 | +to the Z-line is a 32-character uppercase hexadecimal MD5 hash |
| 146 | +of all prior lines of the manifest up to and including the newline |
| 147 | +character that immediately precedes the "Z". The Z-line is just |
| 148 | +a sanity check to prove that the manifest is well-formed and |
| 149 | +consistent. |
| 150 | +</p> |
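<p>
A minimal sketch of writing and checking a Z-line is given below. The
function names are hypothetical. The sketch assumes that the manifest is
not PGP clear-signed (or that the signature wrapper has been stripped) and
that the Z-line may or may not end with a final newline, since the text
above does not say which.
</p>

```python
import hashlib


def append_z_line(body: bytes) -> bytes:
    """Append a Z-line to manifest lines that end with a newline."""
    if not body.endswith(b"\n"):
        raise ValueError("manifest body must end with a newline")
    digest = hashlib.md5(body).hexdigest().upper()
    return body + b"Z " + digest.encode("ascii") + b"\n"


def verify_z_line(manifest: bytes) -> bool:
    """Return True if the trailing Z-line matches the preceding lines."""
    idx = manifest.rfind(b"\nZ ")
    if idx < 0:
        return False
    # Hash everything up to and including the newline before the "Z".
    body = manifest[: idx + 1]
    claimed = manifest[idx + 3 :]
    if claimed.endswith(b"\n"):
        claimed = claimed[:-1]
    return claimed == hashlib.md5(body).hexdigest().upper().encode("ascii")
```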
| 151 | + |
| 152 | +<h2>2.0 Trouble Tickets</h2> |
| 153 | + |
| 154 | +<p> |
| 155 | +Each trouble ticket is a file in the repository and appears in |
| 156 | +a manifest for every baseline in which the ticket exists. |
| 157 | +Trouble tickets occur in a specific subdirectory of the file |
| 158 | +hierarchy. The name of the subdirectory that contains tickets |
| 159 | +is part of the local state of each repository. The filename |
| 160 | +of each trouble ticket has a ".tkt" suffix. The trouble ticket |
| 161 | +has a particular file format defined below. |
| 162 | +</p> |
| 163 | + |
| 164 | +<i>To be continued...</i> |
| 165 | + |
| 166 | +<h2>3.0 Wiki Pages</h2> |
| 167 | + |
| 168 | +<p> |
| 169 | +Each wiki page is a file in the repository and appears in |
| 170 | +a manifest for every baseline in which that wiki page exists. |
| 171 | +Wiki pages occur in a specific subdirectory of the file |
| 172 | +hierarchy. The name of the subdirectory that contains wiki pages |
| 173 | +is part of the local state of each repository. The filename |
| 174 | +of each wiki page has a ".wiki" suffix. The base name of |
| 175 | +the file is the name of the wiki page. The wiki pages |
| 176 | +have a particular file format defined below. |
| 177 | +</p> |
| 178 | + |
| 179 | +<i>To be continued...</i> |
| 96 | 180 | |