| | @@ -5,11 +5,28 @@ |
| 5 | 5 | command is run on the client repository. A URL for the server repository |
| 6 | 6 | is specified as part of the command. This document describes what happens |
| 7 | 7 | behind the scenes in order to synchronize the information on the two |
| 8 | 8 | repositories.</p> |
| 9 | 9 | |
| 10 | | -<h2>1.0 Transport</h2> |
| 10 | +<h2>1.0 Overview</h2> |
| 11 | + |
| 12 | +<p>The global state of a fossil repository consists of an unordered |
| 13 | +collection of artifacts. Each artifact is identified by its SHA1 hash. |
| 14 | +Synchronization is simply the process of sharing artifacts between |
| 15 | +servers so that all servers have copies of all artifacts. Because |
| 16 | +artifacts are unordered, the order in which artifacts are received |
| 17 | +at a server is inconsequential. It is assumed that the SHA1 hashes |
| 18 | +of artifacts are unique - that every artifact has a different SHA1 hash. |
| 19 | +To a first approximation, synchronization proceeds by sharing lists of |
| 20 | +SHA1 hashes of available artifacts, then sharing those artifacts that |
| 21 | +are not found on one side or the other of the connection. In practice, |
| 22 | +a repository might contain millions of artifacts. The list of |
| 23 | +SHA1 hashes for this many artifacts can be large. So optimizations are |
| 24 | +employed that usually reduce the number of SHA1 hashes that need to be |
| 25 | +shared to a few hundred.</p> |
| 26 | + |
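The first-approximation scheme in the overview can be sketched as a set difference over artifact hashes. This is a minimal illustration, not Fossil's implementation: the names `artifact_id`, `naive_sync`, and `make_repo` are hypothetical, repositories are modeled as plain dictionaries, and none of the optimizations that shrink the hash lists are shown.

```python
import hashlib


def artifact_id(content: bytes) -> str:
    """Name an artifact by the SHA1 hash of its content."""
    return hashlib.sha1(content).hexdigest()


def make_repo(*contents: bytes) -> dict:
    """Model a repository as an unordered mapping from hash to content."""
    return {artifact_id(c): c for c in contents}


def naive_sync(repo_a: dict, repo_b: dict) -> None:
    """Compare hash lists, then copy whatever each side lacks."""
    # Both differences are computed before either repository is modified.
    missing_from_b = repo_a.keys() - repo_b.keys()
    missing_from_a = repo_b.keys() - repo_a.keys()
    for h in missing_from_b:
        repo_b[h] = repo_a[h]
    for h in missing_from_a:
        repo_a[h] = repo_b[h]
```

Because artifacts are unordered, the order in which the two loops copy artifacts does not affect the result.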
| 27 | +<h2>2.0 Transport</h2> |
| 11 | 28 | |
| 12 | 29 | <p>All communication between client and server is via HTTP requests. |
| 13 | 30 | The server is listening for incoming HTTP requests. The client |
| 14 | 31 | issues one or more HTTP requests and receives replies for each |
| 15 | 32 | request.</p> |
| | @@ -25,11 +42,11 @@ |
| 25 | 42 | <p>A single push, pull, or sync might involve multiple HTTP requests. |
| 26 | 43 | The client maintains state between all requests. But on the server |
| 27 | 44 | side, each request is independent. The server does not preserve |
| 28 | 45 | any information about the client from one request to the next.</p> |
| 29 | 46 | |
| 30 | | -<h3>1.1 Server Identification</h3> |
| 47 | +<h3>2.1 Server Identification</h3> |
| 31 | 48 | |
| 32 | 49 | <p>The server is identified by a URL argument that accompanies the |
| 33 | 50 | push, pull, or sync command on the client. (As a convenience to |
| 34 | 51 | users, the URL can be omitted on the client command and the same URL |
| 35 | 52 | from the most recent push, pull, or sync will be reused. This saves |
| | @@ -49,11 +66,11 @@ |
| 49 | 66 | |
| 50 | 67 | <blockquote> |
| 51 | 68 | http://fossil-scm.hwaci.com/fossil/xfer |
| 52 | 69 | </blockquote> |
| 53 | 70 | |
| 54 | | -<h3>1.2 HTTP Request Format</h3> |
| 71 | +<h3>2.2 HTTP Request Format</h3> |
| 55 | 72 | |
| 56 | 73 | <p>The client always sends a POST request to the server. The |
| 57 | 74 | general format of the POST request is as follows:</p> |
| 58 | 75 | |
| 59 | 76 | <blockquote><pre> |
| | @@ -87,17 +104,17 @@ |
| 87 | 104 | </pre></blockquote> |
| 88 | 105 | |
| 89 | 106 | <p>The content type of the reply is always the same as the content type |
| 90 | 107 | of the request.</p> |
| 91 | 108 | |
| 92 | | -<h2>2.0 Fossil Synchronization Content</h2> |
| 109 | +<h2>3.0 Fossil Synchronization Content</h2> |
| 93 | 110 | |
| 94 | 111 | <p>A synchronization request between a client and server consists of |
| 95 | 112 | one or more HTTP requests as described in the previous section. This |
| 96 | 113 | section details the "x-fossil" content type.</p> |
| 97 | 114 | |
| 98 | | -<h3>2.1 Line-oriented Format</h3> |
| 115 | +<h3>3.1 Line-oriented Format</h3> |
| 99 | 116 | |
| 100 | 117 | <p>The x-fossil content type consists of zero or more "cards". Cards |
| 101 | 118 | are separated by the newline character ("\n"). Leading and trailing |
| 102 | 119 | whitespace on a card is ignored. Blank cards are ignored.</p> |
| 103 | 120 | |
| | @@ -105,11 +122,11 @@ |
| 105 | 122 | The first token on each card is the operator. Subsequent tokens |
| 106 | 123 | are arguments. The set of operators understood by servers is slightly |
| 107 | 124 | different from the operators understood by clients, though the two |
| 108 | 125 | are very similar.</p> |
| 109 | 126 | |
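The card rules in this section can be sketched as a small parser. The function name `parse_cards` is hypothetical, and splitting tokens on whitespace is an assumption made for illustration. The sketch also ignores any payload that may follow a card, such as file content.

```python
def parse_cards(message: str) -> list:
    """Split an x-fossil message into (operator, arguments) pairs."""
    cards = []
    for line in message.split("\n"):
        line = line.strip()          # leading and trailing whitespace is ignored
        if not line:
            continue                 # blank cards are ignored
        # Assumption: tokens are separated by runs of whitespace.
        operator, *args = line.split()
        cards.append((operator, args))
    return cards
```

For example, `parse_cards("  igot aaa  \n\ngimme bbb\n")` yields two cards, one with operator `igot` and one with operator `gimme`, each carrying a single argument.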
| 110 | | -<h3>2.2 Login Cards</h3> |
| 127 | +<h3>3.2 Login Cards</h3> |
| 111 | 128 | |
| 112 | 129 | <p>Every message from client to server begins with one or more login |
| 113 | 130 | cards. Each login card has the following format:</p> |
| 114 | 131 | |
| 115 | 132 | <blockquote> |
| | @@ -131,11 +148,11 @@ |
| 131 | 148 | |
| 132 | 149 | <p>Privileges are cumulative. There can be multiple successful |
| 133 | 150 | login cards. The session privileges are the bit-wise OR of the |
| 134 | 151 | privileges of each individual login.</p> |
| 135 | 152 | |
| 136 | | -<h3>2.3 File Cards</h3> |
| 153 | +<h3>3.3 File Cards</h3> |
| 137 | 154 | |
| 138 | 155 | <p>Repository content records or files are transferred using |
| 139 | 156 | a "file" card. File cards come in two different formats depending |
| 140 | 157 | on whether the file is sent directly or as a delta from some |
| 141 | 158 | other file.</p> |
| | @@ -165,11 +182,11 @@ |
| 165 | 182 | <p>File cards are sent in both directions: client to server and |
| 166 | 183 | server to client. A delta might be sent before the source of |
| 167 | 184 | the delta, so both client and server should remember deltas |
| 168 | 185 | and be able to apply them when their source arrives.</p> |
| 169 | 186 | |
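One way to remember deltas until their source arrives is sketched below. The class `DeltaBuffer` and its method names are hypothetical, and the `apply_delta` callable stands in for the real delta decoder, which is not described here.

```python
class DeltaBuffer:
    """Hold deltas whose source has not arrived yet (illustrative only)."""

    def __init__(self, apply_delta):
        # apply_delta(source_bytes, delta_bytes) -> bytes is supplied by the
        # caller; it is a placeholder for the actual delta format.
        self.apply_delta = apply_delta
        self.files = {}     # file id -> full content
        self.pending = {}   # source id -> list of (target id, delta)

    def receive_file(self, file_id, content):
        """Store a full file, then resolve every delta that was waiting on it."""
        ready = [(file_id, content)]
        while ready:
            fid, data = ready.pop()
            self.files[fid] = data
            # A newly resolved file may itself be the source of other deltas.
            for target_id, delta in self.pending.pop(fid, []):
                ready.append((target_id, self.apply_delta(data, delta)))

    def receive_delta(self, file_id, source_id, delta):
        """Apply the delta now if the source is held, otherwise remember it."""
        if source_id in self.files:
            self.receive_file(file_id, self.apply_delta(self.files[source_id], delta))
        else:
            self.pending.setdefault(source_id, []).append((file_id, delta))
```

With a toy `apply_delta` such as `lambda source, delta: source + delta`, a chain of deltas delivered before their source is fully resolved as soon as the source file is received.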
| 170 | | -<h3>2.4 Push and Pull Cards</h3> |
| 187 | +<h3>3.4 Push and Pull Cards</h3> |
| 171 | 188 | |
| 172 | 189 | <p>Among the first cards in a client-to-server message are |
| 173 | 190 | the push and pull cards. The push card tells the server that |
| 174 | 191 | the client is pushing content. The pull card tells the server |
| 175 | 192 | that the client wants to pull content. In the event of a sync, |
| | @@ -190,11 +207,11 @@ |
| 190 | 207 | |
| 191 | 208 | <p>The server will also send a push card back to the client |
| 192 | 209 | during a clone. This is how the client determines what project |
| 193 | 210 | code to put in the new repository it is constructing.</p> |
| 194 | 211 | |
| 195 | | -<h3>2.5 Clone Cards</h3> |
| 212 | +<h3>3.5 Clone Cards</h3> |
| 196 | 213 | |
| 197 | 214 | <p>A clone card works like a pull card in that it is sent from |
| 198 | 215 | client to server in order to tell the server that the client |
| 199 | 216 | wants to pull content. But unlike the pull card, the clone |
| 200 | 217 | card has no arguments.</p> |
| | @@ -205,11 +222,11 @@ |
| 205 | 222 | |
| 206 | 223 | <p>In response to a clone message, the server also sends the client |
| 207 | 224 | a push message so that the client can discover the projectcode for |
| 208 | 225 | this project.</p> |
| 209 | 226 | |
| 210 | | -<h3>2.6 Igot Cards</h3> |
| 227 | +<h3>3.6 Igot Cards</h3> |
| 211 | 228 | |
| 212 | 229 | <p>An igot card can be sent from either client to server or from |
| 213 | 230 | server to client in order to indicate that the sender holds a copy |
| 214 | 231 | of a particular file. The format is:</p> |
| 215 | 232 | |
| | @@ -221,11 +238,11 @@ |
| 221 | 238 | the sender possesses. |
| 222 | 239 | The receiver of an igot card will typically check to see if |
| 223 | 240 | it also holds the same file and if not it will request the file |
| 224 | 241 | using a gimme card in either the reply or in the next message.</p> |
| 225 | 242 | |
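The typical reaction to igot cards can be sketched as follows. The function name `gimme_cards_for` is hypothetical, and the card text `gimme <id>` assumes that the operator is spelled "gimme" and takes the file's identifier as its only argument.

```python
def gimme_cards_for(held: set, igot_ids: list) -> list:
    """Build one gimme card for each announced file that is not already held."""
    cards = []
    requested = set()
    for file_id in igot_ids:
        # Skip files already held and files already requested in this reply.
        if file_id not in held and file_id not in requested:
            requested.add(file_id)
            cards.append(f"gimme {file_id}")
    return cards
```

The returned cards could go in the reply or in the next message, whichever the receiver is composing.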
| 226 | | -<h3>2.7 Gimme Cards</h3> |
| 243 | +<h3>3.7 Gimme Cards</h3> |
| 227 | 244 | |
| 228 | 245 | <p>A gimme card is sent from either client to server or from server |
| 229 | 246 | to client. The gimme card asks the receiver to send a particular |
| 230 | 247 | file back to the sender. The format of a gimme card is this:</p> |
| 231 | 248 | |
| | @@ -236,11 +253,11 @@ |
| 236 | 253 | <p>The argument to the gimme card is the UUID of the file that |
| 237 | 254 | the sender wants. The receiver will typically respond to a |
| 238 | 255 | gimme card by sending a file card in its reply or in the next |
| 239 | 256 | message.</p> |
| 240 | 257 | |
| 241 | | -<h3>2.8 Cookie Cards</h3> |
| 258 | +<h3>3.8 Cookie Cards</h3> |
| 242 | 259 | |
| 243 | 260 | <p>A cookie card can be used by a server to record a small amount |
| 244 | 261 | of state information on a client. The server sends a cookie to the |
| 245 | 262 | client. The client sends the same cookie back to the server on |
| 246 | 263 | its next request. The cookie card has a single argument which |
| | @@ -256,11 +273,11 @@ |
| 256 | 273 | cookie and the server must structure the cookie payload in such |
| 257 | 274 | a way that it can tell if the cookie it sees is its own cookie or |
| 258 | 275 | a cookie from another server. (Typically the server will embed |
| 259 | 276 | its servercode as part of the cookie.)</p> |
| 260 | 277 | |
| 261 | | -<h3>2.9 Error Cards</h3> |
| 278 | +<h3>3.9 Error Cards</h3> |
| 262 | 279 | |
| 263 | 280 | <p>If the server discovers anything wrong with a request, it generates |
| 264 | 281 | an error card in its reply. When the client sees the error card, |
| 265 | 282 | it displays an error message to the user and aborts the sync |
| 266 | 283 | operation. An error card looks like this:</p> |
| | @@ -276,16 +293,16 @@ |
| 276 | 293 | (ASCII 0x5C) is represented as two backslashes "\\". Apart from |
| 277 | 294 | space and newline, no other whitespace characters nor any |
| 278 | 295 | unprintable characters are allowed in |
| 279 | 296 | the error message.</p> |
| 280 | 297 | |
| 281 | | -<h3>2.10 Unknown Cards</h3> |
| 298 | +<h3>3.10 Unknown Cards</h3> |
| 282 | 299 | |
| 283 | 300 | <p>If either the client or the server sees a card that is not |
| 284 | 301 | described above, then it generates an error and aborts.</p> |
| 285 | 302 | |
| 286 | | -<h2>3.0 Phantoms And Clusters</h2> |
| 303 | +<h2>4.0 Phantoms And Clusters</h2> |
| 287 | 304 | |
| 288 | 305 | <p>When a repository knows that a file exists and knows the UUID of |
| 289 | 306 | that file, but it does not know the file content, then it stores that |
| 290 | 307 | file as a "phantom". A repository will typically create a phantom when |
| 291 | 308 | it receives an igot card for a file that it does not hold or when it |
| | @@ -316,11 +333,11 @@ |
| 316 | 333 | exactly is not a cluster. There must be no extra whitespace in |
| 317 | 334 | the file. There must be one or more M cards. There must be a |
| 318 | 335 | single Z card with a correct MD5 checksum. And all cards must |
| 319 | 336 | be in strict lexicographical order.</p> |
| 320 | 337 | |
| 321 | | -<h3>3.1 The Unclustered Table</h3> |
| 338 | +<h3>4.1 The Unclustered Table</h3> |
| 322 | 339 | |
| 323 | 340 | <p>Every repository maintains a table named "<b>unclustered</b>" |
| 324 | 341 | which records the identity of every file and phantom it holds that is not |
| 325 | 342 | mentioned in a cluster. The entries in the unclustered table can |
| 326 | 343 | be thought of as leaves on a tree of files. Some of the unclustered |
| | @@ -327,13 +344,13 @@ |
| 327 | 344 | files will be clusters. Those clusters may contain other clusters, |
| 328 | 345 | which might contain still more clusters, and so forth. Beginning |
| 329 | 346 | with the files in the unclustered table, one can follow the chain |
| 330 | 347 | of clusters to find every file in the repository.</p> |
| 331 | 348 | |
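Following the chain of clusters from the unclustered table can be sketched as a graph traversal. The function name `reachable_files` and the `cluster_members` mapping (cluster id to the ids it lists) are hypothetical; a real repository would read this information from the cluster files themselves.

```python
def reachable_files(unclustered, cluster_members) -> set:
    """Return every file id found by starting at the unclustered entries
    and expanding clusters until no new ids appear."""
    seen = set()
    stack = list(unclustered)
    while stack:
        file_id = stack.pop()
        if file_id in seen:
            continue
        seen.add(file_id)
        # Only clusters have members; ordinary files contribute nothing more.
        stack.extend(cluster_members.get(file_id, ()))
    return seen
```

In this model an unclustered cluster that lists another cluster leads, in two hops, to every file the inner cluster names.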
| 332 | | -<h2>4.0 Synchronization Strategies</h2> |
| 349 | +<h2>5.0 Synchronization Strategies</h2> |
| 333 | 350 | |
| 334 | | -<h3>4.1 Pull</h3> |
| 351 | +<h3>5.1 Pull</h3> |
| 335 | 352 | |
| 336 | 353 | <p>A typical pull operation proceeds as shown below. Details |
| 337 | 354 | of the actual implementation may vary slightly but the gist of |
| 338 | 355 | a pull is captured in the following steps:</p> |
| 339 | 356 | |
| | @@ -381,11 +398,11 @@ |
| 381 | 398 | protocol will continue to work even if there are multiple servers |
| 382 | 399 | or if servers and clients sometimes change roles. The only negative |
| 383 | 400 | effect of these unusual arrangements is that more than the minimum |
| 384 | 401 | number of clusters might be generated.</p> |
| 385 | 402 | |
| 386 | | -<h3>4.2 Push</h3> |
| 403 | +<h3>5.2 Push</h3> |
| 387 | 404 | |
| 388 | 405 | <p>A typical push operation proceeds roughly as shown below. As |
| 389 | 406 | with a pull, the actual implementation may vary slightly.</p> |
| 390 | 407 | |
| 391 | 408 | <ol> |
| | @@ -415,19 +432,19 @@ |
| 415 | 432 | server knows all files that exist on the client. Also, as with |
| 416 | 433 | pull, the client attempts to keep the size of the request from |
| 417 | 434 | growing too large by suppressing file cards once the |
| 418 | 435 | size of the request reaches 1MB.</p> |
| 419 | 436 | |
| 420 | | -<h3>4.3 Sync</h3> |
| 437 | +<h3>5.3 Sync</h3> |
| 421 | 438 | |
| 422 | 439 | <p>A sync is just a pull and a push that happen at the same time. |
| 423 | 440 | The first three steps of a pull are combined with the first five steps |
| 424 | 441 | of a push. Steps (4) through (7) of a pull are combined with steps |
| 425 | 442 | (5) through (8) of a push. And steps (8) through (10) of a pull |
| 426 | 443 | are combined with step (9) of a push.</p> |
| 427 | 444 | |
| 428 | | -<h2>5.0 Summary</h2> |
| 445 | +<h2>6.0 Summary</h2> |
| 429 | 446 | |
| 430 | 447 | <p>Here are the key points of the synchronization protocol:</p> |
| 431 | 448 | |
| 432 | 449 | <ol> |
| 433 | 450 | <li>The client sends one or more POST HTTP requests to the server. |
| 434 | 451 | |