Fossil SCM
Expanded on the discussion of forking vs branching in www/branching.wiki.
Commit: b882c623868311dff805ecb8ea48bfc23d2b4e100aea29c83920fee207a5e17a
Parent: e3c55977e21cf45…
1 file changed: +115 -52
| --- www/branching.wiki | ||
| +++ www/branching.wiki | ||
| @@ -1,10 +1,10 @@ | ||
| 1 | 1 | <title>Branching, Forking, Merging, and Tagging</title> |
| 2 | 2 | <h2>Background</h2> |
| 3 | 3 | |
| 4 | 4 | In a simple and perfect world, the development of a project would proceed |
| 5 | -linearly, as shown in figure 1. | |
| 5 | +linearly, as shown in Figure 1. | |
| 6 | 6 | |
| 7 | 7 | <table border=1 cellpadding=10 hspace=10 vspace=10 align="center"> |
| 8 | 8 | <tr><td align="center"> |
| 9 | 9 | <img src="branch01.svg"><br> |
| 10 | 10 | Figure 1 |
| @@ -15,21 +15,20 @@ | ||
| 15 | 15 | check-in numbers would be long hexadecimal hashes since it is not possible |
| 16 | 16 | to allocate collision-free sequential numbers in a distributed system. |
| 17 | 17 | But as sequential numbers are easier to read, we will substitute them for |
| 18 | 18 | the long hashes in this document. |
| 19 | 19 | |
| 20 | -The arrows in figure 1 show the evolution of a project. The initial | |
| 20 | +The arrows in Figure 1 show the evolution of a project. The initial | |
| 21 | 21 | check-in is 1. Check-in 2 is derived from 1. In other words, check-in 2 |
| 22 | 22 | was created by making edits to check-in 1 and then committing those edits. |
| 23 | 23 | We say that 2 is a <i>child</i> of 1 |
| 24 | 24 | and that 1 is a <i>parent</i> of 2. |
| 25 | 25 | Check-in 3 is derived from check-in 2, making |
| 26 | 26 | 3 a child of 2. We say that 3 is a <i>descendant</i> of both 1 and 2 and that 1 |
| 27 | 27 | and 2 are both <i>ancestors</i> of 3. |
| 28 | 28 | |
| 29 | -<a name="dag"></a> | |
| 30 | -<h2>DAGs</h2> | |
| 29 | +<h2 id="dag">DAGs</h2> | |
| 31 | 30 | |
| 32 | 31 | The graph of check-ins is a |
| 33 | 32 | [http://en.wikipedia.org/wiki/Directed_acyclic_graph | directed acyclic graph] |
| 34 | 33 | commonly shortened to <i>DAG</i>. Check-in 1 is the <i>root</i> of the DAG |
| 35 | 34 | since it has no ancestors. Check-in 4 is a <i>leaf</i> of the DAG since |
| @@ -36,50 +35,70 @@ | ||
| 36 | 35 | it has no descendants. (We will give a more precise definition later of |
| 37 | 36 | "leaf.") |
| 38 | 37 | |
| 39 | 38 | Alas, reality often interferes with the simple linear development of a |
| 40 | 39 | project. Suppose two programmers make independent modifications to check-in 2. |
| 41 | -After both changes are committed, the check-in graph looks like figure 2: | |
| 40 | +After both changes are committed, the check-in graph looks like Figure 2: | |
| 42 | 41 | |
| 43 | 42 | <table border=1 cellpadding=10 hspace=10 vspace=10 align="center"> |
| 44 | 43 | <tr><td align="center"> |
| 45 | 44 | <img src="branch02.svg"><br> |
| 46 | 45 | Figure 2 |
| 47 | 46 | </td></tr></table> |
| 48 | 47 | |
| 49 | -The graph in figure 2 has two leaves: check-ins 3 and 4. Check-in 2 has | |
| 48 | +The graph in Figure 2 has two leaves: check-ins 3 and 4. Check-in 2 has | |
| 50 | 49 | two children, check-ins 3 and 4. We call this state a <i>fork</i>. |
| 51 | 50 | |
| 52 | 51 | Fossil tries to prevent forks. Suppose two programmers named Alice and |
| 53 | 52 | Bob are each editing check-in 2 separately. Alice finishes her edits |
| 54 | 53 | first and commits her changes, resulting in check-in 3. Later, when Bob |
| 55 | -attempts to commit his changes, fossil verifies that check-in 2 is still | |
| 54 | +attempts to commit his changes, Fossil verifies that check-in 2 is still | |
| 56 | 55 | a leaf. Fossil sees that check-in 3 has occurred and aborts Bob's commit |
| 57 | 56 | attempt with a message "would fork." This allows Bob to do a "fossil |
| 58 | 57 | update" which pulls in Alice's changes, merging them into his own |
| 59 | 58 | changes. After merging, Bob commits check-in 4 as a child of check-in 3. |
| 60 | -The result is a linear graph as shown in figure 1. This is how CVS | |
| 61 | -works. This is also how fossil works in [./concepts.wiki#workflow | | |
| 59 | +The result is a linear graph as shown in Figure 1. This is how CVS | |
| 60 | +works. This is also how Fossil works in [./concepts.wiki#workflow | | |
| 62 | 61 | "autosync"] mode. |
| 63 | 62 | |
| 64 | 63 | But perhaps Bob is off-network when he does his commit, so he |
| 65 | 64 | has no way of knowing that Alice has already committed her changes. |
| 66 | 65 | Or, it could be that Bob has turned off "autosync" mode in Fossil. Or, |
| 67 | 66 | maybe Bob just doesn't want to merge in Alice's changes before he has |
| 68 | 67 | saved his own, so he forces the commit to occur using the "--allow-fork" |
| 69 | -option to the fossil <b>commit</b> command. For any of these reasons, | |
| 68 | +option to the <b>fossil commit</b> command. For any of these reasons, | |
| 70 | 69 | two commits against check-in 2 have occurred and now the DAG has two leaves. |
| 71 | 70 | |
| 72 | 71 | So which version of the project is the "latest" in the sense of having |
| 73 | 72 | the most features and the most bug fixes? When there is more than |
| 74 | -one leaf in the graph, you don't really know. So we like to have | |
| 75 | -graphs with a single leaf. | |
| 73 | +one leaf in the graph, you don't really know, so we like to have | |
| 74 | +check-in graphs with a single leaf. | |
| 75 | + | |
| 76 | +Fossil resolves such problems using the check-in time on the leaves to | |
| 77 | +decide which leaf to use as the parent of new leaves. When a branch is | |
| 78 | +forked as in Figure 2, Fossil will choose check-in 4 as the parent for a | |
| 79 | +later check-in 5, but <i>only</i> if it has sync'd that check-in down | |
| 80 | +into the local repository. If autosync is disabled or the user is | |
| 81 | +off-network when that fifth check-in occurs, so that check-in 3 is the | |
| 82 | +latest on that branch at the time within that clone of the repository, | |
| 83 | +Fossil will make check-in 3 the parent of check-in 5! | |
| 84 | + | |
| 85 | +Fossil also uses a forked branch's leaf check-in timestamps when | |
| 86 | +checking out that branch: it gives you the fork with the latest | |
| 87 | +check-in, which in turn selects which parent your next check-in will be | |
| 88 | +a child of. This situation means development on that branch can fork | |
| 89 | +into two independent lines of development, based solely on which branch | |
| 90 | +tip is newer at the time the next user starts his work on it. Because | |
| 91 | +of this, we strongly recommend that you do not intentionally create | |
| 92 | +forks on branches with "--allow-fork" if that branch is used by many | |
| 93 | +people over a long period of time. (Prime example: trunk.) | |
| 76 | 94 | |
| 77 | -To resolve this situation, Alice can use the fossil <b>merge</b> command | |
| 78 | -to merge in Bob's changes in her local copy of check-in 3. Then she | |
| 79 | -can commit the results as check-in 5. This results in a DAG as shown | |
| 80 | -in figure 3. | |
| 95 | +Let us return to Figure 2. To resolve such situations before they can | |
| 96 | +become a real problem, Alice can use the <b>fossil merge</b> command to | |
| 97 | +merge Bob's changes into her local copy of check-in 3. Then she can | |
| 98 | +commit the results as check-in 5. This results in a DAG as shown in | |
| 99 | +Figure 3. | |
| 81 | 100 | |
| 82 | 101 | <table border=1 cellpadding=10 hspace=10 vspace=10 align="center"> |
| 83 | 102 | <tr><td align="center"> |
| 84 | 103 | <img src="branch03.svg"><br> |
| 85 | 104 | Figure 3 |
| @@ -87,38 +106,41 @@ | ||
| 87 | 106 | |
| 88 | 107 | Check-in 5 is a child of check-in 3 because it was created by editing |
| 89 | 108 | check-in 3. But check-in 5 also inherits the changes from check-in 4 by |
| 90 | 109 | virtue of the merge. So we say that check-in 5 is a <i>merge child</i> |
| 91 | 110 | of check-in 4 and that it is a <i>direct child</i> of check-in 3. |
| 92 | -The graph is now back to a single leaf (check-in 5). | |
| 111 | +The graph is now back to a single leaf, check-in 5. | |
| 93 | 112 | |
| 94 | -We have already seen that if fossil is in autosync mode then Bob would | |
| 113 | +We have already seen that if Fossil is in autosync mode then Bob would | |
| 95 | 114 | have been warned about the potential fork the first time he tried to |
| 96 | 115 | commit check-in 4. If Bob had updated his local check-out to merge in |
| 97 | 116 | Alice's check-in 3 changes, then committed, then the fork would have |
| 98 | 117 | never occurred. The resulting graph would have been linear, as shown |
| 99 | -in figure 1. Really the graph of figure 1 is a subset of figure 3. | |
| 100 | -Hold your hand over the check-in 4 circle of figure 3 and then figure | |
| 101 | -3 looks exactly like figure 1 (except that the leaf has a different check-in | |
| 102 | -number, but that is just a notational difference - the two check-ins have | |
| 103 | -exactly the same content). In other words, figure 3 is really a superset | |
| 104 | -of figure 1. The check-in 4 of figure 3 captures additional state which | |
| 105 | -is omitted from figure 1. Check-in 4 of figure 3 holds a copy | |
| 106 | -of Bob's local checkout before he merged in Alice's changes. That snapshot | |
| 107 | -of Bob's changes, which is independent of Alice's changes, is omitted from figure 1. | |
| 108 | -Some people say that the approach taken in figure 3 is better because it | |
| 109 | -preserves this extra intermediate state. Others say that the approach | |
| 110 | -taken in figure 1 is better because it is much easier to visualize a | |
| 111 | -linear line of development and because the merging happens automatically | |
| 112 | -instead of as a separate manual step. We will not take sides in that | |
| 113 | -debate. We will simply point out that fossil enables you to do it either way. | |
| 114 | - | |
| 115 | -<h2>Forking Versus Branching</h2> | |
| 118 | +in Figure 1. | |
| 119 | + | |
| 120 | +Realize that the graph of Figure 1 is a subset of Figure 3. Hold your | |
| 121 | +hand over the check-in 4 circle of Figure 3 and then Figure 3 looks | |
| 122 | +exactly like Figure 1, except that the leaf has a different check-in | |
| 123 | +number, but that is just a notational difference — the two check-ins | |
| 124 | +have exactly the same content. In other words, Figure 3 is really a | |
| 125 | +superset of Figure 1. The check-in 4 of Figure 3 captures additional | |
| 126 | +state which is omitted from Figure 1. Check-in 4 of Figure 3 holds a | |
| 127 | +copy of Bob's local checkout before he merged in Alice's changes. That | |
| 128 | +snapshot of Bob's changes, which is independent of Alice's changes, is | |
| 129 | +omitted from Figure 1. Some people say that the approach taken in | |
| 130 | +Figure 3 is better because it preserves this extra intermediate state. | |
| 131 | +Others say that the approach taken in Figure 1 is better because it is | |
| 132 | +much easier to visualize a linear line of development and because the | |
| 133 | +merging happens automatically instead of as a separate manual step. We | |
| 134 | +will not take sides in that debate. We will simply point out that | |
| 135 | +Fossil enables you to do it either way. | |
| 136 | + | |
| 137 | +<h2 id="branching">The Alternative to Forking: Branching</h2> | |
| 116 | 138 | |
| 117 | 139 | Having more than one leaf in the check-in DAG is called a "fork." This |
| 118 | 140 | is usually undesirable and either avoided entirely, |
| 119 | -as in figure 1, or else quickly resolved as shown in figure 3. | |
| 141 | +as in Figure 1, or else quickly resolved as shown in Figure 3. | |
| 120 | 142 | But sometimes, one does want to have multiple leaves. For example, a project |
| 121 | 143 | might have one leaf that is the latest version of the project under |
| 122 | 144 | development and another leaf that is the latest version that has been |
| 123 | 145 | tested. |
| 124 | 146 | When multiple leaves are desirable, we call this <i>branching</i> |
| @@ -130,11 +152,11 @@ | ||
| 130 | 152 | <tr><td align="center"> |
| 131 | 153 | <img src="branch04.svg"><br> |
| 132 | 154 | Figure 4 |
| 133 | 155 | </td></tr></table> |
| 134 | 156 | |
| 135 | -The hypothetical scenario of figure 4 is this: The project starts and | |
| 157 | +The hypothetical scenario of Figure 4 is this: The project starts and | |
| 136 | 158 | progresses to a point where (at check-in 2) |
| 137 | 159 | it is ready to enter testing for its first release. |
| 138 | 160 | In a real project, of course, there might be hundreds or thousands of |
| 139 | 161 | check-ins before a project reaches this point, but for simplicity of |
| 140 | 162 | presentation we will say that the project is ready after check-in 2. |
| @@ -147,35 +169,76 @@ | ||
| 147 | 169 | the bug fixes implemented by the testing team. So periodically, the |
| 148 | 170 | changes in the test branch are merged into the dev branch. This is |
| 149 | 171 | shown by the dashed merge arrows between check-ins 6 and 7 and between |
| 150 | 172 | check-ins 9 and 10. |
| 151 | 173 | |
| 152 | -In both figures 2 and 4, check-in 2 has two children. In figure 2, | |
| 174 | +In both Figures 2 and 4, check-in 2 has two children. In Figure 2, | |
| 153 | 175 | we call this a "fork." In diagram 4, we call it a "branch." What is |
| 154 | -the difference? As far as the internal fossil data structures are | |
| 176 | +the difference? As far as the internal Fossil data structures are | |
| 155 | 177 | concerned, there is no difference. The distinction is in the intent. |
| 156 | -In figure 2, the fact that check-in 2 has multiple children is an | |
| 157 | -accident that stems from concurrent development. In figure 4, giving | |
| 178 | +In Figure 2, the fact that check-in 2 has multiple children is an | |
| 179 | +accident that stems from concurrent development. In Figure 4, giving | |
| 158 | 180 | check-in 2 multiple children is a deliberate act. So, to a good |
| 159 | 181 | approximation, we define forking to be by accident and branching to |
| 160 | 182 | be by intent. Apart from that, they are the same. |
| 161 | 183 | |
| 162 | -<a name="tags"></a> | |
| 163 | -<h2>Tags And Properties</h2> | |
| 184 | +<h2 id="forking">Justifications For Forking</h2> | |
| 185 | + | |
| 186 | +The primary cases where forking is justified are all when it is done purely | |
| 187 | +in software in order to avoid losing information: | |
| 188 | + | |
| 189 | +<ol> | |
| 190 | + <li><p>By Fossil itself when two users check in children to the same | |
| 191 | + leaf of a branch, as in Figure 2. If they're doing it because | |
| 192 | + autosync is disabled on one or both of the repositories, Fossil has | |
| 193 | + no way of knowing that it is creating a fork until the two | |
| 194 | + repositories are later sync'd manually.</p></li> | |
| 195 | + | |
| 196 | + <li><p>By Fossil when the cloning hierarchy is more than 2 levels | |
| 197 | + deep. If your master repository is cloned by user A and then user B | |
| 198 | + clones from user A's repository, check-ins to user B's repo do not | |
| 199 | + check the master repo before allowing the check-in even with | |
| 200 | + autosync enabled. It isn't until user A syncs her repo with the | |
| 201 | + master repo that an inadvertent fork can be detected. | |
| 202 | + <br><br> | |
| 203 | + Because of this, we recommend that if you're using Fossil in a | |
| 204 | + distributed way like this, that check-ins be made only to the master | |
| 205 | + or its immediate child repos, and that those further down the chain | |
| 206 | + be read-only clones.</p></li> | |
| 207 | + | |
| 208 | + <li><p>You've automated Fossil (e.g. with a shell script) and | |
| 209 | + forking is a possibility, so you add "--allow-fork" to your | |
| 210 | + "checkin" commands to prevent Fossil from refusing the check-in due | |
| 211 | + to the fork. It's better to write such a script to detect this | |
| 212 | + condition and cope with it (e.g. "fossil update") but if the | |
| 213 | + alternative is losing information, you may feel justified in | |
| 214 | + creating forks that an interactive user must later clean up with | |
| 215 | + "fossil merge" commands.</p></li> | |
| 216 | +</ol> | |
| 217 | + | |
| 218 | +That leaves only one case where we can recommend use of "--allow-fork" | |
| 219 | +by interactive users, while autosync is enabled: when you're working on | |
| 220 | +a personal branch so that creating a dual-tipped branch isn't going to | |
| 221 | +cause any other user an inconvenience or risk forking the development. | |
| 222 | +This is a good alternative to branching when you just need to | |
| 223 | +temporarily fork the branch's development. | |
| 224 | + | |
| 225 | + | |
| 226 | +<h2 id="tags">Tags And Properties</h2> | |
| 164 | 227 | |
| 165 | -Tags and properties are used in fossil to help express the intent, and | |
| 228 | +Tags and properties are used in Fossil to help express the intent, and | |
| 166 | 229 | thus to distinguish between forks and branches. Figure 5 shows the |
| 167 | -same scenario as figure 4 but with tags and properties added: | |
| 230 | +same scenario as Figure 4 but with tags and properties added: | |
| 168 | 231 | |
| 169 | 232 | <table border=1 cellpadding=10 hspace=10 vspace=10 align="center"> |
| 170 | 233 | <tr><td align="center"> |
| 171 | 234 | <img src="branch05.svg"><br> |
| 172 | 235 | Figure 5 |
| 173 | 236 | </td></tr></table> |
| 174 | 237 | |
| 175 | 238 | A <i>tag</i> is a name that is attached to a check-in. A |
| 176 | -<i>property</i> is a name/value pair. Internally, fossil implements | |
| 239 | +<i>property</i> is a name/value pair. Internally, Fossil implements | |
| 177 | 240 | tags as properties with a NULL value. So, tags and properties really |
| 178 | 241 | are much the same thing, and henceforth we will use the word "tag" |
| 179 | 242 | to mean either a tag or a property. |
| 180 | 243 | |
| 181 | 244 | A tag can be a one-time tag, a propagating tag or a cancellation tag. |
| @@ -188,11 +251,11 @@ | ||
| 188 | 251 | is attached to a single check-in in order to either override a one-time |
| 189 | 252 | tag that was previously placed on that same check-in, or to block |
| 190 | 253 | tag propagation from an ancestor. |
| 191 | 254 | |
| 192 | 255 | The initial check-in of every repository has two propagating tags. In |
| 193 | -figure 5, that initial check-in is check-in 1. The <b>branch</b> tag | |
| 256 | +Figure 5, that initial check-in is check-in 1. The <b>branch</b> tag | |
| 194 | 257 | tells (by its value) what branch the check-in is a member of. |
| 195 | 258 | The default branch is called "trunk." All tags that begin with "<b>sym-</b>" |
| 196 | 259 | are symbolic name tags. When a symbolic name tag is attached to a |
| 197 | 260 | check-in, that allows you to refer to that check-in by its symbolic |
| 198 | 261 | name rather than by its hexadecimal hash name. When a symbolic name |
| @@ -250,22 +313,22 @@ | ||
| 250 | 313 | <dd><p>A branch point occurs when a check-in has two or more direct (non-merge) |
| 251 | 314 | children in different branches. A branch point is similar to a fork, |
| 252 | 315 | except that the children are in different branches.</p></dd> |
| 253 | 316 | </dl></blockquote> |
| 254 | 317 | |
| 255 | -Check-in 4 of figure 3 is not a leaf because it has a child (check-in 5) | |
| 256 | -in the same branch. Check-in 9 of figure 5 also has a child (check-in 10) | |
| 318 | +Check-in 4 of Figure 3 is not a leaf because it has a child (check-in 5) | |
| 319 | +in the same branch. Check-in 9 of Figure 5 also has a child (check-in 10) | |
| 257 | 320 | but that child is in a different branch, so check-in 9 is a leaf. Because |
| 258 | 321 | of the <b>closed</b> tag on check-in 9, it is a closed leaf. |
| 259 | 322 | |
| 260 | -Check-in 2 of figure 3 is considered a "fork" | |
| 261 | -because it has two children in the same branch. Check-in 2 of figure 5 | |
| 323 | +Check-in 2 of Figure 3 is considered a "fork" | |
| 324 | +because it has two children in the same branch. Check-in 2 of Figure 5 | |
| 262 | 325 | also has two children, but each child is in a different branch, hence in |
| 263 | -figure 5, check-in 2 is considered a "branch point." | |
| 326 | +Figure 5, check-in 2 is considered a "branch point." | |
| 264 | 327 | |
| 265 | 328 | <h2>Differences With Other DVCSes</h2> |
| 266 | 329 | |
| 267 | 330 | Fossil keeps all check-ins on a single DAG. Branches are identified with |
| 268 | 331 | tags. This means that check-ins can be freely moved between branches |
| 269 | 332 | simply by altering their tags. |
| 270 | 333 | |
| 271 | 334 | Most other DVCSes maintain a separate DAG for each branch. |
| 272 | 335 |
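
The leaf, fork, and branch-point terminology used in branching.wiki can be sketched in a few lines of Python. This is only an illustration of the definitions as worded in the document, not Fossil's implementation: the function name `classify`, the `branch_of` mapping, and the edge lists are all hypothetical, and the "branched" example at the end is an invented variant of Figure 2 in which check-in 4 carries a different branch tag. A leaf is taken to be a check-in with no child of any kind in its own branch, a fork a check-in with two or more direct children in its own branch, and a branch point a check-in whose direct children span two or more branches.

```python
from collections import defaultdict


def classify(branch_of, direct, merge=()):
    """Classify check-ins as leaves, forks, and branch points.

    branch_of: dict mapping check-in id -> branch name (hypothetical input)
    direct:    iterable of (parent, child) direct (non-merge) links
    merge:     iterable of (parent, child) merge links
    """
    children = defaultdict(list)         # every child, direct or merge
    direct_children = defaultdict(list)  # direct (non-merge) children only
    for parent, child in direct:
        children[parent].append(child)
        direct_children[parent].append(child)
    for parent, child in merge:
        children[parent].append(child)

    leaves, forks, branch_points = [], [], []
    for cid in sorted(branch_of):
        here = branch_of[cid]
        # Leaf: no child of any kind in the same branch.
        if not any(branch_of[c] == here for c in children[cid]):
            leaves.append(cid)
        # Fork: two or more direct children in the same branch.
        same = [c for c in direct_children[cid] if branch_of[c] == here]
        if len(same) >= 2:
            forks.append(cid)
        # Branch point: direct children in two or more different branches.
        if len({branch_of[c] for c in direct_children[cid]}) >= 2:
            branch_points.append(cid)
    return leaves, forks, branch_points


# Figure 2: check-ins 3 and 4 are both children of 2, all on trunk.
trunk4 = {1: "trunk", 2: "trunk", 3: "trunk", 4: "trunk"}
fig2 = classify(trunk4, direct=[(1, 2), (2, 3), (2, 4)])

# Figure 3: check-in 5 is a direct child of 3 and a merge child of 4.
trunk5 = dict(trunk4)
trunk5[5] = "trunk"
fig3 = classify(trunk5, direct=[(1, 2), (2, 3), (2, 4), (3, 5)], merge=[(4, 5)])

# Invented variant of Figure 2 where check-in 4 is on a branch named "test".
branched = classify({1: "trunk", 2: "trunk", 3: "trunk", 4: "test"},
                    direct=[(1, 2), (2, 3), (2, 4)])

print(fig2, fig3, branched)
```

Under these assumptions, the same parent and child links yield a fork in the Figure 2 case and a branch point in the variant, which matches the document's point that the two differ only in how the check-ins are tagged.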
| 30 | |
| 31 | The graph of check-ins is a |
| 32 | [http://en.wikipedia.org/wiki/Directed_acyclic_graph | directed acyclic graph] |
| 33 | commonly shortened to <i>DAG</i>. Check-in 1 is the <i>root</i> of the DAG |
| 34 | since it has no ancestors. Check-in 4 is a <i>leaf</i> of the DAG since |
| @@ -36,50 +35,70 @@ | |
| 35 | it has no descendants. (We will give a more precise definition later of |
| 36 | "leaf.") |
| 37 | |
| 38 | Alas, reality often interferes with the simple linear development of a |
| 39 | project. Suppose two programmers make independent modifications to check-in 2. |
| 40 | After both changes are committed, the check-in graph looks like Figure 2: |
| 41 | |
| 42 | <table border=1 cellpadding=10 hspace=10 vspace=10 align="center"> |
| 43 | <tr><td align="center"> |
| 44 | <img src="branch02.svg"><br> |
| 45 | Figure 2 |
| 46 | </td></tr></table> |
| 47 | |
| 48 | The graph in Figure 2 has two leaves: check-ins 3 and 4. Check-in 2 has |
| 49 | two children, check-ins 3 and 4. We call this state a <i>fork</i>. |
| 50 | |
| 51 | Fossil tries to prevent forks. Suppose two programmers named Alice and |
| 52 | Bob are each editing check-in 2 separately. Alice finishes her edits |
| 53 | first and commits her changes, resulting in check-in 3. Later, when Bob |
| 54 | attempts to commit his changes, Fossil verifies that check-in 2 is still |
| 55 | a leaf. Fossil sees that check-in 3 has occurred and aborts Bob's commit |
| 56 | attempt with a message "would fork." This allows Bob to do a "fossil |
| 57 | update" which pulls in Alice's changes, merging them into his own |
| 58 | changes. After merging, Bob commits check-in 4 as a child of check-in 3. |
| 59 | The result is a linear graph as shown in Figure 1. This is how CVS |
| 60 | works. This is also how Fossil works in [./concepts.wiki#workflow | |
| 61 | "autosync"] mode. |
| 62 | |
| 63 | But perhaps Bob is off-network when he does his commit, so he |
| 64 | has no way of knowing that Alice has already committed her changes. |
| 65 | Or, it could be that Bob has turned off "autosync" mode in Fossil. Or, |
| 66 | maybe Bob just doesn't want to merge in Alice's changes before he has |
| 67 | saved his own, so he forces the commit to occur using the "--allow-fork" |
| 68 | option to the <b>fossil commit</b> command. For any of these reasons, |
| 69 | two commits against check-in 2 have occurred and now the DAG has two leaves. |
| 70 | |
| 71 | So which version of the project is the "latest" in the sense of having |
| 72 | the most features and the most bug fixes? When there is more than |
| 73 | one leaf in the graph, you don't really know, so we like to have |
| 74 | check-in graphs with a single leaf. |
| 75 | |
| 76 | Fossil resolves such problems using the check-in time on the leaves to |
| 77 | decide which leaf to use as the parent of new leaves. When a branch is |
| 78 | forked as in Figure 2, Fossil will choose check-in 4 as the parent for a |
| 79 | later check-in 5, but <i>only</i> if it has sync'd that check-in down |
| 80 | into the local repository. If autosync is disabled or the user is |
| 81 | off-network when that fifth check-in occurs, so that check-in 3 is the |
| 82 | latest on that branch at the time within that clone of the repository, |
| 83 | Fossil will make check-in 3 the parent of check-in 5! |
| 84 | |
| 85 | Fossil also uses a forked branch's leaf check-in timestamps when |
| 86 | checking out that branch: it gives you the fork with the latest |
| 87 | check-in, which in turn selects which parent your next check-in will be |
| 88 | a child of. This situation means development on that branch can fork |
| 89 | into two independent lines of development, based solely on which branch |
| 90 | tip is newer at the time the next user starts his work on it. Because |
| 91 | of this, we strongly recommend that you do not intentionally create |
| 92 | forks on branches with "--allow-fork" if that branch is used by many |
| 93 | people over a long period of time. (Prime example: trunk.) |
| 94 | |
| 95 | Let us return to Figure 2. To resolve such situations before they can |
| 96 | become a real problem, Alice can use the <b>fossil merge</b> command to |
| 97 | merge Bob's changes into her local copy of check-in 3. Then she can |
| 98 | commit the results as check-in 5. This results in a DAG as shown in |
| 99 | Figure 3. |
| 100 | |
| 101 | <table border=1 cellpadding=10 hspace=10 vspace=10 align="center"> |
| 102 | <tr><td align="center"> |
| 103 | <img src="branch03.svg"><br> |
| 104 | Figure 3 |
| @@ -87,38 +106,41 @@ | |
| 106 | |
| 107 | Check-in 5 is a child of check-in 3 because it was created by editing |
| 108 | check-in 3. But check-in 5 also inherits the changes from check-in 4 by |
| 109 | virtue of the merge. So we say that check-in 5 is a <i>merge child</i> |
| 110 | of check-in 4 and that it is a <i>direct child</i> of check-in 3. |
| 111 | The graph is now back to a single leaf, check-in 5. |
| 112 | |
| 113 | We have already seen that if Fossil is in autosync mode then Bob would |
| 114 | have been warned about the potential fork the first time he tried to |
| 115 | commit check-in 4. If Bob had updated his local check-out to merge in |
| 116 | Alice's check-in 3 changes, then committed, then the fork would have |
| 117 | never occurred. The resulting graph would have been linear, as shown |
| 118 | in Figure 1. |
| 119 | |
| 120 | Realize that the graph of Figure 1 is a subset of Figure 3. Hold your |
| 121 | hand over the check-in 4 circle of Figure 3 and then Figure 3 looks |
| 122 | exactly like Figure 1, except that the leaf has a different check-in |
| 123 | number, but that is just a notational difference — the two check-ins |
| 124 | have exactly the same content. In other words, Figure 3 is really a |
| 125 | superset of Figure 1. The check-in 4 of Figure 3 captures additional |
| 126 | state which is omitted from Figure 1. Check-in 4 of Figure 3 holds a |
| 127 | copy of Bob's local checkout before he merged in Alice's changes. That |
| 128 | snapshot of Bob's changes, which is independent of Alice's changes, is |
| 129 | omitted from Figure 1. Some people say that the approach taken in |
| 130 | Figure 3 is better because it preserves this extra intermediate state. |
| 131 | Others say that the approach taken in Figure 1 is better because it is |
| 132 | much easier to visualize a linear line of development and because the |
| 133 | merging happens automatically instead of as a separate manual step. We |
| 134 | will not take sides in that debate. We will simply point out that |
| 135 | Fossil enables you to do it either way. |
| 136 | |
| 137 | <h2 id="branching">The Alternative to Forking: Branching</h2> |
| 138 | |
| 139 | Having more than one leaf in the check-in DAG is called a "fork." This |
| 140 | is usually undesirable and either avoided entirely, |
| 141 | as in Figure 1, or else quickly resolved as shown in Figure 3. |
| 142 | But sometimes, one does want to have multiple leaves. For example, a project |
| 143 | might have one leaf that is the latest version of the project under |
| 144 | development and another leaf that is the latest version that has been |
| 145 | tested. |
| 146 | When multiple leaves are desirable, we call this <i>branching</i> |
| @@ -130,11 +152,11 @@ | |
| 152 | <tr><td align="center"> |
| 153 | <img src="branch04.svg"><br> |
| 154 | Figure 4 |
| 155 | </td></tr></table> |
| 156 | |
| 157 | The hypothetical scenario of Figure 4 is this: The project starts and |
| 158 | progresses to a point where (at check-in 2) |
| 159 | it is ready to enter testing for its first release. |
| 160 | In a real project, of course, there might be hundreds or thousands of |
| 161 | check-ins before a project reaches this point, but for simplicity of |
| 162 | presentation we will say that the project is ready after check-in 2. |
| @@ -147,35 +169,76 @@ | |
| 169 | the bug fixes implemented by the testing team. So periodically, the |
| 170 | changes in the test branch are merged into the dev branch. This is |
| 171 | shown by the dashed merge arrows between check-ins 6 and 7 and between |
| 172 | check-ins 9 and 10. |
| 173 | |
| 174 | In both Figures 2 and 4, check-in 2 has two children. In Figure 2, |
| 175 | we call this a "fork." In Figure 4, we call it a "branch." What is |
| 176 | the difference? As far as the internal Fossil data structures are |
| 177 | concerned, there is no difference. The distinction is in the intent. |
| 178 | In Figure 2, the fact that check-in 2 has multiple children is an |
| 179 | accident that stems from concurrent development. In Figure 4, giving |
| 180 | check-in 2 multiple children is a deliberate act. So, to a good |
| 181 | approximation, we define forking to be by accident and branching to |
| 182 | be by intent. Apart from that, they are the same. |
| 183 | |
| 184 | <h2 id="forking">Justifications For Forking</h2> |
| 185 | |
| 186 | Forking is justified mainly in cases where it is done purely |
| 187 | in software in order to avoid losing information: |
| 188 | |
| 189 | <ol> |
| 190 | <li><p>By Fossil itself when two users check in children to the same |
| 191 | leaf of a branch, as in Figure 2. If they're doing it because |
| 192 | autosync is disabled on one or both of the repositories, Fossil has |
| 193 | no way of knowing that it is creating a fork until the two |
| 194 | repositories are later sync'd manually.</p></li> |
| 195 | |
| 196 | <li><p>By Fossil when the cloning hierarchy is more than 2 levels |
| 197 | deep. If your master repository is cloned by user A and then user B |
| 198 | clones from user A's repository, check-ins to user B's repo do not |
| 199 | check the master repo before allowing the check-in even with |
| 200 | autosync enabled. It isn't until user A syncs her repo with the |
| 201 | master repo that an inadvertent fork can be detected. |
| 202 | <br><br> |
| 203 | Because of this, if you're using Fossil in a distributed way |
| 204 | like this, we recommend that check-ins be made only to the master |
| 205 | or its immediate child repos, and that those further down the chain |
| 206 | be read-only clones.</p></li> |
| 207 | |
| 208 | <li><p>You've automated Fossil (e.g. with a shell script) and |
| 209 | forking is a possibility, so you add "--allow-fork" to your |
| 210 | "checkin" commands to prevent Fossil from refusing the check-in due |
| 211 | to the fork. It's better to write such a script to detect this |
| 212 | condition and cope with it (e.g. "fossil update") but if the |
| 213 | alternative is losing information, you may feel justified in |
| 214 | creating forks that an interactive user must later clean up with |
| 215 | "fossil merge" commands.</p></li> |
| 216 | </ol> |
| 217 | |
| 218 | That leaves only one case where we can recommend use of "--allow-fork" |
| 219 | by interactive users, while autosync is enabled: when you're working on |
| 220 | a personal branch so that creating a dual-tipped branch isn't going to |
| 221 | cause any other user an inconvenience or risk forking the development. |
| 222 | This is a good alternative to branching when you just need to |
| 223 | temporarily fork the branch's development. |
| 224 | |
| 225 | |
| 226 | <h2 id="tags">Tags And Properties</h2> |
| 227 | |
| 228 | Tags and properties are used in Fossil to help express the intent, and |
| 229 | thus to distinguish between forks and branches. Figure 5 shows the |
| 230 | same scenario as Figure 4 but with tags and properties added: |
| 231 | |
| 232 | <table border=1 cellpadding=10 hspace=10 vspace=10 align="center"> |
| 233 | <tr><td align="center"> |
| 234 | <img src="branch05.svg"><br> |
| 235 | Figure 5 |
| 236 | </td></tr></table> |
| 237 | |
| 238 | A <i>tag</i> is a name that is attached to a check-in. A |
| 239 | <i>property</i> is a name/value pair. Internally, Fossil implements |
| 240 | tags as properties with a NULL value. So, tags and properties really |
| 241 | are much the same thing, and henceforth we will use the word "tag" |
| 242 | to mean either a tag or a property. |
| 243 | |
| 244 | A tag can be a one-time tag, a propagating tag or a cancellation tag. |
| @@ -188,11 +251,11 @@ | |
| 251 | is attached to a single check-in in order to either override a one-time |
| 252 | tag that was previously placed on that same check-in, or to block |
| 253 | tag propagation from an ancestor. |
| 254 | |
| 255 | The initial check-in of every repository has two propagating tags. In |
| 256 | Figure 5, that initial check-in is check-in 1. The <b>branch</b> tag |
| 257 | tells (by its value) what branch the check-in is a member of. |
| 258 | The default branch is called "trunk." All tags that begin with "<b>sym-</b>" |
| 259 | are symbolic name tags. When a symbolic name tag is attached to a |
| 260 | check-in, that allows you to refer to that check-in by its symbolic |
| 261 | name rather than by its hexadecimal hash name. When a symbolic name |
| @@ -250,22 +313,22 @@ | |
| 313 | <dd><p>A branch point occurs when a check-in has two or more direct (non-merge) |
| 314 | children in different branches. A branch point is similar to a fork, |
| 315 | except that the children are in different branches.</p></dd> |
| 316 | </dl></blockquote> |
| 317 | |
| 318 | Check-in 4 of Figure 3 is not a leaf because it has a child (check-in 5) |
| 319 | in the same branch. Check-in 9 of Figure 5 also has a child (check-in 10) |
| 320 | but that child is in a different branch, so check-in 9 is a leaf. Because |
| 321 | of the <b>closed</b> tag on check-in 9, it is a closed leaf. |
| 322 | |
| 323 | Check-in 2 of Figure 3 is considered a "fork" |
| 324 | because it has two children in the same branch. Check-in 2 of Figure 5 |
| 325 | also has two children, but each child is in a different branch, hence in |
| 326 | Figure 5, check-in 2 is considered a "branch point." |
| 327 | |
| 328 | <h2>Differences With Other DVCSes</h2> |
| 329 | |
| 330 | Fossil keeps all check-ins on a single DAG. Branches are identified with |
| 331 | tags. This means that check-ins can be freely moved between branches |
| 332 | simply by altering their tags. |
| 333 | |
| 334 | Most other DVCSes maintain a separate DAG for each branch. |
| 335 |
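
The leaf, fork, and branch-point distinctions described in this file can be sketched in a few lines of Python. This is only an illustration, not how Fossil implements them: the `classify` function, the edge tuples, and the branch names are all hypothetical. The fork rule (two or more direct children in the same branch) is assumed by analogy with the branch-point definition quoted above. The first graph follows the description of Figure 3, assuming every check-in there is on "trunk"; the second is a made-up three-edge graph in which check-in 2 starts a "test" branch.

```python
from collections import defaultdict


def classify(branch_of, edges):
    """Return (leaves, forks, branch_points) as sorted lists of check-in ids.

    branch_of maps a check-in id to its branch name.
    edges is a list of (parent, child, is_merge) tuples.
    """
    children = defaultdict(list)
    for parent, child, is_merge in edges:
        children[parent].append((child, is_merge))

    leaves, forks, branch_points = [], [], []
    for ci in sorted(branch_of):
        kids = children[ci]
        # Leaf: no child of any kind (direct or merge) in the same branch.
        if not any(branch_of[c] == branch_of[ci] for c, _ in kids):
            leaves.append(ci)
        direct = [c for c, is_merge in kids if not is_merge]
        # Fork (assumed rule): two or more direct children in the same branch.
        if sum(branch_of[c] == branch_of[ci] for c in direct) >= 2:
            forks.append(ci)
        # Branch point: direct children spread over two or more branches.
        if len({branch_of[c] for c in direct}) >= 2:
            branch_points.append(ci)
    return leaves, forks, branch_points


# Figure 3 as described in the text: 5 is a direct child of 3
# and a merge child of 4; all check-ins assumed to be on trunk.
fig3_branches = {n: "trunk" for n in range(1, 6)}
fig3_edges = [
    (1, 2, False),
    (2, 3, False),
    (2, 4, False),
    (3, 5, False),
    (4, 5, True),
]

# Hypothetical graph: check-in 2 has one child on trunk and one on "test".
demo_branches = {1: "trunk", 2: "trunk", 3: "trunk", 4: "test"}
demo_edges = [(1, 2, False), (2, 3, False), (2, 4, False)]

print(classify(fig3_branches, fig3_edges))   # ([5], [2], [])
print(classify(demo_branches, demo_edges))   # ([3, 4], [], [2])
```

In the first graph, check-in 2 is reported as a fork and check-in 5 as the only leaf, which matches the text's statement that the merge brings the graph back to a single leaf. In the second, the same parent-with-two-children shape is reported as a branch point, because the children carry different branch names.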