Fossil SCM

Expanded on the discussion of forking vs branching in www/branching.wiki.

wyoung 2019-06-19 17:13 trunk
Commit b882c623868311dff805ecb8ea48bfc23d2b4e100aea29c83920fee207a5e17a
1 file changed +115 -52
--- www/branching.wiki
+++ www/branching.wiki
@@ -1,10 +1,10 @@
<title>Branching, Forking, Merging, and Tagging</title>
<h2>Background</h2>

In a simple and perfect world, the development of a project would proceed
-linearly, as shown in figure 1.
+linearly, as shown in Figure 1.

<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch01.svg"><br>
Figure 1
@@ -15,21 +15,20 @@
check-in numbers would be long hexadecimal hashes since it is not possible
to allocate collision-free sequential numbers in a distributed system.
But as sequential numbers are easier to read, we will substitute them for
the long hashes in this document.

-The arrows in figure 1 show the evolution of a project. The initial
+The arrows in Figure 1 show the evolution of a project. The initial
check-in is 1. Check-in 2 is derived from 1. In other words, check-in 2
was created by making edits to check-in 1 and then committing those edits.
We say that 2 is a <i>child</i> of 1
and that 1 is a <i>parent</i> of 2.
Check-in 3 is derived from check-in 2, making
3 a child of 2. We say that 3 is a <i>descendant</i> of both 1 and 2 and that 1
and 2 are both <i>ancestors</i> of 3.

-<a name="dag"></a>
-<h2>DAGs</h2>
+<h2 id="dag">DAGs</h2>

The graph of check-ins is a
[http://en.wikipedia.org/wiki/Directed_acyclic_graph | directed acyclic graph]
commonly shortened to <i>DAG</i>. Check-in 1 is the <i>root</i> of the DAG
since it has no ancestors. Check-in 4 is a <i>leaf</i> of the DAG since
@@ -36,50 +35,70 @@
it has no descendants. (We will give a more precise definition later of
"leaf.")

Alas, reality often interferes with the simple linear development of a
project. Suppose two programmers make independent modifications to check-in 2.
-After both changes are committed, the check-in graph looks like figure 2:
+After both changes are committed, the check-in graph looks like Figure 2:

<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch02.svg"><br>
Figure 2
</td></tr></table>

-The graph in figure 2 has two leaves: check-ins 3 and 4. Check-in 2 has
+The graph in Figure 2 has two leaves: check-ins 3 and 4. Check-in 2 has
two children, check-ins 3 and 4. We call this state a <i>fork</i>.

Fossil tries to prevent forks. Suppose two programmers named Alice and
Bob are each editing check-in 2 separately. Alice finishes her edits
first and commits her changes, resulting in check-in 3. Later, when Bob
-attempts to commit his changes, fossil verifies that check-in 2 is still
+attempts to commit his changes, Fossil verifies that check-in 2 is still
a leaf. Fossil sees that check-in 3 has occurred and aborts Bob's commit
attempt with a message "would fork." This allows Bob to do a "fossil
update" which pulls in Alice's changes, merging them into his own
changes. After merging, Bob commits check-in 4 as a child of check-in 3.
-The result is a linear graph as shown in figure 1. This is how CVS
-works. This is also how fossil works in [./concepts.wiki#workflow |
+The result is a linear graph as shown in Figure 1. This is how CVS
+works. This is also how Fossil works in [./concepts.wiki#workflow |
"autosync"] mode.

But perhaps Bob is off-network when he does his commit, so he
has no way of knowing that Alice has already committed her changes.
Or, it could be that Bob has turned off "autosync" mode in Fossil. Or,
maybe Bob just doesn't want to merge in Alice's changes before he has
saved his own, so he forces the commit to occur using the "--allow-fork"
-option to the fossil <b>commit</b> command. For any of these reasons,
+option to the <b>fossil commit</b> command. For any of these reasons,
two commits against check-in 2 have occurred and now the DAG has two leaves.

So which version of the project is the "latest" in the sense of having
the most features and the most bug fixes? When there is more than
-one leaf in the graph, you don't really know. So we like to have
-graphs with a single leaf.
+one leaf in the graph, you don't really know, so we like to have
+check-in graphs with a single leaf.
+
+Fossil resolves such problems using the check-in time on the leaves to
+decide which leaf to use as the parent of new leaves. When a branch is
+forked as in Figure 2, Fossil will choose check-in 4 as the parent for a
+later check-in 5, but <i>only</i> if it has sync'd that check-in down
+into the local repository. If autosync is disabled or the user is
+off-network when that fifth check-in occurs, so that check-in 3 is the
+latest on that branch at the time within that clone of the repository,
+Fossil will make check-in 3 the parent of check-in 5!
+
+Fossil also uses a forked branch's leaf check-in timestamps when
+checking out that branch: it gives you the fork with the latest
+check-in, which in turn selects which parent your next check-in will be
+a child of. This situation means development on that branch can fork
+into two independent lines of development, based solely on which branch
+tip is newer at the time the next user starts his work on it. Because
+of this, we strongly recommend that you do not intentionally create
+forks on branches with "--allow-fork" if that branch is used by many
+people over a long period of time. (Prime example: trunk.)

-To resolve this situation, Alice can use the fossil <b>merge</b> command
-to merge in Bob's changes in her local copy of check-in 3. Then she
-can commit the results as check-in 5. This results in a DAG as shown
-in figure 3.
+Let us return to Figure 2. To resolve such situations before they can
+become a real problem, Alice can use the <b>fossil merge</b> command to
+merge Bob's changes into her local copy of check-in 3. Then she can
+commit the results as check-in 5. This results in a DAG as shown in
+Figure 3.

<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch03.svg"><br>
Figure 3
@@ -87,38 +106,41 @@

Check-in 5 is a child of check-in 3 because it was created by editing
check-in 3. But check-in 5 also inherits the changes from check-in 4 by
virtue of the merge. So we say that check-in 5 is a <i>merge child</i>
of check-in 4 and that it is a <i>direct child</i> of check-in 3.
-The graph is now back to a single leaf (check-in 5).
+The graph is now back to a single leaf, check-in 5.

-We have already seen that if fossil is in autosync mode then Bob would
+We have already seen that if Fossil is in autosync mode then Bob would
have been warned about the potential fork the first time he tried to
commit check-in 4. If Bob had updated his local check-out to merge in
Alice's check-in 3 changes, then committed, then the fork would have
never occurred. The resulting graph would have been linear, as shown
-in figure 1. Really the graph of figure 1 is a subset of figure 3.
-Hold your hand over the check-in 4 circle of figure 3 and then figure
-3 looks exactly like figure 1 (except that the leaf has a different check-in
-number, but that is just a notational difference - the two check-ins have
-exactly the same content). In other words, figure 3 is really a superset
-of figure 1. The check-in 4 of figure 3 captures additional state which
-is omitted from figure 1. Check-in 4 of figure 3 holds a copy
-of Bob's local checkout before he merged in Alice's changes. That snapshot
-of Bob's changes, which is independent of Alice's changes, is omitted from figure 1.
-Some people say that the approach taken in figure 3 is better because it
-preserves this extra intermediate state. Others say that the approach
-taken in figure 1 is better because it is much easier to visualize a
-linear line of development and because the merging happens automatically
-instead of as a separate manual step. We will not take sides in that
-debate. We will simply point out that fossil enables you to do it either way.
-
-<h2>Forking Versus Branching</h2>
+in Figure 1.
+
+Realize that the graph of Figure 1 is a subset of Figure 3. Hold your
+hand over the check-in 4 circle of Figure 3 and then Figure 3 looks
+exactly like Figure 1, except that the leaf has a different check-in
+number, but that is just a notational difference — the two check-ins
+have exactly the same content. In other words, Figure 3 is really a
+superset of Figure 1. The check-in 4 of Figure 3 captures additional
+state which is omitted from Figure 1. Check-in 4 of Figure 3 holds a
+copy of Bob's local checkout before he merged in Alice's changes. That
+snapshot of Bob's changes, which is independent of Alice's changes, is
+omitted from Figure 1. Some people say that the approach taken in
+Figure 3 is better because it preserves this extra intermediate state.
+Others say that the approach taken in Figure 1 is better because it is
+much easier to visualize a linear line of development and because the
+merging happens automatically instead of as a separate manual step. We
+will not take sides in that debate. We will simply point out that
+Fossil enables you to do it either way.
+
+<h2 id="branching">The Alternative to Forking: Branching</h2>

Having more than one leaf in the check-in DAG is called a "fork." This
is usually undesirable and either avoided entirely,
-as in figure 1, or else quickly resolved as shown in figure 3.
+as in Figure 1, or else quickly resolved as shown in Figure 3.
But sometimes, one does want to have multiple leaves. For example, a project
might have one leaf that is the latest version of the project under
development and another leaf that is the latest version that has been
tested.
When multiple leaves are desirable, we call this <i>branching</i>
@@ -130,11 +152,11 @@
<tr><td align="center">
<img src="branch04.svg"><br>
Figure 4
</td></tr></table>

-The hypothetical scenario of figure 4 is this: The project starts and
+The hypothetical scenario of Figure 4 is this: The project starts and
progresses to a point where (at check-in 2)
it is ready to enter testing for its first release.
In a real project, of course, there might be hundreds or thousands of
check-ins before a project reaches this point, but for simplicity of
presentation we will say that the project is ready after check-in 2.
@@ -147,35 +169,76 @@
the bug fixes implemented by the testing team. So periodically, the
changes in the test branch are merged into the dev branch. This is
shown by the dashed merge arrows between check-ins 6 and 7 and between
check-ins 9 and 10.

-In both figures 2 and 4, check-in 2 has two children. In figure 2,
+In both Figures 2 and 4, check-in 2 has two children. In Figure 2,
we call this a "fork." In diagram 4, we call it a "branch." What is
-the difference? As far as the internal fossil data structures are
+the difference? As far as the internal Fossil data structures are
concerned, there is no difference. The distinction is in the intent.
-In figure 2, the fact that check-in 2 has multiple children is an
-accident that stems from concurrent development. In figure 4, giving
+In Figure 2, the fact that check-in 2 has multiple children is an
+accident that stems from concurrent development. In Figure 4, giving
check-in 2 multiple children is a deliberate act. So, to a good
approximation, we define forking to be by accident and branching to
be by intent. Apart from that, they are the same.

-<a name="tags"></a>
-<h2>Tags And Properties</h2>
+<h2 id="forking">Justifications For Forking</h2>
+
+The primary cases where forking is justified are all when it is done purely
+in software in order to avoid losing information:
+
+<ol>
+ <li><p>By Fossil itself when two users check in children to the same
+ leaf of a branch, as in Figure 2. If they're doing it because
+ autosync is disabled on one or both of the repositories, Fossil has
+ no way of knowing that it is creating a fork until the two
+ repositories are later sync'd manually.</p></li>
+
+ <li><p>By Fossil when the cloning hierarchy is more than 2 levels
+ deep. If your master repository is cloned by user A and then user B
+ clones from user A's repository, check-ins to user B's repo do not
+ check the master repo before allowing the check-in even with
+ autosync enabled. It isn't until user A syncs her repo with the
+ master repo that an inadvertent fork can be detected.
+ <br><br>
+ Because of this, we recommend that if you're using Fossil in a
+ distributed way like this, that check-ins be made only to the master
+ or its immediate child repos, and that those further down the chain
+ be read-only clones.</p></li>
+
+ <li><p>You've automated Fossil (e.g. with a shell script) and
+ forking is a possibility, so you add "--allow-fork" to your
+ "checkin" commands to prevent Fossil from refusing the check-in due
+ to the fork. It's better to write such a script to detect this
+ condition and cope with it (e.g. "fossil update") but if the
+ alternative is losing information, you may feel justified in
+ creating forks that an interactive user must later clean up with
+ "fossil merge" commands.</p></li>
+</ol>
+
+That leaves only one case where we can recommend use of "--allow-fork"
+by interactive users, while autosync is enabled: when you're working on
+a personal branch so that creating a dual-tipped branch isn't going to
+cause any other user an inconvenience or risk forking the development.
+This is a good alternative to branching when you just need to
+temporarily fork the branch's development.
+
+
+<h2 id="tags">Tags And Properties</h2>

-Tags and properties are used in fossil to help express the intent, and
+Tags and properties are used in Fossil to help express the intent, and
thus to distinguish between forks and branches. Figure 5 shows the
-same scenario as figure 4 but with tags and properties added:
+same scenario as Figure 4 but with tags and properties added:

<table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
<tr><td align="center">
<img src="branch05.svg"><br>
Figure 5
</td></tr></table>

A <i>tag</i> is a name that is attached to a check-in. A
-<i>property</i> is a name/value pair. Internally, fossil implements
+<i>property</i> is a name/value pair. Internally, Fossil implements
tags as properties with a NULL value. So, tags and properties really
are much the same thing, and henceforth we will use the word "tag"
to mean either a tag or a property.

A tag can be a one-time tag, a propagating tag or a cancellation tag.
@@ -188,11 +251,11 @@
is attached to a single check-in in order to either override a one-time
tag that was previously placed on that same check-in, or to block
tag propagation from an ancestor.

The initial check-in of every repository has two propagating tags. In
-figure 5, that initial check-in is check-in 1. The <b>branch</b> tag
+Figure 5, that initial check-in is check-in 1. The <b>branch</b> tag
tells (by its value) what branch the check-in is a member of.
The default branch is called "trunk." All tags that begin with "<b>sym-</b>"
are symbolic name tags. When a symbolic name tag is attached to a
check-in, that allows you to refer to that check-in by its symbolic
name rather than by its hexadecimal hash name. When a symbolic name
@@ -250,22 +313,22 @@
<dd><p>A branch point occurs when a check-in has two or more direct (non-merge)
children in different branches. A branch point is similar to a fork,
except that the children are in different branches.</p></dd>
</dl></blockquote>

-Check-in 4 of figure 3 is not a leaf because it has a child (check-in 5)
-in the same branch. Check-in 9 of figure 5 also has a child (check-in 10)
+Check-in 4 of Figure 3 is not a leaf because it has a child (check-in 5)
+in the same branch. Check-in 9 of Figure 5 also has a child (check-in 10)
but that child is in a different branch, so check-in 9 is a leaf. Because
of the <b>closed</b> tag on check-in 9, it is a closed leaf.

-Check-in 2 of figure 3 is considered a "fork"
-because it has two children in the same branch. Check-in 2 of figure 5
+Check-in 2 of Figure 3 is considered a "fork"
+because it has two children in the same branch. Check-in 2 of Figure 5
also has two children, but each child is in a different branch, hence in
-figure 5, check-in 2 is considered a "branch point."
+Figure 5, check-in 2 is considered a "branch point."

<h2>Differences With Other DVCSes</h2>

Fossil keeps all check-ins on a single DAG. Branches are identified with
tags. This means that check-ins can be freely moved between branches
simply by altering their tags.

Most other DVCSes maintain a separate DAG for each branch.
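
Reviewer's illustration (not part of the commit): the leaf, fork, and "newest tip" rules that the revised text describes can be modeled in a few lines of Python. This is only a sketch under stated assumptions, not Fossil's implementation. The names `Checkin`, `leaves`, `forks`, and `tip` are hypothetical, small integers stand in for both the check-in hashes and the check-in timestamps, and a leaf is taken to be a check-in with no child of any kind on its own branch, as the Figure 3 and Figure 5 examples in the diff suggest.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Checkin:
    num: int                        # sequential stand-in for the hash
    branch: str                     # value of the branch tag
    time: int                       # stand-in for the check-in timestamp
    parent: Optional[int] = None    # direct parent, None for the root
    merged: Tuple[int, ...] = ()    # merge parents, if any


def leaves(checkins, branch):
    """Check-ins on `branch` that have no child of any kind on that branch."""
    on_branch = {c.num for c in checkins if c.branch == branch}
    has_child = set()
    for c in checkins:
        if c.branch != branch:
            continue
        parents = c.merged if c.parent is None else (c.parent,) + c.merged
        has_child.update(parents)
    return sorted(on_branch - has_child)


def forks(checkins):
    """Check-ins with two or more direct children on their own branch."""
    branch_of = {c.num: c.branch for c in checkins}
    direct = {}
    for c in checkins:
        if c.parent is not None and branch_of.get(c.parent) == c.branch:
            direct[c.parent] = direct.get(c.parent, 0) + 1
    return sorted(n for n, count in direct.items() if count >= 2)


def tip(checkins, branch):
    """Newest leaf of `branch` among the check-ins this clone knows about."""
    when = {c.num: c.time for c in checkins}
    return max(leaves(checkins, branch), key=when.__getitem__)


# Figure 2: Alice (3) and Bob (4) both commit against check-in 2.
figure2 = [
    Checkin(1, "trunk", 1),
    Checkin(2, "trunk", 2, parent=1),
    Checkin(3, "trunk", 3, parent=2),
    Checkin(4, "trunk", 4, parent=2),
]
# Figure 3: check-in 5 is a direct child of 3 and a merge child of 4.
figure3 = figure2 + [Checkin(5, "trunk", 5, parent=3, merged=(4,))]
# A clone that has not yet sync'd check-in 4.
offline_clone = [c for c in figure2 if c.num != 4]

print(leaves(figure2, "trunk"), forks(figure2), tip(figure2, "trunk"))
print(tip(offline_clone, "trunk"))
print(leaves(figure3, "trunk"), forks(figure3))
```

In this model the fully sync'd Figure 2 graph picks check-in 4 as the parent of the next check-in, while the clone that lacks check-in 4 picks check-in 3. That is the hazard the new paragraphs warn about when forks are left unresolved on a shared branch.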
--- www/branching.wiki
+++ www/branching.wiki
@@ -1,10 +1,10 @@
1 <title>Branching, Forking, Merging, and Tagging</title>
2 <h2>Background</h2>
3
4 In a simple and perfect world, the development of a project would proceed
5 linearly, as shown in Figure 1.
6
7 <table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
8 <tr><td align="center">
9 <img src="branch01.svg"><br>
10 Figure 1
@@ -15,21 +15,20 @@
15 check-in numbers would be long hexadecimal hashes since it is not possible
16 to allocate collision-free sequential numbers in a distributed system.
17 But as sequential numbers are easier to read, we will substitute them for
18 the long hashes in this document.
19
20 The arrows in Figure 1 show the evolution of a project. The initial
21 check-in is 1. Check-in 2 is derived from 1. In other words, check-in 2
22 was created by making edits to check-in 1 and then committing those edits.
23 We say that 2 is a <i>child</i> of 1
24 and that 1 is a <i>parent</i> of 2.
25 Check-in 3 is derived from check-in 2, making
26 3 a child of 2. We say that 3 is a <i>descendant</i> of both 1 and 2 and that 1
27 and 2 are both <i>ancestors</i> of 3.
28
29 <h2 id="dag">DAGs</h2>
 
30
31 The graph of check-ins is a
32 [http://en.wikipedia.org/wiki/Directed_acyclic_graph | directed acyclic graph]
33 commonly shortened to <i>DAG</i>. Check-in 1 is the <i>root</i> of the DAG
34 since it has no ancestors. Check-in 4 is a <i>leaf</i> of the DAG since
@@ -36,50 +35,70 @@
35 it has no descendants. (We will give a more precise definition later of
36 "leaf.")
37
38 Alas, reality often interferes with the simple linear development of a
39 project. Suppose two programmers make independent modifications to check-in 2.
40 After both changes are committed, the check-in graph looks like Figure 2:
41
42 <table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
43 <tr><td align="center">
44 <img src="branch02.svg"><br>
45 Figure 2
46 </td></tr></table>
47
48 The graph in Figure 2 has two leaves: check-ins 3 and 4. Check-in 2 has
49 two children, check-ins 3 and 4. We call this state a <i>fork</i>.
50
51 Fossil tries to prevent forks. Suppose two programmers named Alice and
52 Bob are each editing check-in 2 separately. Alice finishes her edits
53 first and commits her changes, resulting in check-in 3. Later, when Bob
54 attempts to commit his changes, Fossil verifies that check-in 2 is still
55 a leaf. Fossil sees that check-in 3 has occurred and aborts Bob's commit
56 attempt with a message "would fork." This allows Bob to do a "fossil
57 update" which pulls in Alice's changes, merging them into his own
58 changes. After merging, Bob commits check-in 4 as a child of check-in 3.
59 The result is a linear graph as shown in Figure 1. This is how CVS
60 works. This is also how Fossil works in [./concepts.wiki#workflow |
61 "autosync"] mode.
62
63 But perhaps Bob is off-network when he does his commit, so he
64 has no way of knowing that Alice has already committed her changes.
65 Or, it could be that Bob has turned off "autosync" mode in Fossil. Or,
66 maybe Bob just doesn't want to merge in Alice's changes before he has
67 saved his own, so he forces the commit to occur using the "--allow-fork"
68 option to the <b>fossil commit</b> command. For any of these reasons,
69 two commits against check-in 2 have occurred and now the DAG has two leaves.
70
71 So which version of the project is the "latest" in the sense of having
72 the most features and the most bug fixes? When there is more than
73 one leaf in the graph, you don't really know, so we like to have
74 check-in graphs with a single leaf.
75
76 Fossil resolves such problems using the check-in time on the leaves to
77 decide which leaf to use as the parent of new leaves. When a branch is
78 forked as in Figure 2, Fossil will choose check-in 4 as the parent for a
79 later check-in 5, but <i>only</i> if it has sync'd that check-in down
80 into the local repository. If autosync is disabled or the user is
81 off-network when that fifth check-in occurs, so that check-in 3 is the
82 latest on that branch at the time within that clone of the repository,
83 Fossil will make check-in 3 the parent of check-in 5!
84
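The parent selection described above can be illustrated with a small sketch. It is a simplified model under stated assumptions, not Fossil's implementation: the <code>commit_time</code> values are made up, and <code>pick_parent</code> is a hypothetical helper that only considers the leaves a given clone knows about.

```python
# Made-up check-in times for the two leaves of Figure 2 (arbitrary units).
commit_time = {3: 100, 4: 200}

def pick_parent(local_leaves, commit_time):
    """Return the locally known leaf with the latest check-in time."""
    return max(local_leaves, key=lambda c: commit_time[c])

# A clone that has synced both leaves builds on check-in 4.
parent_when_synced = pick_parent({3, 4}, commit_time)

# A clone that never received check-in 4 builds on check-in 3 instead.
parent_when_offline = pick_parent({3}, commit_time)
```

The two clones pick different parents for the same logical "next check-in", which is how a fork can persist and grow.
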
85 Fossil also uses a forked branch's leaf check-in timestamps when
86 checking out that branch: it gives you the fork with the latest
87 check-in, which in turn selects which parent your next check-in will be
88 a child of. This situation means development on that branch can fork
89 into two independent lines of development, based solely on which branch
90 tip is newer at the time the next user starts his work on it. Because
91 of this, we strongly recommend that you do not intentionally create
92 forks on branches with "--allow-fork" if that branch is used by many
93 people over a long period of time. (Prime example: trunk.)
94
95 Let us return to Figure 2. To resolve such situations before they can
96 become a real problem, Alice can use the <b>fossil merge</b> command to
97 merge Bob's changes into her local copy of check-in 3. Then she can
98 commit the results as check-in 5. This results in a DAG as shown in
99 Figure 3.
100
101 <table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
102 <tr><td align="center">
103 <img src="branch03.svg"><br>
104 Figure 3
@@ -87,38 +106,41 @@
106
107 Check-in 5 is a child of check-in 3 because it was created by editing
108 check-in 3. But check-in 5 also inherits the changes from check-in 4 by
109 virtue of the merge. So we say that check-in 5 is a <i>merge child</i>
110 of check-in 4 and that it is a <i>direct child</i> of check-in 3.
111 The graph is now back to a single leaf, check-in 5.
112
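One way to picture the direct child versus merge child distinction is to record an ordered list of parents per check-in. The sketch below assumes, purely as a convention of this example, that the first entry is the parent that was edited and any further entries are merged-in parents; the names are hypothetical.

```python
# Toy model of Figure 3. Convention of this sketch: the first parent is the
# check-in that was edited; any later parents were merged in.
fig3_parents = {1: [], 2: [1], 3: [2], 4: [2], 5: [3, 4]}

def direct_children(parents, checkin):
    """Check-ins created by editing checkin."""
    return sorted(c for c, ps in parents.items() if ps and ps[0] == checkin)

def merge_children(parents, checkin):
    """Check-ins that inherit from checkin only through a merge."""
    return sorted(c for c, ps in parents.items() if checkin in ps[1:])
```

Here <code>direct_children(fig3_parents, 3)</code> and <code>merge_children(fig3_parents, 4)</code> both yield check-in 5, while check-in 4 has no direct children at all.
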
113 We have already seen that if Fossil is in autosync mode then Bob would
114 have been warned about the potential fork the first time he tried to
115 commit check-in 4. If Bob had updated his local check-out to merge in
116 Alice's check-in 3 changes and then committed, the fork would never
117 have occurred. The resulting graph would have been linear, as shown
118 in Figure 1.
119
120 Realize that the graph of Figure 1 is a subset of Figure 3. Hold your
121 hand over the check-in 4 circle of Figure 3 and Figure 3 looks
122 exactly like Figure 1, except that the leaf has a different check-in
123 number. That is just a notational difference: the two check-ins
124 have exactly the same content. In other words, Figure 3 is really a
125 superset of Figure 1. The check-in 4 of Figure 3 captures additional
126 state which is omitted from Figure 1. Check-in 4 of Figure 3 holds a
127 copy of Bob's local checkout before he merged in Alice's changes. That
128 snapshot of Bob's changes, which is independent of Alice's changes, is
129 omitted from Figure 1. Some people say that the approach taken in
130 Figure 3 is better because it preserves this extra intermediate state.
131 Others say that the approach taken in Figure 1 is better because it is
132 much easier to visualize a linear line of development and because the
133 merging happens automatically instead of as a separate manual step. We
134 will not take sides in that debate. We will simply point out that
135 Fossil enables you to do it either way.
136
137 <h2 id="branching">The Alternative to Forking: Branching</h2>
138
139 Having more than one leaf in the check-in DAG is called a "fork." This
140 is usually undesirable and either avoided entirely,
141 as in Figure 1, or else quickly resolved as shown in Figure 3.
142 But sometimes, one does want to have multiple leaves. For example, a project
143 might have one leaf that is the latest version of the project under
144 development and another leaf that is the latest version that has been
145 tested.
146 When multiple leaves are desirable, we call this <i>branching</i>
@@ -130,11 +152,11 @@
152 <tr><td align="center">
153 <img src="branch04.svg"><br>
154 Figure 4
155 </td></tr></table>
156
157 The hypothetical scenario of Figure 4 is this: The project starts and
158 progresses to a point where (at check-in 2)
159 it is ready to enter testing for its first release.
160 In a real project, of course, there might be hundreds or thousands of
161 check-ins before a project reaches this point, but for simplicity of
162 presentation we will say that the project is ready after check-in 2.
@@ -147,35 +169,76 @@
169 the bug fixes implemented by the testing team. So periodically, the
170 changes in the test branch are merged into the dev branch. This is
171 shown by the dashed merge arrows between check-ins 6 and 7 and between
172 check-ins 9 and 10.
173
174 In both Figures 2 and 4, check-in 2 has two children. In Figure 2,
175 we call this a "fork." In Figure 4, we call it a "branch." What is
176 the difference? As far as the internal Fossil data structures are
177 concerned, there is no difference. The distinction is in the intent.
178 In Figure 2, the fact that check-in 2 has multiple children is an
179 accident that stems from concurrent development. In Figure 4, giving
180 check-in 2 multiple children is a deliberate act. So, to a good
181 approximation, we define forking to be by accident and branching to
182 be by intent. Apart from that, they are the same.
183
184 <h2 id="forking">Justifications For Forking</h2>
185
186 Forking is justified primarily when it is done purely in software,
187 in order to avoid losing information:
188
189 <ol>
190 <li><p>By Fossil itself when two users check in children to the same
191 leaf of a branch, as in Figure 2. If they're doing it because
192 autosync is disabled on one or both of the repositories, Fossil has
193 no way of knowing that it is creating a fork until the two
194 repositories are later sync'd manually.</p></li>
195
196 <li><p>By Fossil when the cloning hierarchy is more than 2 levels
197 deep. If your master repository is cloned by user A and then user B
198 clones from user A's repository, check-ins to user B's repo do not
199 check the master repo before allowing the check-in even with
200 autosync enabled. It isn't until user A syncs her repo with the
201 master repo that an inadvertent fork can be detected.
202 <br><br>
203 Because of this, if you're using Fossil in a distributed way like
204 this, we recommend that check-ins be made only to the master or its
205 immediate child repos, and that those further down the chain
206 be read-only clones.</p></li>
207
208 <li><p>You've automated Fossil (e.g. with a shell script) and
209 forking is a possibility, so you add "--allow-fork" to your
210 "checkin" commands to prevent Fossil from refusing the check-in due
211 to the fork. It's better to write such a script to detect this
212 condition and cope with it (e.g. "fossil update") but if the
213 alternative is losing information, you may feel justified in
214 creating forks that an interactive user must later clean up with
215 "fossil merge" commands.</p></li>
216 </ol>
217
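For the scripted case, a script can try to detect the condition and cope with it before resorting to "--allow-fork". The following is a minimal sketch, not a recommended implementation. It assumes that a refused commit exits with a nonzero status and that its output contains the text "would fork", and that a nonzero exit from "fossil update" means the update failed. The helper names <code>run_fossil</code> and <code>commit_without_forking</code> are hypothetical.

```python
import subprocess

def run_fossil(args):
    """Run a fossil subcommand; return (exit_code, combined_output)."""
    proc = subprocess.run(["fossil", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr

def commit_without_forking(message, run=run_fossil):
    """Try to commit; on a would-fork refusal, update once and retry."""
    code, out = run(["commit", "-m", message])
    if code == 0:
        return "committed"
    # Assumption: the refusal is recognizable by this text in the output.
    if "would fork" not in out:
        return "failed"
    code, _ = run(["update"])
    if code != 0:
        return "failed"
    code, _ = run(["commit", "-m", message])
    return "committed-after-update" if code == 0 else "failed"
```

A caller that gets "failed" back can then decide whether losing the check-in is worse than creating a fork. Passing a different <code>run</code> function makes the logic testable without a repository.
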
218 That leaves only one case where we can recommend use of "--allow-fork"
219 by interactive users, while autosync is enabled: when you're working on
220 a personal branch so that creating a dual-tipped branch isn't going to
221 cause any other user an inconvenience or risk forking the development.
222 This is a good alternative to branching when you just need to
223 temporarily fork the branch's development.
224
225
226 <h2 id="tags">Tags And Properties</h2>
227
228 Tags and properties are used in Fossil to help express the intent, and
229 thus to distinguish between forks and branches. Figure 5 shows the
230 same scenario as Figure 4 but with tags and properties added:
231
232 <table border=1 cellpadding=10 hspace=10 vspace=10 align="center">
233 <tr><td align="center">
234 <img src="branch05.svg"><br>
235 Figure 5
236 </td></tr></table>
237
238 A <i>tag</i> is a name that is attached to a check-in. A
239 <i>property</i> is a name/value pair. Internally, Fossil implements
240 tags as properties with a NULL value. So, tags and properties really
241 are much the same thing, and henceforth we will use the word "tag"
242 to mean either a tag or a property.
243
244 A tag can be a one-time tag, a propagating tag or a cancellation tag.
@@ -188,11 +251,11 @@
251 is attached to a single check-in in order to either override a one-time
252 tag that was previously placed on that same check-in, or to block
253 tag propagation from an ancestor.
254
255 The initial check-in of every repository has two propagating tags. In
256 Figure 5, that initial check-in is check-in 1. The <b>branch</b> tag
257 tells (by its value) what branch the check-in is a member of.
258 The default branch is called "trunk." All tags that begin with "<b>sym-</b>"
259 are symbolic name tags. When a symbolic name tag is attached to a
260 check-in, that allows you to refer to that check-in by its symbolic
261 name rather than by its hexadecimal hash name. When a symbolic name
@@ -250,22 +313,22 @@
313 <dd><p>A branch point occurs when a check-in has two or more direct (non-merge)
314 children in different branches. A branch point is similar to a fork,
315 except that the children are in different branches.</p></dd>
316 </dl></blockquote>
317
318 Check-in 4 of Figure 3 is not a leaf because it has a child (check-in 5)
319 in the same branch. Check-in 9 of Figure 5 also has a child (check-in 10)
320 but that child is in a different branch, so check-in 9 is a leaf. Because
321 of the <b>closed</b> tag on check-in 9, it is a closed leaf.
322
323 Check-in 2 of Figure 3 is considered a "fork"
324 because it has two children in the same branch. Check-in 2 of Figure 5
325 also has two children, but each child is in a different branch, hence in
326 Figure 5, check-in 2 is considered a "branch point."
327
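The distinctions drawn above between fork, branch point, and leaf can be expressed as small predicates. This is a sketch under assumptions: the first listed parent is treated as the direct parent, the <code>branch</code> mapping stands in for the branch tag, and the second graph is a made-up two-branch example, not a reproduction of Figure 5.

```python
from collections import Counter

def direct_children(parents, checkin):
    """Children whose first (direct) parent is checkin."""
    return [c for c, ps in parents.items() if ps and ps[0] == checkin]

def is_fork(parents, branch, checkin):
    """Two or more direct children share a branch."""
    counts = Counter(branch[c] for c in direct_children(parents, checkin))
    return any(n >= 2 for n in counts.values())

def is_branch_point(parents, branch, checkin):
    """Direct children fall in two or more different branches."""
    return len({branch[c] for c in direct_children(parents, checkin)}) >= 2

def is_leaf(parents, branch, checkin):
    """No child of any kind in the same branch as checkin."""
    return not any(checkin in ps and branch[c] == branch[checkin]
                   for c, ps in parents.items())

# Figure 3: every check-in is on one branch.
fig3 = {1: [], 2: [1], 3: [2], 4: [2], 5: [3, 4]}
fig3_branch = {c: "trunk" for c in fig3}

# Hypothetical graph with the same shape, but check-in 4 is on its own branch.
two_branch = {1: [], 2: [1], 3: [2], 4: [2], 5: [3, 4]}
two_branch_names = {1: "trunk", 2: "trunk", 3: "trunk", 4: "test", 5: "trunk"}
```

In the Figure 3 model, check-in 2 is a fork and check-in 4 is not a leaf. In the two-branch variant, the same check-in 2 is a branch point, and check-in 4 is a leaf because its only child is in a different branch.
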
328 <h2>Differences With Other DVCSes</h2>
329
330 Fossil keeps all check-ins on a single DAG. Branches are identified with
331 tags. This means that check-ins can be freely moved between branches
332 simply by altering their tags.
333
334 Most other DVCSes maintain a separate DAG for each branch.
335
