# Fossil and the CAP Theorem

[The CAP theorem][cap] is a fundamental mathematical proof about
distributed systems. A software system can no more get around it than a
physical system can get past *c*, the [speed of light][sol] constant.

Fossil is a distributed system, so it can be useful to think about it in
terms of the CAP theorem. We won’t discuss the theorem itself or how you
reason using its results here. For that, we recommend [this article][tut].

[cap]: https://en.wikipedia.org/wiki/CAP_theorem
[sol]: https://en.wikipedia.org/wiki/Speed_of_light
[tut]: https://www.ibm.com/cloud/learn/cap-theorem


<a id="ap"></a>
## Fossil Is an AP-Mode System

As with all common [DVCSes][dvcs], Fossil is an AP-mode system, meaning
that your local clone isn’t necessarily consistent with all other clones
(C), but the system is always available for use (A) and
partition-tolerant (P). This is what allows you to turn off Fossil’s
autosync mode, go off-network, and continue working with Fossil, even
though only a single node (your local repo clone) is accessible at the
time.
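
The AP trade-off can be sketched as a toy model. Everything here (the
`Node` class, the `sync` function) is invented for illustration and is
not Fossil’s actual sync protocol:

```python
class Node:
    """A toy model of one Fossil clone: a set of known commit IDs."""
    def __init__(self, name):
        self.name = name
        self.commits = set()
        self.online = False

    def commit(self, artifact):
        # Availability (A): a local commit always succeeds,
        # whether or not any other node is reachable.
        self.commits.add(artifact)

def sync(a, b):
    # Two clones converge only when both are reachable; until then
    # they may disagree, which is the "C" an AP system gives up.
    if a.online and b.online:
        merged = a.commits | b.commits
        a.commits, b.commits = set(merged), set(merged)

server, laptop = Node("central"), Node("laptop")
laptop.commit("abc123")                 # off-network: commit still works
assert "abc123" not in server.commits   # ...but the clones now disagree
laptop.online = server.online = True
sync(laptop, server)                    # back online: pairwise agreement
```

Note that even after this sync, a third clone that stayed offline still
disagrees with both nodes: reconnecting restores consistency only
pairwise, as described below.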

You may consider that going back online restores “C”, because upon sync,
you’re now consistent with the repo you cloned from. But if another user
has gone offline in the meantime and made commits to their disconnected
repo, *you* aren’t consistent with *them.* Moreover, if another user
commits to the central repo, that doesn’t push the change down to you
automatically: even if all users of a Fossil system are online at the
same instant, and they’re all using autosync, Fossil doesn’t guarantee
consistency across the network.

There’s no getting around the CAP theorem!

[dvcs]: https://en.wikipedia.org/wiki/Distributed_version_control


<a id="ca"></a>
## CA-Mode Fossil

What would it mean to redesign Fossil to be CA-mode?

It means we get a system that is always consistent (C) and available (A)
as long as there are no partitions (P).

That’s basically [CVS] and [Subversion][svn]: you can only continue
working with the repository itself as long as your connection to the
central repo server functions.

It’s rather trivial to talk about single-point-of-failure systems like
CVS or Subversion as CA-mode. Another common example used this way is a
classical RDBMS, but aren’t we here to talk about distributed systems?
What’s a good example of a *distributed* CA-mode system?

A better example is [Kafka], which in its default configuration assumes
it is being run on a corporate LAN in a single data center, so network
partitions are exceedingly rare. It therefore sacrifices partition
tolerance to get the advantages of CA-mode operation. In its particular
application of this mode, a message isn’t “committed” until all running
brokers have a copy of it, at which point the message becomes visible to
the client(s). In that way, all clients always see the same message
store as long as all of the Kafka servers are up and communicating.
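
That “all running brokers” rule can be sketched as follows. This is a
deliberately simplified model of CA-style replication, not Kafka’s
actual protocol; the class and method names are invented:

```python
class Broker:
    """A toy replica: just an append-only log and an up/down flag."""
    def __init__(self):
        self.log = []
        self.up = True

class Cluster:
    def __init__(self, brokers):
        self.brokers = brokers
        self.committed = []   # messages clients are allowed to read

    def publish(self, msg):
        # CA-style rule: every replica must take the write, or the
        # write fails outright. There is no partition tolerance here.
        if not all(b.up for b in self.brokers):
            raise RuntimeError("a replica is down; cannot commit")
        for b in self.brokers:
            b.log.append(msg)
        self.committed.append(msg)  # now visible, identically, everywhere

cluster = Cluster([Broker() for _ in range(3)])
cluster.publish("m1")            # all replicas up: commit succeeds
cluster.brokers[2].up = False
try:
    cluster.publish("m2")        # one replica down: the write is refused
except RuntimeError:
    pass
```

Every reader sees the same `committed` list (C), and writes succeed
whenever the cluster is healthy (A), but a single unreachable replica
stops all writes: the P in CAP is what got sacrificed.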

How would that work in Fossil terms?

If there is only one central server and I clone it on my local laptop,
then CA mode means I can only commit if the remote Fossil is available,
so in that sense, it devolves to the old CVS model.

What if there are three clones? Perhaps there is a central server *A*,
the clone *B* on my laptop, and the clone *C* on your laptop. Doesn’t CA
mode now mean that my commit on *B* doesn’t exist after I commit it to
the central repo *A* until you, my coworker, *also* pull down the copy
of that commit to your laptop *C*, validating the commit through the
network?

That’s one way to design the system, but another way would be to scope
the system to only talk about proper *servers*, not about the clients.
In that model, a CA-mode Fossil alternative might require 2+ servers to
be running for proper replication. When I make a commit, if all of the
configured servers aren’t online, I can’t commit. This is basically CVS
with replication, but without any useful amount of failover.

[CVS]: https://en.wikipedia.org/wiki/Concurrent_Versions_System
[Kafka]: https://engineering.linkedin.com/kafka/intra-cluster-replication-apache-kafka
[svn]: https://en.wikipedia.org/wiki/Apache_Subversion


<a id="cp"></a>
## CP-Mode Fossil

What if we modify our CA-mode system above with “warm spares”? We can
say that commits must go to all of the spares as well as the active
servers, but a loss of one active server requires that one warm spare
come into active state, and all of the clients learn that the spare is
now considered “active.” At this point, you have a CP-mode system, not a
CA-mode system, because it’s now partition-tolerant (P) but it becomes
unavailable when there aren’t enough active servers or warm spares to
promote to active status.

CP is your classical [BFT]-style distributed consensus system, where the
system is available only if the client can contact a *majority* of the
servers. This is a formalization of the warm spare concept above: with
*N* server nodes, you need at least ⌊*N* / 2⌋ + 1 of them to be online
for a commit to succeed.
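
That majority rule is simple to express in code. A minimal sketch (the
function names here are ours, not part of Fossil or of any consensus
library):

```python
def quorum(n):
    # Majority quorum for n server nodes: floor(n/2) + 1.
    return n // 2 + 1

def can_commit(total_servers, reachable_servers):
    # A client may commit only if it can reach a majority of servers.
    return reachable_servers >= quorum(total_servers)

# 1 server: quorum is 1 (CVS-like: that one server must be up).
# 2 servers: quorum is 2 (no failover: both must be up).
# 3 servers: quorum is 2 (one server may fail or be partitioned away).
sizes = {n: quorum(n) for n in (1, 2, 3, 4, 5)}
```

Note that the marginal value of an even-numbered server is nil: going
from 3 servers to 4 raises the quorum from 2 to 3 without letting the
system tolerate any additional failures.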

Many distributed database systems run in CP mode because consistency (C)
and partition-tolerance (P) is a useful combination. What you lose is
always-available (A) operation: with a suitably bad partition, the
system goes down for users on the small side of that partition.

An optional CP mode for Fossil would be attractive in some ways since in
some sense Fossil is a distributed DBMS, but in practical terms, it
means Fossil would then not be a [DVCS] in the most useful sense of that
term: the ability to keep working while your client is disconnected from
the remote Fossil it cloned from.

A fraught question is whether the non-server Fossil clones count as
“nodes” in this sense.

If they do count, then if there are only two systems, the central server
and the clone on my laptop, it stands to reason from the formula above
that I can only commit if the central server is available. In that
scheme, a CP-mode Fossil is basically like CVS.

But what happens if my company hires a coworker to help me with the
project, and this person makes their own clone of the central repo? The
equation says I still need 2 nodes to be available for a commit, so if
my new coworker goes off-network, that doesn’t affect whether I can make
commits. Likewise, if I go off-network, my coworker can make commits to
the central server.

But what happens if the central server goes down? The equation says we
still have 2 nodes, so we should be able to commit, right? Sure, but
only if my laptop can communicate directly with my coworker’s laptop! If
it can’t, that’s also a network partition, so *N=1* on both sides in
that case. The implication is that for a true CP-mode Fossil, we’d need
some kind of peer-to-peer networking layer so that our laptops can
accept commits from each other, and so that when the central server
comes back online, one of us can send the results up to it to get it
caught up.

But doesn’t that then mean there is no security? How does [Fossil’s RBAC
system][caps] work if peer-to-peer commits are allowed?

You can instead reconceptualize the system as “node” meaning only server
nodes, so that client-only systems don’t count. This allows you to have
an RBAC system again.

With just one central server, ⌊1/2⌋ + 1 = 1, so you get CVS-like
behavior: if the server’s up, you can commit.

If you set up 2 servers for redundancy, both must be up for commits to
be allowed, since otherwise you could end up with half the commits going
to the server on one side of a network partition, half going to the
other, and no way to arbitrate between the two once the partition is
lifted.

(Today’s AP-mode Fossil has this capability, but the necessary cost is
“C”, consistency! Once again, you can’t get around the CAP theorem.)

3 servers is more sensible: any client that can see at least 2 of them
can commit.

Will there ever be a CP-mode Fossil? This author doubts it, but as shown
above, it would be useful in contexts where you’d rather have a
guarantee of consistency than of availability.

[BFT]: https://en.wikipedia.org/wiki/Byzantine_fault
[caps]: ./caps/