# Backing Up a Remote Fossil Repository

One of the great benefits of Fossil and other [distributed version control systems][dvcs]
is that cloning a repository makes a backup. If you are running a project with multiple
developers who share their work using a [central server][server] and the server hardware
catches fire, the clones of the repository on each developer
workstation *may* serve as a suitable backup.

[dvcs]: wikipedia:/wiki/Distributed_version_control
[server]: ./server/whyuseaserver.wiki

We say “may” because
it turns out not everything in a Fossil repository is copied when cloning. You
don’t even always get copies of all historical file artifacts. More than
that, a Fossil repository typically contains
other useful information that is not always shared as part of a clone, which might need
to be backed up separately. To wit:


## <a id="pii"></a> Sensitive Information

Fossil purposefully does not clone certain sensitive information unless
you’re logged in as a user with [Setup] capability. As an example, a local clone
may have a different `user` table than the remote, because only a
Setup user is allowed to see the full version for privacy and security
reasons.


## <a id="config"></a> Configuration Drift

Fossil allows the local configuration to differ in several areas from
that of the remote. You get a copy
of *some* of these configuration areas on initial clone — not all! — but after that,
remote configuration changes mostly do not sync down automatically.


#### <a id="skin"></a> Skin

Changes to the remote’s skin don’t sync down, on purpose, since you may
want to have a different skin on the local clone than on the remote. You
can ask for updates with [`fossil config pull skin`][cfg], but that does
not happen automatically during the course of normal development.


#### <a id="alerts"></a> Email Alerts

The Admin → Notification settings do not get copied on clone or sync,
and it is not possible to push such settings from one repository to
another. We did this on purpose because you may have a network of peer
repositories, and you only want one repository sending email alerts. If
Fossil were to automatically replicate the email alert settings to a
separate repository, subscribers would get multiple alerts for each
event, which would be *bad.*

The only element of the email alert configuration that can be pulled
over the sync protocol on demand is the subscriber list, via
[`fossil config pull subscriber`][cfg].


#### <a id="project"></a> Project Configuration

This is normally generated once during `fossil init` and never changed,
so Fossil doesn’t pull this information without being forced, on
purpose. You could accidentally merge two separate Fossil repos by
pushing one repo’s project config up to another, for example.


#### <a id="other-cfg"></a> Others

A repo’s URL aliases, [interwiki configuration](./interwiki.md), and
[ticket customizations](./custom_ticket.wiki) also do not normally sync.

[cfg]: /help/configuration


## <a id="private"></a> Private Branches

The very nature of Fossil’s [private branch feature][pbr] ensures that
remote clones don’t get a copy of those branches. Normally this is
exactly what you want, but in the case of making backups, you probably
want to back up these branches as well. One of the two backup methods below
provides this.


## <a id="shun"></a> Shunned Artifacts

Fossil purposefully doesn’t sync [shunned artifacts][shun]. If you want
your local clone to be a precise match to the remote, it needs to track
changes to the shun table as well.


## <a id="uv"></a> Unversioned Artifacts

Data in Fossil’s [unversioned artifacts table][uv] doesn’t sync down
unless you specifically ask for it. Like local configuration
data, it doesn’t get pulled as part of a normal `fossil sync`, and
*unlike* the config data, you don’t get unversioned files as part of the
initial clone unless you ask for them by passing the `--unversioned/-u`
flag.
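
As a sketch of what asking for them looks like, assuming a hypothetical
clone URL and repository path, and assuming `fossil uv sync` is given the
usual `-R` repository option:

```shell
# Hypothetical URL and paths. -u/--unversioned brings unversioned files
# down at clone time; "fossil uv sync" keeps them current afterward.
fossil clone -u https://example.com/repo ~/museum/repo.fossil
fossil uv sync -R ~/museum/repo.fossil
```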


## <a id="ait"></a> Autosync Is Intransitive

If you’re using Fossil in a truly distributed mode, rather than the
simple central-and-clones model that is more common, there may be no
single source of truth in the network because Fossil’s autosync feature
isn’t transitive.

That is, if you clone from server A and then stand that clone up as a
server B, and I then clone from your server as my repository C, your changes to B
autosync up to A, but not down to me on C until I do something locally
that triggers autosync. The inverse is also true: if I commit something
on C, it will autosync up to B, but A won’t get a copy until someone on
B does something to trigger a sync there.

An easy way to run into this problem is to set up failover servers
`svr1` thru `svr3.example.com`, then set `svr2` and `svr3` up to sync
with the first. If all of the users normally clone from `svr1`, their
commits don’t get to `svr2` and `svr3` until something on one of the
servers pushes or pulls the changes down to the next server in the sync
chain.

Likewise, if `svr1` falls over and all of the users re-point their local
clones at `svr2`, and `svr1` later reappears, `svr1` is likely to
remain a stale copy of the old version of the repository until someone
causes it to sync with `svr2` or `svr3` to catch up again. And if
you originally designed the sync scheme to treat `svr1` as the primary
source of truth, those users still syncing with `svr2` won’t have their
commits pushed up to `svr1` unless you’ve set up bidirectional sync
rather than having the two backup servers do `pull` only.
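
One hedge against this is to run an explicit sync periodically on the
backup servers themselves, so that no user activity is needed to keep
them current. A minimal sketch, with an illustrative repository path and
URL rather than anything prescribed by this document:

```shell
#!/bin/sh
# Run from cron (or a timer) on svr2 and svr3 so they stay current with
# svr1 even when nothing else triggers autosync.
fossil sync -R /home/fossil/museum/repo.fossil https://svr1.example.com/
```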


# <a id="sync-solution"></a> Solution 1: Explicit Pulls

The following script solves most of the above problems for the use case
where you want a *nearly-complete* clone of the remote repository using nothing
but the normal Fossil sync protocol. It only does so if you are logged into
the remote as a user with Setup capability, however.

``` shell
#!/bin/sh
fossil sync --unversioned        # sync commits plus unversioned artifacts
fossil configuration pull all    # pull skin, project config, shun list, etc.
fossil rebuild                   # apply the pulled shun list locally
```

The last step is needed to ensure that shunned artifacts on the remote
are removed from the local clone. The second step includes
`fossil conf pull shun`, but until those artifacts are actually rebuilt
out of existence, your backup will be “more than complete” in the sense
that it will continue to have information that the remote says should
not exist any more. That would be not so much a “backup” as an
“archive,” which might not be what you want.
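
If you want this to happen unattended, one possible variant operates on a
stored clone via `-R` instead of an open checkout, making it suitable for
a nightly cron job. The repository path here is illustrative, and it
assumes the clone already remembers a remote URL with Setup-capable
stored credentials:

```shell
#!/bin/sh
# Unattended sketch of the same three steps against a stored clone.
repo=$HOME/museum/backup.fossil
fossil sync --unversioned -R "$repo"
fossil configuration pull all -R "$repo"
fossil rebuild "$repo"
```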


# <a id="sql-solution"></a> Solution 2: SQL-Level Backup

The first method doesn’t get you a copy of the remote’s
[private branches][pbr], on purpose. It may also miss other info on the
remote, such as SQL-level customizations that the sync protocol can’t
see. (Some [ticket system customization][tkt] schemes rely on this ability, for example.)
You can solve such problems if you have access to the remote server:
[the `backup` command][bu] handles locking and transaction isolation for
you, so you can safely take a SQL-level backup of an in-use repository.

If you have SSH access to the remote server, something like this will work:

``` shell
#!/bin/bash
bf=repo-$(date +%Y-%m-%d).fossil
ssh example.com "cd museum ; fossil backup -R repo.fossil backups/$bf" &&
scp example.com:museum/backups/$bf ~/museum/backups
```

Beware that this method does not solve [the intransitive sync
problem](#ait), in and of itself: if you do a SQL-level backup of a
stale repo DB, you have a *stale backup!* You should therefore run this
on every node that may need to serve as a backup so that at least *one*
of the backups is also up-to-date.
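
One way to act on that advice is to take the same backup from every such
server in one pass. A sketch, where the hostnames and paths are examples
only, not part of any scheme described above:

```shell
#!/bin/bash
# Pull a same-day SQL-level backup from each server that might need to
# serve as a backup source. Hostnames and paths are examples only.
bf=repo-$(date +%Y-%m-%d).fossil
for host in svr1.example.com svr2.example.com svr3.example.com ; do
    ssh "$host" "cd museum ; fossil backup -R repo.fossil backups/$bf" &&
        scp "$host:museum/backups/$bf" ~/museum/backups/"$host-$bf"
done
```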


# <a id="enc"></a> Encrypted Off-Site Backups

A useful refinement that you can apply to both methods above is
encrypted off-site backups. You may wish to store backups of your
repositories off-site on a service such as Dropbox, Google Drive, iCloud,
or Microsoft OneDrive, where you don’t fully trust the service not to
leak your information. This addition to the prior scripts will encrypt
the resulting backup in such a way that the cloud copy is a useless blob
of noise to anyone without the key:

```shell
iter=152830
pass="h8TixP6Mt6edJ3d6COaexiiFlvAM54auF2AjT7ZYYn"
gd="$HOME/Google Drive/Fossil Backups/$bf.xz.enc"
fossil sql -R ~/museum/backups/"$bf" .dump | xz -9 |
openssl enc -e -aes-256-cbc -pbkdf2 -iter $iter -pass pass:"$pass" -out "$gd"
```

If you’re adding this to the first script above, remove the
“`-R repo-name`” bit so you get a dump of the repository backing the
current working directory.

Change the `pass` value to some other long random string, and change the
`iter` value to something in the hundreds of thousands range. A good source for
the first is [here][grcp], and for the second, [here][rint].
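
If you would rather generate both values locally than fetch them from a
web service, something along these lines works; it assumes `openssl` and
the usual POSIX tools are available:

```shell
# 42 random characters drawn from OpenSSL’s CSPRNG, filtered to a
# roughly 6-bits-per-character alphabet.
pass=$(openssl rand -base64 64 | tr -dc 'A-Za-z0-9' | head -c 42)
# An iteration count in the 100,000–999,999 range.
iter=$(( 100000 + $(od -An -N4 -tu4 /dev/urandom) % 900000 ))
echo "$pass $iter"
```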

You may find posts online written by people recommending millions of
iterations for PBKDF2, but they’re generally talking about this in the
context of memorizable passwords, where adding even one more character
to the password is a significant burden. Given our script’s purely
random maximum-length passphrase, there isn’t much more that increasing
the key derivation iteration count can do for us.

Conversely, if you were to reduce the passphrase to 41 characters, that
would drop the key strength by roughly 2⁶, the entropy value per
character when using most of printable ASCII in a passphrase. To make
up that lost strength on the PBKDF2 end, you’d have to multiply your
iteration count by 2⁶ = 64. It’s easier to use a max-length passphrase
in this situation than to get crazy with key derivation iteration counts.

(This, by the way, is why the example passphrase above is 42 characters:
with 6 bits of entropy per character, that gives you a key size of 252
bits, as close as we can get to our chosen encryption algorithm’s
256-bit key size without going over. If it pleases you to give it 43
random characters for a passphrase in order to pick up those last four
bits of security, you’re welcome to do so.)

Compressing the data before encrypting it removes redundancies that can
make decryption easier, and it results in a smaller backup than you get
with the previous script alone, at the expense of a lot of CPU time
during the backup. You may wish to switch to a less space-efficient
compression algorithm that takes less CPU power, such as [`lz4`][lz4].
Changing up the compression algorithm also provides some
security-thru-obscurity, which is useless on its own, but it *is* a
useful adjunct to strong encryption.

This requires OpenSSL 1.1 or higher. If you’re on 1.0 or older, you
won’t have the `-pbkdf2` and `-iter` options, and you may have to choose
a different cipher algorithm; both changes are likely to weaken the
encryption significantly, so you should install a newer version rather
than work around the lack of these features.

Beware that macOS ships a fork of OpenSSL called [LibreSSL][lssl] that
lacked this capability until Ventura (13.0). If you’re on Monterey (12)
or older, we recommend use of the [Homebrew][hb] OpenSSL package rather
than give up on the security afforded by use of configurable-iteration
PBKDF2. To avoid a conflict with the platform’s `openssl` binary,
Homebrew’s installation is [unlinked][hbul] by default, so you have to
give an explicit path to it, one of:

    /usr/local/opt/openssl/bin/openssl ...    # Intel x86 Macs
    /opt/homebrew/opt/openssl/bin/openssl ... # ARM Macs (“Apple silicon”)

[lssl]: https://www.libressl.org/


## <a id="rest"></a> Restoring From An Encrypted Backup

The “restore” script for the above fragment is basically an inverse of
it, but it’s worth showing because there are some subtleties to take
care of. If all variables defined in earlier scripts are available, then
restoration is:

``` shell
openssl enc -d -aes-256-cbc -pbkdf2 -iter $iter -pass pass:"$pass" -in "$gd" |
xz -d | fossil sql --no-repository ~/museum/restored-repo.fossil
```

We changed the `-e` to `-d` on the `openssl` command to get decryption,
and we changed the `-out` to `-in` so it reads from the encrypted backup
file and writes the result to stdout.

The decompression step is trivial.

The last change is tricky: we used `fossil sql` above to ensure that
we’re using the same version of SQLite to write the encrypted backup DB
as was used to maintain the repository. We must also do that on
restoration. Fossil serves as a dogfooding project for SQLite,
often making use of the latest features, so it is quite likely that a given
random `sqlite3` binary in your `PATH` will be unable to understand the
file created by “`fossil sql .dump`”! The tricky bit is, you can’t just
pipe the decrypted SQL dump into `fossil sql`, because on startup, Fossil
normally goes looking for tables created by `fossil init`, and it won’t
find them in a newly-created repo DB. We get around this by passing
the `--no-repository` flag, which suppresses this behavior. Doing it
this way saves you from needing to go and build a matching version of
`sqlite3` just to restore the backup.
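
Before trusting the pipeline with a real repository, you can sanity-check
the compress-and-encrypt half against its inverse on a throwaway file.
The passphrase and paths here are scratch values, and it assumes
`openssl` 1.1+ and `xz` are installed:

```shell
#!/bin/sh
iter=152830
pass="throwaway-test-passphrase"   # not the real backup passphrase
printf 'hello, fossil\n' > /tmp/plain.txt
xz -9 < /tmp/plain.txt |
openssl enc -e -aes-256-cbc -pbkdf2 -iter $iter -pass pass:"$pass" -out /tmp/blob.enc
openssl enc -d -aes-256-cbc -pbkdf2 -iter $iter -pass pass:"$pass" -in /tmp/blob.enc |
xz -d > /tmp/round.txt
cmp -s /tmp/plain.txt /tmp/round.txt && echo "round trip OK"
```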

[bu]: /help/backup
[grcp]: https://www.grc.com/passwords.htm
[hb]: https://brew.sh
[hbul]: https://docs.brew.sh/FAQ#what-does-keg-only-mean
[lz4]: https://lz4.github.io/lz4/
[pbr]: ./private.wiki
[rint]: https://www.random.org/integers/?num=1&min=100000&max=1000000&col=5&base=10&format=html&rnd=new
[Setup]: ./caps/admin-v-setup.md#apsu
[shun]: ./shunning.wiki
[tkt]: ./tickets.wiki
[uv]: ./unvers.wiki