# Backing Up a Remote Fossil Repository

One of the great benefits of Fossil and other [distributed version control systems][dvcs]
is that cloning a repository makes a backup. If you are running a project with multiple
developers who share their work using a [central server][server] and the server hardware
catches fire, the clones of the repository on each developer
workstation *may* serve as a suitable backup.

[dvcs]: wikipedia:/wiki/Distributed_version_control
[server]: ./server/whyuseaserver.wiki

We say “may” because
it turns out not everything in a Fossil repository is copied when cloning. You
don’t even always get copies of all historical file artifacts. More than
that, a Fossil repository typically contains
other useful information that is not always shared as part of a clone, which might need
to be backed up separately. To wit:


## <a id="pii"></a> Sensitive Information

Fossil purposefully does not clone certain sensitive information unless
you’re logged in as a user with [Setup] capability. As an example, a local clone
may have a different `user` table than the remote, because only a
Setup user is allowed to see the full version for privacy and security
reasons.


## <a id="config"></a> Configuration Drift

Fossil allows the local configuration to differ in several areas from
that of the remote. You get a copy
of *some* of these configuration areas on initial clone — not all! — but after that,
remote configuration changes mostly do not sync down automatically.


#### <a id="skin"></a> Skin

Changes to the remote’s skin don’t sync down, on purpose, since you may
want to have a different skin on the local clone than on the remote. You
can ask for updates with [`fossil config pull skin`][cfg], but that does
not happen automatically during the course of normal development.


#### <a id="alerts"></a> Email Alerts

The Admin → Notification settings do not get copied on clone or sync,
and it is not possible to push such settings from one repository to
another. We did this on purpose because you may have a network of peer
repositories, and you only want one repository sending email alerts. If
Fossil were to automatically replicate the email alert settings to a
separate repository, subscribers would get multiple alerts for each
event, which would be *bad.*

The only element of the email alert configuration that can be pulled
over the sync protocol on demand is the subscriber list, via
[`fossil config pull subscriber`][cfg].


#### <a id="project"></a> Project Configuration

This is normally generated once during `fossil init` and never changed,
so Fossil doesn’t pull this information without being forced, on
purpose. You could accidentally merge two separate Fossil repos by
pushing one repo’s project config up to another, for example.


#### <a id="other-cfg"></a> Others

A repo’s URL aliases, [interwiki configuration](./interwiki.md), and
[ticket customizations](./custom_ticket.wiki) also do not normally sync.

[cfg]: /help/configuration



## <a id="private"></a> Private Branches

The very nature of Fossil’s [private branch feature][pbr] ensures that
remote clones don’t get a copy of those branches. Normally this is
exactly what you want, but in the case of making backups, you probably
want to back up these branches as well. One of the two backup methods below
provides this.


## <a id="shun"></a> Shunned Artifacts

Fossil purposefully doesn’t sync [shunned artifacts][shun]. If you want
your local clone to be a precise match to the remote, it needs to track
changes to the shun table as well.


## <a id="uv"></a> Unversioned Artifacts

Data in Fossil’s [unversioned artifacts table][uv] doesn’t sync down
unless you specifically ask for it. Like local configuration
data, it doesn’t get pulled as part of a normal `fossil sync`, but
*unlike* the config data, you don’t get unversioned files as part of the
initial clone unless you ask for it by passing the `--unversioned/-u`
flag.


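For example, the flag can be given at clone time, and unversioned content can be fetched later on demand; a sketch, with a placeholder URL:

``` shell
# Clone and include unversioned content in the same operation.
# The URL is a placeholder; substitute your own project's.
fossil clone --unversioned https://example.com/repo repo.fossil

# Later, from inside a checkout of that clone, sync unversioned
# files explicitly, since a plain `fossil sync` skips them:
fossil unversioned sync
```
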
## <a id="ait"></a> Autosync Is Intransitive

If you’re using Fossil in a truly distributed mode, rather than the
simple central-and-clones model that is more common, there may be no
single source of truth in the network because Fossil’s autosync feature
isn’t transitive.

That is, if you clone from server A and stand that clone up on a
server B, and I then clone from B as my repository C, your changes to B
autosync up to A, but they don’t come down to me on C until I do
something locally that triggers autosync. The inverse is also true: if I
commit something
on C, it will autosync up to B, but A won’t get a copy until someone on
B does something to trigger a sync there.

An easy way to run into this problem is to set up failover servers
`svr1` thru `svr3.example.com`, then set `svr2` and `svr3` up to sync
with the first. If all of the users normally clone from `svr1`, their
commits don’t get to `svr2` and `svr3` until something on one of the
servers pushes or pulls the changes down to the next server in the sync
chain.

Likewise, if `svr1` falls over and all of the users re-point their local
clones at `svr2`, then when `svr1` later reappears, it is likely to
remain a stale copy of the old version of the repository until someone
causes it to sync with `svr2` or `svr3` to catch up again. And if
you originally designed the sync scheme to treat `svr1` as the primary
source of truth, those users still syncing with `svr2` won’t have their
commits pushed up to `svr1` unless you’ve set up bidirectional sync
rather than having the two backup servers do `pull` only.

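One way to keep the failover servers from going stale is to have them pull on a schedule. A sketch of a user crontab entry for `svr2` and `svr3`; the repository path and URL are assumptions, and you would use `fossil sync` instead of `fossil pull` for a bidirectional arrangement:

``` shell
# Crontab entry for svr2 and svr3 (path and URL are placeholders):
# pull from svr1 every ten minutes so the failover copies stay current.
*/10 * * * * fossil pull https://svr1.example.com/ -R /home/fossil/repo.fossil
```
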
# <a id="sync-solution"></a> Solution 1: Explicit Pulls

The following script solves most of the above problems for the use case
where you want a *nearly-complete* clone of the remote repository using nothing
but the normal Fossil sync protocol. It only does so if you are logged into
the remote as a user with Setup capability, however.

``` shell
#!/bin/sh
fossil sync --unversioned
fossil configuration pull all
fossil rebuild
```

The last step is needed to ensure that shunned artifacts on the remote
are removed from the local clone. The second step includes
`fossil conf pull shun`, but until those artifacts are actually rebuilt
out of existence, your backup will be “more than complete” in the sense
that it will continue to have information that the remote says should
not exist any more. That would be not so much a “backup” as an
“archive,” which might not be what you want.


# <a id="sql-solution"></a> Solution 2: SQL-Level Backup

The first method doesn’t get you a copy of the remote’s
[private branches][pbr], on purpose. It may also miss other info on the
remote, such as SQL-level customizations that the sync protocol can’t
see. (Some [ticket system customization][tkt] schemes rely on this ability, for example.) You can
solve such problems if you have access to the remote server: that access
lets you take a SQL-level backup with [the `backup` command][bu], which
handles locking and transaction isolation so you can safely back up an
in-use repository.

If you have SSH access to the remote server, something like this will work:

``` shell
#!/bin/bash
bf=repo-$(date +%Y-%m-%d).fossil
ssh example.com "cd museum ; fossil backup -R repo.fossil backups/$bf" &&
scp example.com:museum/backups/$bf ~/museum/backups
```

Beware that this method does not solve [the intransitive sync
problem](#ait), in and of itself: if you do a SQL-level backup of a
stale repo DB, you have a *stale backup!* You should therefore run this
on every node that may need to serve as a backup so that at least *one*
of the backups is also up-to-date.


# <a id="enc"></a> Encrypted Off-Site Backups

A useful refinement that you can apply to both methods above is
encrypted off-site backups. You may wish to store backups of your
repositories off-site on a service such as Dropbox, Google Drive, iCloud,
or Microsoft OneDrive, where you don’t fully trust the service not to
leak your information. This addition to the prior scripts will encrypt
the resulting backup in such a way that the cloud copy is a useless blob
of noise to anyone without the key:

``` shell
iter=152830
pass="h8TixP6Mt6edJ3d6COaexiiFlvAM54auF2AjT7ZYYn"
gd="$HOME/Google Drive/Fossil Backups/$bf.xz.enc"
fossil sql -R ~/museum/backups/"$bf" .dump | xz -9 |
openssl enc -e -aes-256-cbc -pbkdf2 -iter $iter -pass pass:"$pass" -out "$gd"
```

If you’re adding this to the first script above, remove the
“`-R repo-name`” bit so you get a dump of the repository backing the
current working directory.

Change the `pass` value to some other long random string, and change the
`iter` value to something in the hundreds of thousands range. A good source for
the first is [here][grcp], and for the second, [here][rint].

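If you’d rather not involve a web service, you can generate both values locally. A sketch drawing on `/dev/urandom`; restricting the passphrase to letters and digits gives roughly 5.95 bits of entropy per character:

``` shell
# Generate a fresh 42-character random passphrase locally rather than
# fetching one from a web service.
pass=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 42)

# Pick a random PBKDF2 iteration count in the 100,000..755,350 range.
# (od emits one unsigned 16-bit value, 0..65535.)
iter=$(( 100000 + $(od -An -N2 -tu2 /dev/urandom) * 10 ))
```
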
You may find posts online written by people recommending millions of
iterations for PBKDF2, but they’re generally talking about this in the
context of memorizable passwords, where adding even one more character
to the password is a significant burden. Given our script’s purely
random maximum-length passphrase, there isn’t much more that increasing
the key derivation iteration count can do for us.

Conversely, if you were to reduce the passphrase to 41 characters, that
would drop the key strength by roughly 2⁶, being the entropy value per
character for using most of printable ASCII in our passphrase. To make
that lost strength up on the PBKDF2 end, you’d have to multiply your
iterations by 2⁶ = 64 times. It’s easier to use a max-length passphrase
in this situation than get crazy with key derivation iteration counts.

(This, by the way, is why the example passphrase above is 42 characters:
with 6 bits of entropy per character, that gives you a key size of 252,
as close as we can get to our chosen encryption algorithm’s 256-bit key
size without going over. If it pleases you to give it 43 random
characters for a passphrase in order to pick up those last four bits of
security, you’re welcome to do so.)

Compressing the data before encrypting it removes redundancies that can
make cryptanalysis easier, and it results in a smaller backup than you get
with the previous script alone, at the expense of a lot of CPU time
during the backup. You may wish to switch to a less space-efficient
compression algorithm that takes less CPU power, such as [`lz4`][lz4].
Changing up the compression algorithm also provides some
security-thru-obscurity, which is useless on its own, but it *is* a
useful adjunct to strong encryption.

This requires OpenSSL 1.1 or higher. If you’re on 1.0 or older, you
won’t have the `-pbkdf2` and `-iter` options, and you may have to choose
a different cipher algorithm; both changes are likely to weaken the
encryption significantly, so you should install a newer version rather
than work around the lack of these features.

Beware that macOS ships a fork of OpenSSL called [LibreSSL][lssl] that
lacked this capability until Ventura (13.0). If you’re on Monterey (12)
or older, we recommend use of the [Homebrew][hb] OpenSSL package rather
than give up on the security afforded by use of configurable-iteration
PBKDF2. To avoid a conflict with the platform’s `openssl` binary,
Homebrew’s installation is [unlinked][hbul] by default, so you have to
give an explicit path to it, one of:

    /usr/local/opt/openssl/bin/openssl ...     # Intel x86 Macs
    /opt/homebrew/opt/openssl/bin/openssl ...  # ARM Macs (“Apple silicon”)

[lssl]: https://www.libressl.org/

## <a id="rest"></a> Restoring From An Encrypted Backup
262
263
The “restore” script for the above fragment is basically an inverse of
264
it, but it’s worth showing it because there are some subtleties to take
265
care of. If all variables defined in earlier scripts are available, then
266
restoration is:
267
268
```
269
openssl enc -d -aes-256-cbc -pbkdf2 -iter $iter -pass pass:"$pass" -in "$gd" |
270
xz -d | fossil sql --no-repository ~/museum/restored-repo.fossil
271
```
272
273
We changed the `-e` to `-d` on the `openssl` command to get decryption,
and we changed the `-out` to `-in` so it reads from the encrypted backup
file and writes the result to stdout.

The decompression step is trivial.

The last change is tricky: we used `fossil sql` above to ensure that
we’re using the same version of SQLite to write the encrypted backup DB
as was used to maintain the repository. We must also do that on
restoration. Fossil serves as a dogfooding project for SQLite,
often making use of the latest features, so it is quite likely that a given
random `sqlite3` binary in your `PATH` will be unable to understand the
file created by “`fossil sql .dump`”! The tricky bit is, you can’t just
pipe the decrypted SQL dump into `fossil sql`, because on startup, Fossil
normally goes looking for tables created by `fossil init`, and it won’t
find them in a newly-created repo DB. We get around this by passing
the `--no-repository` flag, which suppresses this behavior. Doing it
this way saves you from needing to go and build a matching version of
`sqlite3` just to restore the backup.

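After restoring, a quick sanity check is worthwhile; a sketch, using the repository path from the restore script above:

``` shell
# Count the artifacts in the restored repository...
fossil sql -R ~/museum/restored-repo.fossil "SELECT count(*) FROM blob;"

# ...and have Fossil verify the hash of every artifact it holds.
fossil test-integrity -R ~/museum/restored-repo.fossil
```
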
[bu]: /help/backup
[grcp]: https://www.grc.com/passwords.htm
[hb]: https://brew.sh
[hbul]: https://docs.brew.sh/FAQ#what-does-keg-only-mean
[lz4]: https://lz4.github.io/lz4/
[pbr]: ./private.wiki
[rint]: https://www.random.org/integers/?num=1&min=100000&max=1000000&col=5&base=10&format=html&rnd=new
[Setup]: ./caps/admin-v-setup.md#apsu
[shun]: ./shunning.wiki
[tkt]: ./tickets.wiki
[uv]: ./unvers.wiki
