Fossil SCM

Researched, tested, and documented the set of "docker create --cap-drop" options we can add to strip away unnecessary root privileges inside the container without harming normal operation. Belt-and-suspenders: if a bad actor ever got into the container with root privileges, this would help prevent them from affecting anything outside the container. Added that set to the "make container-run" target so the options get applied by default in the easy case.
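
As a rough illustration of what the new default amounts to, the sketch below assembles the same nine `--cap-drop` options from a list and prints the resulting `docker run` command instead of executing it. The helper `cap_drop_flags`, the variable `FOSSIL_DROP_CAPS`, and the container and image names are hypothetical names chosen for this example; the Makefile target in this commit spells the options out literally.

```shell
#!/bin/sh
# Capabilities this commit drops by default (list taken from the diff).
FOSSIL_DROP_CAPS="AUDIT_WRITE CHOWN FSETID KILL MKNOD NET_BIND_SERVICE NET_RAW SETFCAP SETPCAP"

# Hypothetical helper: print one "--cap-drop NAME" pair per argument,
# all on a single line separated by spaces.
cap_drop_flags() {
    flags=""
    for cap in "$@"; do
        # Prepend a separating space only when flags is already non-empty.
        flags="${flags:+$flags }--cap-drop $cap"
    done
    printf '%s\n' "$flags"
}

# Hypothetical names; substitute your own container name and image tag.
name="fossil-example"
image="fossil:example"

# Print the command rather than running it, so the sketch is safe to try
# on a machine without Docker. Word splitting of the list is intentional.
echo docker run --name "$name" $(cap_drop_flags $FOSSIL_DROP_CAPS) "$image"
```

Removing the `echo` would run the container for real. Keeping the list in one variable makes it easier to re-enable a single capability while debugging, at the cost of being less explicit than the Makefile's literal option list.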

wyoung 2022-08-29 17:54 trunk
Commit f715add9381024ab6a733db88a1f6a8915775444464ad4337dcae7ea8c3421a0
2 files changed +12 -1 +104 -2
+12 -1
--- Makefile.in
+++ Makefile.in
@@ -122,10 +122,21 @@
 # Container stuff
 container-image: @srcdir@/Dockerfile
  docker build -t fossil:@FOSSIL_CI_PFX@ $(DBFLAGS) @srcdir@

 container-run: container-image
- docker run --name fossil-@FOSSIL_CI_PFX@ $(DRFLAGS) fossil:@FOSSIL_CI_PFX@
+ docker run \
+ --name fossil-@FOSSIL_CI_PFX@ \
+ --cap-drop AUDIT_WRITE \
+ --cap-drop CHOWN \
+ --cap-drop FSETID \
+ --cap-drop KILL \
+ --cap-drop MKNOD \
+ --cap-drop NET_BIND_SERVICE \
+ --cap-drop NET_RAW \
+ --cap-drop SETFCAP \
+ --cap-drop SETPCAP \
+ $(DRFLAGS) fossil:@FOSSIL_CI_PFX@

 @srcdir@/Dockerfile: @srcdir@/Dockerfile.in @srcdir@/manifest.uuid
  @AUTOREMAKE@

+104 -2
--- www/build.wiki
+++ www/build.wiki
@@ -344,11 +344,11 @@
 (See [#docker-args | below] for how to change this default.)
 You don't have to restart the server after fixing this with
 <tt>chmod</tt>: simply reload the browser, and Fossil will try again.


-<h4>5.1.2 Storing the Repo Outside the Container</h4>
+<h4 id="docker-bind-mount">5.1.2 Storing the Repo Outside the Container</h4>

 The simple storage method above has a problem: Docker containers are designed to be
 killed off at the slightest cause, rebuilt, and redeployed. If you do
 that with the repo inside the container, it gets destroyed, too. The
 solution is to replace the "run" command above with the following:
@@ -377,11 +377,13 @@
 rebuild the container or its underlying image — such as to upgrade to a newer version of Fossil
 — the external directory remains behind and gets remapped into the new container
 when you recreate it with <tt>-v</tt>.


-<h3 id="docker-chroot">5.2 Why Chroot?</h3>
+<h3 id="docker-security">5.2 Security</h3>
+
+<h4 id="docker-chroot">5.2.1 Why Chroot?</h4>

 A potentially surprising feature of this container is that it runs
 Fossil as root. Since that causes [./chroot.md | Fossil's chroot jail
 feature] to kick in, and a Docker container is a type of über-jail
 already, you may be wondering why we bother. Instead, why not either:
@@ -422,10 +424,110 @@
 the ones forked off to handle each HTTP/CGI hit. This is why you can fix broken
 permissions with <tt>chown</tt> after the container is already running,
 without restarting it: each hit reevaluates the repository file
 permissions when deciding what user to become when dropping root
 privileges.
+
+
+<h4 id="docker-caps">5.2.2 Dropping Unnecessary Capabilities</h4>
+
+The example commands given in this section create the container with
+[https://docs.docker.com/engine/security/#linux-kernel-capabilities | a
+default set of Linux kernel capabilities]. Although Docker strips almost
+all of the traditional root capabilities away by default, and Fossil
+doesn't need any of those it does take away, Docker does leave some
+enabled that Fossil doesn't actually need. You can tighten the scope of
+capabilities by adding a "<tt>--cap-drop LIST</tt>" option to your
+container creation commands. Specifically:
+
+ * <b><tt>AUDIT_WRITE</tt></b>: Fossil doesn't write to the kernel's
+ auditing log, and we can't see any reason you'd want to be able to
+ do that as an administrator shelled into the container, either.
+ Auditing is something done on the host, not from inside each
+ individual container.<p>
+ * <b><tt>CHOWN</tt></b>: The Fossil server never even calls
+ <tt>chown(2)</tt>, and our image build process sets up all file
+ ownership properly, to the extent that this is possible under the
+ limitations of our automation.<p>
+ Curiously, stripping this capability doesn't affect your ability to
+ run commands like "<tt>chown -R fossil:fossil /jail/museum</tt>"
+ when you're using bind mounts or external volumes — as we recommend
+ [#docker-bind-mount | above] — because it's the host OS's kernel
+ capabilities that affect the underlying <tt>chown(2)</tt> call in
+ that case, not those of the container.<p>
+ If for some reason you did have to change file ownership of
+ in-container files, it's best to do that by changing the
+ <tt>Dockerfile</tt> to suit, then rebuilding the container, since
+ that bakes the need for the change into your reproducible build
+ process. If you had to do it without rebuilding the container,
+ [https://stackoverflow.com/a/45752205/142454 | there's a
+ workaround] for the fact that capabilities are a create-time
+ change, baked semi-indelibly into the container configuration.<p>
+ * <b><tt>FSETID</tt></b>: Fossil doesn't use the SUID and SGID bits
+ itself, and our build process doesn't set those flags on any of the
+ files. Although the second fact means we can't see any harm from
+ leaving this enabled, we also can't see any good reason to allow
+ it, so we strip it.<p>
+ * <b><tt>KILL</tt></b>: The only place Fossil calls <tt>kill(2)</tt>
+ is in the [./backoffice.md | backoffice], and then only for
+ processes it created on earlier runs; it doesn't need the ability
+ to kill processes created by other users. You might wish for this
+ ability as an administrator shelled into the container, but you can
+ pass the "<tt>docker exec --user</tt>" option to run commands
+ within your container as the legitimate owner of the process,
+ removing the need for this capability.<p>
+ * <b><tt>MKNOD</tt></b>: All device nodes are created at build time
+ and are never changed at run time. Realize that the virtualized
+ device nodes inside the container get mapped onto real devices on
+ the host, so if an attacker ever got a root shell on the container,
+ they might be able to do actual damage to the host if we didn't
+ preemptively strip this capability away.<p>
+ * <b><tt>NET_BIND_SERVICE</tt></b>: With containerized deployment,
+ Fossil never needs the ability to bind the server to low-numbered
+ TCP ports, not even if you're running the server in production with
+ TLS enabled and want the service bound to port 443. It's perfectly
+ fine to let the Fossil instance inside the container bind to its
+ default port (8080) because you can rebind it on the host with the
+ "<tt>docker create --publish 443:8080</tt>" option. It's the
+ container's <i>host</i> that needs this ability, not the container
+ itself.<p> (Even the container runtime might not need that
+ capability if you're [./ssl.wiki#server | terminating TLS with a
+ front-end proxy]. You're more likely to say something like "<tt>-p
+ localhost:12345:8080</tt>", then configure the reverse proxy to
+ translate external HTTPS calls into HTTP directed at this internal
+ port 12345.)<p>
+ * <b><tt>NET_RAW</tt></b>: Fossil itself doesn't use raw sockets, and
+ our build process leaves out <tt>ping</tt> and <tt>traceroute</tt>,
+ the only Busybox utilities that require that ability. If you need
+ to ping something, you can almost certainly do it just as well out
+ on the host; we foresee no compelling reason to use ping or
+ traceroute from inside the container.<p> If we did not take this
+ hard-line stance, an attacker that broke into the container and
+ gained root privileges could use raw sockets to do a wide array of
+ bad things to any network the container is bound to.<p>
+ * <b><tt>SETFCAP, SETPCAP</tt></b>: There isn't much call for file
+ permission granularity beyond the classic Unix ones inside the
+ container, so we drop root's ability to change them.
+
+All together, we recommend adding the following options to your
+"<tt>docker run</tt>" commands, as well as to any "<tt>docker
+create</tt>" command that will be followed by "<tt>docker start</tt>":
+
+<pre><code> --cap-drop AUDIT_WRITE \
+ --cap-drop CHOWN \
+ --cap-drop FSETID \
+ --cap-drop KILL \
+ --cap-drop MKNOD \
+ --cap-drop NET_BIND_SERVICE \
+ --cap-drop NET_RAW \
+ --cap-drop SETFCAP \
+ --cap-drop SETPCAP
+</code></pre>
+
+In the next section, we'll show a case where you create a container
+without ever running it, making these options pointless.
+


 <h3 id="docker-static">5.3 Extracting a Static Binary</h3>

 Our 2-stage build process uses Alpine Linux only as a build host. Once
