Fossil SCM

Mentioned containerd+nerdctl in place of runc in the containers doc. A tightened-up version of the prior runc and crun sections is now collected below the Podman section. This gives a better flow: each successive option is smaller than the last, excepting only nspawn, which is a bit bigger than crun. (We leave nspawn last because we can't get it to work!)

wyoung 2022-09-07 09:11 trunk
Commit 457c14a490c76ef168505d9e17456fe470ff6c47c3068038e25595905ef5a5ad
1 file changed +237 -254
--- www/containers.md
+++ www/containers.md
@@ -530,245 +530,45 @@
[DD]: https://www.docker.com/products/docker-desktop/
[DE]: https://docs.docker.com/engine/
[DNT]: ./server/debian/nginx.md


-### 6.1 <a id="runc" name="containerd"></a>Stripping Docker Engine Down
+### 6.1 <a id="nerdctl" name="containerd"></a>Stripping Docker Engine Down

The core of Docker Engine is its [`containerd`][ctrd] daemon and the
-[`runc`][runc] container runner. It’s possible to dig into the subtree
-managed by `containerd` on the build host and extract what we need to
-run our Fossil container elsewhere with `runc`, leaving out all the
-rest. `runc` alone is about 18 MiB, and you can do without `containerd`
-entirely, if you want.
-
-The method isn’t complicated, but it *is* cryptic enough to want a shell
-script:
-
-----
-
-```shell
-#!/bin/sh
-c=fossil
-b=$HOME/containers/$c
-r=$b/rootfs
-m=/run/containerd/io.containerd.runtime.v2.task/moby
-
-if [ -d "$t" ] && mkdir -p $r
-then
-    docker container start $c
-    docker container export $c | sudo tar -C $r -xf -
-    id=$(docker inspect --format="{{.Id}}" $c)
-    sudo cat $m/$id/config.json |
-        jq '.root.path = "'$r'"' |
-        jq '.linux.cgroupsPath = ""' |
-        jq 'del(.linux.sysctl)' |
-        jq 'del(.linux.namespaces[] | select(.type == "network"))' |
-        jq 'del(.mounts[] | select(.destination == "/etc/hostname"))' |
-        jq 'del(.mounts[] | select(.destination == "/etc/resolv.conf"))' |
-        jq 'del(.mounts[] | select(.destination == "/etc/hosts"))' |
-        jq 'del(.hooks)' > $b/config.json
-fi
-```
-
-----
-
-The first several lines list configurables:
-
-* **`b`**: the path of the exported container, called the “bundle” in OCI
-  jargon
-* **`c`**: the name of the Docker container you’re bundling up for use
-  with `runc`
-* **`m`**: the directory holding the running machines, configurable
-  because:
-    * it’s long
-    * it’s been known to change from one version of Docker to the next
-    * you might be using [Podman](#podman)/[`crun`](#crun), so it has
-      to be “`/run/user/$UID/crun`” instead
-* **`r`**: the path of the directory containing the bundle’s root file
-  system.
-
-That last doesn’t have to be called `rootfs/`, and it doesn’t have to
-live in the same directory as `config.json`, but it is conventional.
-Because some OCI tools use those names as defaults, it’s best to follow
-suit.
-
-The rest is generic, but you’re welcome to freestyle here. We’ll show an
-example of this below.
-
-We’re using [jq] for two separate purposes:
-
-1. To automatically transmogrify Docker’s container configuration so it
-   will work with `runc`:
-
-    * point it where we unpacked the container’s exported rootfs
-    * accede to its wish to [manage cgroups by itself][ecg]
-    * remove the `sysctl` calls that will break after…
-    * …we remove the network namespace to allow Fossil’s TCP listening
-      port to be available on the host; `runc` doesn’t offer the
-      equivalent of `docker create --publish`, and we can’t be
-      bothered to set up a manual mapping from the host port into the
-      container
-    * remove file bindings that point into the local runtime managed
-      directories; one of the things we give up by using a bare
-      container runner is automatic management of these files
-    * remove the hooks for essentially the same reason
-
-2. To make the Docker-managed machine-readable `config.json` more
-   human-readable, in case there are other things you want changed in
-   this version of the container. Exposing the `config.json` file like
-   this means you don’t have to rebuild the container merely to change
-   a value like a mount point, the kernel capability set, and so forth.
-
-<a id="why-sudo"></a>
-We have to do this transformation of `config.json` as the local root
-user because it isn’t readable by your normal user. Additionally, that
-input file is only available while the container is started, which is
-why we ensure that before exporting the container’s rootfs.
-
-With the container exported like this, you can start it as:
-
-```
- $ cd /path/to/bundle
- $ c=any-name-you-like
- $ sudo runc create $c
- $ sudo runc start $c
- $ sudo runc exec $c -t sh -l
- ~ $ ls museum
- repo.fossil
- ~ $ ps -eaf
- PID USER TIME COMMAND
-   1 fossil 0:00 bin/fossil server --create …
- ~ $ exit
- $ sudo runc kill fossil-runc
- $ sudo runc delete fossil-runc
-```
-
-If you’re doing this on the export host, the first command is “`cd $b`”
-if we’re using the variables from the shell script above. We do this
-because `runc` assumes you’re running it from the bundle directory. If
-you prefer, the `runc` commands that care about this take a
-`--bundle/-b` flag to let you avoid switching directories.
-
-The rest should be straightforward: create and start the container as
-root so the `chroot(2)` call inside the container will succeed, then get
-into it with a login shell and poke around to prove to ourselves that
-everything is working properly. It is. Yay!
-
-The remaining commands show shutting the container down and destroying
-it, simply to show how these commands change relative to using the
-Docker Engine commands. It’s “kill,” not “stop,” and it’s “delete,” not
-“rm.”
-
-If you want the bundle to run on a remote host, the local and remote
-bundle directories likely will not match, as the shell script above
-assumes. This is a more realistic shell script for that case:
-
-----
-
-```shell
-#!/bin/bash -ex
-c=fossil
-b=/var/lib/machines/$c
-h=my-host.example.com
-m=/run/containerd/io.containerd.runtime.v2.task/moby
-t=$(mktemp -d /tmp/$c-bundle.XXXXXX)
-
-if [ -d "$t" ]
-then
-    docker container start $c
-    docker container export $c > $t/rootfs.tar
-    id=$(docker inspect --format="{{.Id}}" $c)
-    sudo cat $m/$id/config.json |
-        jq '.root.path = "'$b/rootfs'"' |
-        jq '.linux.cgroupsPath = ""' |
-        jq 'del(.linux.sysctl)' |
-        jq 'del(.linux.namespaces[] | select(.type == "network"))' |
-        jq 'del(.mounts[] | select(.destination == "/etc/hostname"))' |
-        jq 'del(.mounts[] | select(.destination == "/etc/resolv.conf"))' |
-        jq 'del(.mounts[] | select(.destination == "/etc/hosts"))' |
-        jq 'del(.hooks)' > $t/config.json
-    scp -r $t $h:tmp
-    ssh -t $h "{
-        mv ./$t/config.json $b &&
-        sudo tar -C $b/rootfs -xf ./$t/rootfs.tar &&
-        rm -r ./$t
-    }"
-    rm -r $t
-fi
-```
-
-----
-
-We’ve introduced two new variables:
-
-* **`h`**: the remote host name
-* **`t`**: a temporary bundle directory we populate locally, then
-  `scp` to the remote machine, where it’s unpacked
-
-We dropped the **`r`** variable because now we have two different
-“rootfs” types: the tarball and the unpacked version of that tarball.
-To avoid confusing ourselves between these cases, we’ve replaced uses of
-`$r` with explicit paths.
-
-You need to be aware that this script uses `sudo` for two different purposes:
-
-1. To read the local `config.json` file out of the `containerd` managed
-   directory. ([Details above](#why-sudo).)
-
-2. To unpack the bundle onto the remote machine. If you try to get
-   clever and unpack it locally, then `rsync` it to the remote host to
-   avoid re-copying files that haven’t changed since the last update,
-   you’ll find that it fails when it tries to copy device nodes, to
-   create files owned only by the remote root user, and so forth. If the
-   container bundle is small, it’s simpler to re-copy and unpack it
-   fresh each time.
-
-I point that out because it might ask for your password twice: once for
-the local sudo command, and once for the remote.
-
-The default for the **`b`** variable is the convention for systemd based
-machines, which will play into the [`nspawn` alternative below][sdnsp].
-Even if you aren’t using `nspawn`, it’s a reasonable place to put
-containers under the [Linux FHS rules][LFHS].
-
-[ctrd]: https://containerd.io/
-[ecg]: https://github.com/opencontainers/runc/pull/3131
-[LFHS]: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard
-[jq]: https://stedolan.github.io/jq/
-[sdnsp]: #nspawn
-[runc]: https://github.com/opencontainers/runc
+[`runc`][runc] container runner. Add to this the out-of-core CLI program
+[`nerdctl`][nerdctl] and you have enough of the engine to run Fossil
+containers. The big things you’re missing are:
+
+* **BuildKit**: The container build engine, which doesn’t matter if
+  you’re building elsewhere and using a container registry as an
+  intermediary between that build host and the deployment host.
+
+* **SwarmKit**: A powerful yet simple orchestrator for Docker that you
+  probably aren’t using with Fossil anyway.
+
+In exchange, you get a runtime that’s about half the size of Docker
+Engine. The commands are essentially the same as above, but you say
+“`nerdctl`” instead of “`docker`”. You might alias one to the other,
+because you’re still going to be using Docker to build and ship your
+container images.
+
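If you take the aliasing route, a minimal sketch is below. This assumes `nerdctl` is already installed on the deployment host; the alias only affects shells that read the file you put it in, and the image name in the comment is illustrative.

```shell
# Hypothetical profile fragment for the deployment host: let "docker"
# muscle memory drive nerdctl instead.
alias docker=nerdctl

# Interactive use then looks unchanged from the Docker Engine commands:
#   docker ps
#   docker pull registry.example.com/fossil
```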
+[ctrd]: https://containerd.io/
+[nerdctl]: https://github.com/containerd/nerdctl
+[runc]: https://github.com/opencontainers/runc


### 6.2 <a id="podman"></a>Podman

-Although your humble author claims the `runc` methods above are not
-complicated, merely cryptic, you might be fondly recollecting the
-carefree commands at the top of this document, pondering whether you can
-live without the abstractions a proper container runtime system
-provides.
-
-More than that, there’s a hidden cost to the `runc` method: there is no
-layer sharing among containers. If you have multiple Fossil containers
-on a single host — perhaps because each serves an independent section of
-the overall web site — and you export them to a remote host using the
-shell script above, you’ll end up with redundant copies of the `rootfs`
-in each. A proper OCI container runtime knows they’re all derived from
-the same base image, differing only in minor configuration details,
-giving us one of the major advantages of containerization: if none of
-the running containers can change these immutable base layers, it
-doesn’t have to copy them.
-
-A lighter-weight alternative to Docker Engine that doesn’t give up so
-many of its administrator affordances is [Podman], initially created by
-Red Hat and thus popular on that family of OSes, although it will run on
+A lighter-weight alternative to either of the prior options that doesn’t
+give up the image builder is [Podman]. Initially created by
+Red Hat and thus popular on that family of OSes, it will run on
any flavor of Linux. It can even be made to run [on macOS via Homebrew][pmmac]
or [on Windows via WSL2][pmwin].

-On Ubuntu 22.04, it’s about a quarter the size of Docker Engine. That
-isn’t nearly so slim as `runc`, but we may be willing to pay this
-overhead to get shorter and fewer commands.
+On Ubuntu 22.04, it’s about a quarter the size of Docker Engine, or half
+that of the “full” distribution of `nerdctl` and all its dependencies.

Although Podman [bills itself][whatis] as a drop-in replacement for the
`docker` command and everything that sits behind it, some of the tool’s
design decisions affect how our Fossil containers run, as compared to
using Docker. The most important of these is that, by default, Podman
@@ -817,39 +617,14 @@
they’ll be connected to the network the container runs on. Once the bad
guy is inside the house, he doesn’t necessarily have to go after the
residents directly to cause problems for them.


-#### 6.2.2 <a id="crun"></a>`crun`
-
-In the same way that [Docker Engine is based on `runc`](#runc), Podman’s
-engine is based on [`crun`][crun], a lighter-weight alternative to
-`runc`. It’s only 1.4 MiB on the system I tested it on, yet it will run
-the same container bundles as in my `runc` examples above.
-Above, we saved more than that by compressing the container’s Fossil
-executable with UPX!
-
-This makes `crun` a great option for tiny remote hosts with a single
-container, or at least where none of the containers share base layers,
-so that there is no effective cost to duplicating the immutable base
-layers of the containers’ source images.
-
-This suggests one method around the problem of rootless Podman containers:
-`sudo crun`, following the examples above.
-
-[crun]: https://github.com/containers/crun
-
-
-#### 6.2.3 <a id="podman-rootful"></a>Fossil in a Rootful Podman Container
+#### 6.2.2 <a id="podman-rootful"></a>Fossil in a Rootful Podman Container

##### Simple Method

-As we saw above with `runc`, switching to `crun` just to get your
-containers to run as root loses a lot of functionality and requires a
-bunch of cryptic commands to get the same effect as a single command
-under Podman.
-
Fortunately, it’s easy enough to have it both ways. Simply run your
`podman` commands as root:

```
$ sudo podman build -t fossil --cap-add MKNOD .
@@ -875,11 +650,11 @@
it’s done inside a container runtime’s build environment doesn’t mean we
can get away without root privileges to do things like create the
`/jail/dev/null` node.

The other reason we need “`sudo podman build`” is because it puts the result
-into root’s Podman image repository, where the next steps look for it.
+into root’s Podman image registry, where the next steps look for it.

That in turn explains why we need “`sudo podman create`:” because it’s
creating a container based on an image that was created by root. If you
ran that step without `sudo`, it wouldn’t be able to find the image.

@@ -927,23 +702,227 @@
$ sudo podman create \
    --any-options-you-like \
    docker.io/mydockername/fossil
```

-This round-trip through the public image repository has another side
+This round-trip through the public image registry has another side
benefit: your local system might be a lot faster than your remote one,
as when the remote is a small VPS. Even with the overhead of schlepping
container images across the Internet, it can be a net win in terms of
build time.


-### 6.3 <a id="nspawn"></a>`systemd-nspawn`
+### 6.3 <a id="barebones"></a>Bare-Bones OCI Bundle Runners
+
+If even the Podman stack is too big for you, you still have options for
+running containers that are considerably slimmer, at a high cost in
+administration complexity and lost features.
+
+Part of the OCI standard is the notion of a “bundle,” being a consistent
+way to present a pre-built and configured container to the runtime.
+Essentially, it consists of a directory containing a `config.json` file
+and a `rootfs/` subdirectory containing the root filesystem image. Many
+tools can produce these for you. We’ll show only one method in the first
+section below, then reuse that in the following sections.
+
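To make that layout concrete, here is a minimal sanity check for a candidate bundle directory. The helper name is our own invention, not part of any OCI tool, and the path in the comment is only an example:

```shell
#!/bin/sh
# Sketch: does a directory have the minimal OCI bundle shape described
# above, i.e. a config.json file beside a rootfs/ subdirectory?
is_bundle() {
    [ -f "$1/config.json" ] && [ -d "$1/rootfs" ]
}

# Hypothetical use, with the systemd-style path used later on:
#   is_bundle /var/lib/machines/fossil && echo "bundle looks complete"
```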
+
+#### 6.3.1 <a id="runc"></a>`runc`
+
+We mentioned `runc` [above](#nerdctl), but it’s possible to use it
+standalone, without `containerd` or its CLI frontend `nerdctl`. You also
+lose the build engine, intelligent image layer sharing, image registry
+connections, and much more. The plus side is that `runc` alone is
+about 18 MiB.
+
+Using it without all the support tooling isn’t complicated, but it *is*
+cryptic enough to want a shell script. Let’s say we want to build on our
+big desktop machine but ship the resulting container to a small remote
+host. This should serve:
+
+----
+
+```shell
+#!/bin/bash -ex
+c=fossil
+b=/var/lib/machines/$c
+h=my-host.example.com
+m=/run/containerd/io.containerd.runtime.v2.task/moby
+t=$(mktemp -d /tmp/$c-bundle.XXXXXX)
+
+if [ -d "$t" ]
+then
+    docker container start $c
+    docker container export $c > $t/rootfs.tar
+    id=$(docker inspect --format="{{.Id}}" $c)
+    sudo cat $m/$id/config.json \
+        | jq '.root.path = "'$b/rootfs'"' \
+        | jq '.linux.cgroupsPath = ""' \
+        | jq 'del(.linux.sysctl)' \
+        | jq 'del(.linux.namespaces[] | select(.type == "network"))' \
+        | jq 'del(.mounts[] | select(.destination == "/etc/hostname"))' \
+        | jq 'del(.mounts[] | select(.destination == "/etc/resolv.conf"))' \
+        | jq 'del(.mounts[] | select(.destination == "/etc/hosts"))' \
+        | jq 'del(.hooks)' > $t/config.json
+    scp -r $t $h:tmp
+    ssh -t $h "{
+        mv ./$t/config.json $b &&
+        sudo tar -C $b/rootfs -xf ./$t/rootfs.tar &&
+        rm -r ./$t
+    }"
+    rm -r $t
+fi
+```
+
+----
+
+The first several lines list configurables:
+
+* **`c`**: the name of the Docker container you’re bundling up for use
+  with `runc`
+* **`b`**: the path of the exported container, called the “bundle” in
+  OCI jargon; we’re using the [`nspawn`](#nspawn) convention, a
+  reasonable choice under the [Linux FHS rules][LFHS]
+* **`h`**: the remote host name
+* **`m`**: the local directory holding the running machines, configurable
+  because:
+    * the path name is longer than we want to use inline
+    * it’s been known to change from one version of Docker to the next
+    * you might be building and testing with [Podman](#podman), so it
+      has to be “`/run/user/$UID/crun`” instead
+* **`t`**: the temporary bundle directory we populate locally, then
+  `scp` to the remote machine, where it’s unpacked
+
+[LFHS]: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard
+
+
+##### Why All That `sudo` Stuff?
+
+This script uses `sudo` for two different purposes:
+
+1. To read the local `config.json` file out of the `containerd` managed
+   directory, which is owned by `root` on Docker systems. Additionally,
+   that input file is only available while the container is started, so
+   we must ensure that before extracting it.
+
+2. To unpack the bundle onto the remote machine. If you try to get
+   clever and unpack it locally, then `rsync` it to the remote host to
+   avoid re-copying files that haven’t changed since the last update,
+   you’ll find that it fails when it tries to copy device nodes, to
+   create files owned only by the remote root user, and so forth. If the
+   container bundle is small, it’s simpler to re-copy and unpack it
+   fresh each time.
+
+I point all this out because it might ask for your password twice: once
+for the local sudo command, and once for the remote.
+
+
+##### Why All That `jq` Stuff?
+
+We’re using [jq] for two separate purposes:
+
+1. To automatically transmogrify Docker’s container configuration so it
+   will work with `runc`:
+
+    * point it where we unpacked the container’s exported rootfs
+    * accede to its wish to [manage cgroups by itself][ecg]
+    * remove the `sysctl` calls that will break after…
+    * …we remove the network namespace to allow Fossil’s TCP listening
+      port to be available on the host; `runc` doesn’t offer the
+      equivalent of `docker create --publish`, and we can’t be
+      bothered to set up a manual mapping from the host port into the
+      container
+    * remove file bindings that point into the local runtime managed
+      directories; one of the things we give up by using a bare
+      container runner is automatic management of these files
+    * remove the hooks for essentially the same reason
+
+2. To make the Docker-managed machine-readable `config.json` more
+   human-readable, in case there are other things you want changed in
+   this version of the container. Exposing the `config.json` file like
+   this means you don’t have to rebuild the container merely to change
+   a value like a mount point, the kernel capability set, and so forth.
+
+
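As an aside, the one-edit-per-`jq`-process style above is purely for readability; jq composes all of the edits into a single program just as well, spawning one process instead of eight. A sketch, run against a small stub document standing in for Docker’s much larger `config.json`; the bundle path matches the script above:

```shell
# Sketch only: the echoed JSON is a stub, not a real Docker config.json.
b=/var/lib/machines/fossil

echo '{
  "root":   {"path": "overlay"},
  "linux":  {"cgroupsPath": "moby/abc",
             "sysctl": {"net.ipv4.ip_unprivileged_port_start": "0"},
             "namespaces": [{"type": "pid"}, {"type": "network"}]},
  "mounts": [{"destination": "/etc/hosts"}, {"destination": "/dev"}],
  "hooks":  {"prestart": []}
}' | jq '.root.path = "'$b/rootfs'"
       | .linux.cgroupsPath = ""
       | del(.linux.sysctl)
       | del(.linux.namespaces[] | select(.type == "network"))
       | del(.mounts[] | select(.destination == "/etc/hosts"))
       | del(.hooks)'
```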
+##### Running the Bundle
+
+With the container exported to a bundle like this, you can start it as:
+
+```
+ $ cd /path/to/bundle
+ $ c=fossil-runc      ← …or anything else you prefer
+ $ sudo runc create $c
+ $ sudo runc start $c
+ $ sudo runc exec $c -t sh -l
+ ~ $ ls museum
+ repo.fossil
+ ~ $ ps -eaf
+ PID USER TIME COMMAND
+   1 fossil 0:00 bin/fossil server --create …
+ ~ $ exit
+ $ sudo runc kill $c
+ $ sudo runc delete $c
+```
+
+If you’re doing this on the export host, the first command is “`cd $b`”,
+using the variables from the shell script above. Alternately,
+the `runc` subcommands that need to read the bundle files take a
+`--bundle/-b` flag to let you avoid switching directories.
+
+The rest should be straightforward: create and start the container as
+root so the `chroot(2)` call inside the container will succeed, then get
+into it with a login shell and poke around to prove to ourselves that
+everything is working properly. It is. Yay!
+
+The remaining commands shut the container down and destroy it, simply
+to show how these commands change relative to the Docker Engine
+equivalents: it’s “kill,” not “stop,” and it’s “delete,” not “rm.”
+
+[ecg]: https://github.com/opencontainers/runc/pull/3131
+[jq]: https://stedolan.github.io/jq/
+
+
+##### Lack of Layer Sharing
+
+The bundle export process collapses Docker’s union filesystem down to a
+single layer. Atop that, it makes all files mutable.
+
+All of this is fine for tiny remote hosts with a single container, or at
+least one where none of the containers share base layers. Where it
+becomes a problem is when you have multiple Fossil containers on a
+single host, since they all derive from the same base image.
+
+The full-featured container runtimes above will intelligently share
+these immutable base layers among the containers, storing only the
+differences in each individual container. More, when pulling images from
+a registry host, they’ll transfer only the layers you don’t have copies
+of locally, so you don’t have to burn bandwidth sending copies of Alpine
+and BusyBox each time, since they’re unlikely to change from one
+build to the next.
+
+
+#### 6.3.2 <a id="crun"></a>`crun`
+
+In the same way that [Docker Engine is based on `runc`](#runc), Podman’s
+engine is based on [`crun`][crun], a lighter-weight alternative to
+`runc`. It’s only 1.4 MiB on the system I tested it on, yet it will run
+the same container bundles as in my `runc` examples above. We saved
+more than that by compressing the container’s Fossil executable with
+UPX, making the runtime virtually free in this case. The only question
+is whether you can put up with its limitations, which are the same as
+for `runc`.
+
+[crun]: https://github.com/containers/crun
+
+
+#### 6.3.3 <a id="nspawn"></a>`systemd-nspawn`

As of `systemd` version 242, its optional `nspawn` piece
[reportedly](https://www.phoronix.com/news/Systemd-Nspawn-OCI-Runtime)
-now has the ability to run OCI container bundles directly. You might
+got the ability to run OCI bundles directly. You might
have it installed already, but if not, it’s only about 2 MiB. It’s
in the `systemd-containers` package as of Ubuntu 22.04 LTS:

```
$ sudo apt install systemd-containers
@@ -963,12 +942,12 @@
    --port=127.0.0.1:127.0.0.1:9999:8080
$ sudo machinectl list
No machines.
```

-This is why I wrote “reportedly” above: it doesn’t work on two different
-Linux distributions, and I can’t see why. I’m putting this here to give
+This is why I wrote “reportedly” above: I couldn’t get it to work on two different
+Linux distributions, and I can’t see why. I’m leaving this here to give
someone else a leg up, with the hope that they will work out what’s
needed to get the container running and registered with `machinectl`.

As of this writing, the tool expects an OCI container version of
“1.0.0”. I had to edit this at the top of my `config.json` file to get
733 [jq]: https://stedolan.github.io/jq/
734 [sdnsp]: #nspawn
735 [runc]: https://github.com/opencontainers/runc
736
737
738 ### 6.2 <a id="podman"></a>Podman
739
740 Although your humble author claims the `runc` methods above are not
741 complicated, merely cryptic, you might be fondly recollecting the
742 carefree commands at the top of this document, pondering whether you can
743 live without the abstractions a proper container runtime system
744 provides.
745
746 More than that, there’s a hidden cost to the `runc` method: there is no
747 layer sharing among containers. If you have multiple Fossil containers
748 on a single host — perhaps because each serves an independent section of
749 the overall web site — and you export them to a remote host using the
750 shell script above, you’ll end up with redundant copies of the `rootfs`
751 in each. A proper OCI container runtime knows they’re all derived from
752 the same base image, differing only in minor configuration details,
753 giving us one of the major advantages of containerization: if none of
754 the running containers can change these immutable base layers, it
755 doesn’t have to copy them.
756
757 A lighter-weight alternative to Docker Engine that doesn’t give up so
758 many of its administrator affordances is [Podman], initially created by
759 Red Hat and thus popular on that family of OSes, although it will run on
760 any flavor of Linux. It can even be made to run [on macOS via Homebrew][pmmac]
761 or [on Windows via WSL2][pmwin].
762
763 On Ubuntu 22.04, it’s about a quarter the size of Docker Engine. That
764 isn’t nearly so slim as `runc`, but we may be willing to pay this
765 overhead to get shorter and fewer commands.
766
767 Although Podman [bills itself][whatis] as a drop-in replacement for the
768 `docker` command and everything that sits behind it, some of the tool’s
769 design decisions affect how our Fossil containers run, as compared to
770 using Docker. The most important of these is that, by default, Podman
@@ -817,39 +617,14 @@
817 they’ll be connected to the network the container runs on. Once the bad
818 guy is inside the house, he doesn’t necessarily have to go after the
819 residents directly to cause problems for them.
820
821
822 #### 6.2.2 <a id="crun"></a>`crun`
823
824 In the same way that [Docker Engine is based on `runc`](#runc), Podman’s
825 engine is based on [`crun`][crun], a lighter-weight alternative to
826 `runc`. It’s only 1.4 MiB on the system I tested it on, yet it will run
827 the same container bundles as in my `runc` examples above.
828 Above, we saved more than that by compressing the container’s Fossil
829 executable with UPX!
830
831 This makes `crun` a great option for tiny remote hosts with a single
832 container, or at least where none of the containers share base layers,
833 so that there is no effective cost to duplicating the immutable base
834 layers of the containers’ source images.
835
836 This suggests one method around the problem of rootless Podman containers:
837 `sudo crun`, following the examples above.
838
839 [crun]: https://github.com/containers/crun
840
841
842 #### 6.2.3 <a id="podman-rootful"></a>Fossil in a Rootful Podman Container
843
844 ##### Simple Method
845
846 As we saw above with `runc`, switching to `crun` just to get your
847 containers to run as root loses a lot of functionality and requires a
848 bunch of cryptic commands to get the same effect as a single command
849 under Podman.
850
851 Fortunately, it’s easy enough to have it both ways. Simply run your
852 `podman` commands as root:
853
854 ```
855 $ sudo podman build -t fossil --cap-add MKNOD .
@@ -875,11 +650,11 @@
875 it’s done inside a container runtime’s build environment doesn’t mean we
876 can get away without root privileges to do things like create the
877 `/jail/dev/null` node.
878
879 The other reason we need “`sudo podman build`” is because it puts the result
880 into root’s Podman image repository, where the next steps look for it.
881
882 That in turn explains why we need “`sudo podman create`:” because it’s
883 creating a container based on an image that was created by root. If you
884 ran that step without `sudo`, it wouldn’t be able to find the image.
885
@@ -927,23 +702,227 @@
927 $ sudo podman create \
928 --any-options-you-like \
929 docker.io/mydockername/fossil
930 ```
931
932 This round-trip through the public image repository has another side
933 benefit: your local system might be a lot faster than your remote one,
934 as when the remote is a small VPS. Even with the overhead of schlepping
935 container images across the Internet, it can be a net win in terms of
936 build time.
937
938
939
940 ### 6.3 <a id="nspawn"></a>`systemd-nspawn`
941
942 As of `systemd` version 242, its optional `nspawn` piece
943 [reportedly](https://www.phoronix.com/news/Systemd-Nspawn-OCI-Runtime)
944 now has the ability to run OCI container bundles directly. You might
945 have it installed already, but if not, it’s only about 2 MiB. It’s
946 in the `systemd-containers` package as of Ubuntu 22.04 LTS:
947
948 ```
949 $ sudo apt install systemd-containers
@@ -963,12 +942,12 @@
963 --port=127.0.0.1:127.0.0.1:9999:8080
964 $ sudo machinectl list
965 No machines.
966 ```
967
968 This is why I wrote “reportedly” above: it doesn’t work on two different
969 Linux distributions, and I can’t see why. I’m putting this here to give
970 someone else a leg up, with the hope that they will work out what’s
971 needed to get the container running and registered with `machinectl`.
972
973 As of this writing, the tool expects an OCI container version of
974 “1.0.0”. I had to edit this at the top of my `config.json` file to get
975
--- www/containers.md
+++ www/containers.md
@@ -530,245 +530,45 @@
530 [DD]: https://www.docker.com/products/docker-desktop/
531 [DE]: https://docs.docker.com/engine/
532 [DNT]: ./server/debian/nginx.md
533
534
535 ### 6.1 <a id="nerdctl" name="containerd"></a>Stripping Docker Engine Down
536
537 The core of Docker Engine is its [`containerd`][ctrd] daemon and the
538 [`runc`][runc] container runner. Add to this the out-of-core CLI program
539 [`nerdctl`][nerdctl] and you have enough of the engine to run Fossil
540 containers. The big things you’re missing are:
541
542 * **BuildKit**: The container build engine, which doesn’t matter if
543 you’re building elsewhere and using a container registry as an
544 intermediary between that build host and the deployment host.
545
546 * **SwarmKit**: A powerful yet simple orchestrator for Docker that you
547 probably aren’t using with Fossil anyway.
548
549 In exchange, you get a runtime that’s about half the size of Docker
550 Engine. The commands are essentially the same as above, but you say
551 “`nerdctl`” instead of “`docker`”. You might alias one to the other,
552 because you’re still going to be using Docker to build and ship your
553 container images.
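As a sketch of that aliasing on a deployment host (the image name here is illustrative, not taken from this document):

```shell
# Hypothetical muscle-memory alias for a host that has containerd and
# nerdctl installed but no Docker CLI.
alias docker=nerdctl

# Thereafter the familiar commands read unchanged, e.g.:
#   docker run -d --name fossil -p 9999:8080 localhost/fossil
#   docker ps
alias docker   # shows the definition we just set
```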
554
555 [ctrd]: https://containerd.io/
556 [nerdctl]: https://github.com/containerd/nerdctl
557 [runc]: https://github.com/opencontainers/runc
558
559
560 ### 6.2 <a id="podman"></a>Podman
561
562 A lighter-weight alternative to either of the prior options that doesn’t
563 give up the image builder is [Podman]. Initially created by
564 Red Hat and thus popular on that family of OSes, it will run on
565 any flavor of Linux. It can even be made to run [on macOS via Homebrew][pmmac]
566 or [on Windows via WSL2][pmwin].
567
568 On Ubuntu 22.04, it’s about a quarter the size of Docker Engine, or half
569 that of the “full” distribution of `nerdctl` and all its dependencies.
570
571 Although Podman [bills itself][whatis] as a drop-in replacement for the
572 `docker` command and everything that sits behind it, some of the tool’s
573 design decisions affect how our Fossil containers run, as compared to
574 using Docker. The most important of these is that, by default, Podman
@@ -817,39 +617,14 @@
617 they’ll be connected to the network the container runs on. Once the bad
618 guy is inside the house, he doesn’t necessarily have to go after the
619 residents directly to cause problems for them.
620
621
622 #### 6.2.2 <a id="podman-rootful"></a>Fossil in a Rootful Podman Container
623
624 ##### Simple Method
625
626 Fortunately, it’s easy enough to have it both ways. Simply run your
627 `podman` commands as root:
628
629 ```
630 $ sudo podman build -t fossil --cap-add MKNOD .
@@ -875,11 +650,11 @@
650 it’s done inside a container runtime’s build environment doesn’t mean we
651 can get away without root privileges to do things like create the
652 `/jail/dev/null` node.
653
654 The other reason we need “`sudo podman build`” is that it puts the result
655 into root’s Podman image registry, where the next steps look for it.
656
657 That in turn explains why we need “`sudo podman create`:” because it’s
658 creating a container based on an image that was created by root. If you
659 ran that step without `sudo`, it wouldn’t be able to find the image.
660
@@ -927,23 +702,227 @@
702 $ sudo podman create \
703 --any-options-you-like \
704 docker.io/mydockername/fossil
705 ```
706
707 This round-trip through the public image registry has another side
708 benefit: your local system might be a lot faster than your remote one,
709 as when the remote is a small VPS. Even with the overhead of schlepping
710 container images across the Internet, it can be a net win in terms of
711 build time.
712
713
714
715 ### 6.3 <a id="barebones"></a>Bare-Bones OCI Bundle Runners
716
717 If even the Podman stack is too big for you, you still have options for
718 running containers that are considerably slimmer, at a high cost in
719 administrative complexity and lost features.
720
721 Part of the OCI standard is the notion of a “bundle,” being a consistent
722 way to present a pre-built and configured container to the runtime.
723 Essentially, it consists of a directory containing a `config.json` file
724 and a `rootfs/` subdirectory containing the root filesystem image. Many
725 tools can produce these for you. We’ll show only one method in the first
726 section below, then reuse that in the following sections.
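A minimal sketch of that layout, with a temp directory standing in for a real bundle path such as `/var/lib/machines/fossil`; `runc spec` can generate a default `config.json` if you aren’t exporting one from Docker:

```shell
# Skeleton of an OCI bundle: one config file, one root filesystem tree.
b=$(mktemp -d)           # stand-in for the real bundle directory
mkdir -p "$b/rootfs"     # unpacked root filesystem goes here
: > "$b/config.json"     # runtime configuration; "runc spec" writes a default
ls "$b"                  # lists: config.json and rootfs
```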
727
728
729 #### 6.3.1 <a id="runc"></a>`runc`
730
731 We mentioned `runc` [above](#nerdctl), but it’s possible to use it
732 standalone, without `containerd` or its CLI frontend `nerdctl`. Doing
733 so, you also lose the build engine, intelligent image layer sharing,
734 image registry connections, and much more. The plus side is that
735 `runc` alone is only about 18 MiB.
736
737 Using it without all the support tooling isn’t complicated, but it *is*
738 cryptic enough to want a shell script. Let’s say we want to build on our
739 big desktop machine but ship the resulting container to a small remote
740 host. This should serve:
741
742 ----
743
744 ```shell
745 #!/bin/bash -ex
746 c=fossil
747 b=/var/lib/machines/$c
748 h=my-host.example.com
749 m=/run/containerd/io.containerd.runtime.v2.task/moby
750 t=$(mktemp -d /tmp/$c-bundle.XXXXXX)
751
752 if [ -d "$t" ]
753 then
754 docker container start $c
755 docker container export $c > $t/rootfs.tar
756 id=$(docker inspect --format="{{.Id}}" $c)
757 sudo cat $m/$id/config.json \
758 | jq '.root.path = "'$b/rootfs'"' \
759 | jq '.linux.cgroupsPath = ""' \
760 | jq 'del(.linux.sysctl)' \
761 | jq 'del(.linux.namespaces[] | select(.type == "network"))' \
762 | jq 'del(.mounts[] | select(.destination == "/etc/hostname"))' \
763 | jq 'del(.mounts[] | select(.destination == "/etc/resolv.conf"))' \
764 | jq 'del(.mounts[] | select(.destination == "/etc/hosts"))' \
765 | jq 'del(.hooks)' > $t/config.json
766 scp -r $t $h:tmp
767 ssh -t $h "{
768 mv ./$t/config.json $b &&
769 sudo tar -C $b/rootfs -xf ./$t/rootfs.tar &&
770 rm -r ./$t
771 }"
772 rm -r $t
773 fi
774 ```
775
776 ----
777
778 The first several lines list configurables:
779
780 * **`c`**: the name of the Docker container you’re bundling up for use
781 with `runc`
782 * **`b`**: the path of the exported container, called the “bundle” in
783 OCI jargon; we’re using the [`nspawn`](#nspawn) convention, a
784 reasonable choice under the [Linux FHS rules][LFHS]
785 * **`h`**: the remote host name
786 * **`m`**: the local directory holding the running machines, configurable
787 because:
788 * the path name is longer than we want to use inline
789 * it’s been known to change from one version of Docker to the next
790 * you might be building and testing with [Podman](#podman), so it
791 has to be “`/run/user/$UID/crun`” instead
792 * **`t`**: the temporary bundle directory we populate locally, then
793 `scp` to the remote machine, where it’s unpacked
794
795 [LFHS]: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard
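Incidentally, the `t` assignment relies on `mktemp -d` with a template: each run creates a fresh, uniquely named staging directory, so repeated exports can’t collide. A standalone check:

```shell
# Demonstrates the mktemp template the script uses for its staging dir.
c=fossil
t1=$(mktemp -d /tmp/$c-bundle.XXXXXX)
t2=$(mktemp -d /tmp/$c-bundle.XXXXXX)
[ "$t1" != "$t2" ] && echo "distinct: $t1 vs $t2"
rmdir "$t1" "$t2"
```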
796
797
798 ##### Why All That `sudo` Stuff?
799
800 This script uses `sudo` for two different purposes:
801
802 1. To read the local `config.json` file out of the `containerd` managed
803 directory, which is owned by `root` on Docker systems. Additionally,
804 that input file exists only while the container is running, which is
805 why the script starts the container before extracting the file.
806
807 2. To unpack the bundle onto the remote machine. If you try to get
808 clever and unpack it locally, then `rsync` it to the remote host to
809 avoid re-copying files that haven’t changed since the last update,
810 you’ll find that it fails when it tries to copy device nodes, to
811 create files owned only by the remote root user, and so forth. If the
812 container bundle is small, it’s simpler to re-copy and unpack it
813 fresh each time.
814
815 I point all this out because the script may ask for your password twice:
816 once for the local `sudo` command, and once for the remote one.
817
818
819
820 ##### Why All That `jq` Stuff?
821
822 We’re using [jq] for two separate purposes:
823
824 1. To automatically transmogrify Docker’s container configuration so it
825 will work with `runc`:
826
827 * point it where we unpacked the container’s exported rootfs
828 * accede to its wish to [manage cgroups by itself][ecg]
829 * remove the `sysctl` calls that will break after…
830 * …we remove the network namespace to allow Fossil’s TCP listening
831 port to be available on the host; `runc` doesn’t offer the
832 equivalent of `docker create --publish`, and we can’t be
833 bothered to set up a manual mapping from the host port into the
834 container
835 * remove file bindings that point into the local runtime managed
836 directories; one of the things we give up by using a bare
837 container runner is automatic management of these files
838 * remove the hooks for essentially the same reason
839
840 2. To make the Docker-managed machine-readable `config.json` more
841 human-readable, in case there are other things you want changed in
842 this version of the container. Exposing the `config.json` file like
843 this means you don’t have to rebuild the container merely to change
844 a value like a mount point, the kernel capability set, and so forth.
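For example, a post-export tweak of that second kind might look like the following. This is a hypothetical edit, not one the export script performs: it appends a bind mount to the bundle’s mount list, with a stub `config.json` standing in for the exported one and illustrative paths throughout.

```shell
# Append a hypothetical bind mount to an OCI bundle's config.json
# without rebuilding the container image.
cd "$(mktemp -d)"                  # scratch dir with a stub config
echo '{"mounts":[]}' > config.json

jq '.mounts += [{
      "destination": "/museum",
      "type": "bind",
      "source": "/home/fossil/museum",
      "options": ["rbind", "rw"]
    }]' config.json > config.json.new &&
mv config.json.new config.json

jq '.mounts[0].destination' config.json   # "/museum"
```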
845
846
847 ##### Running the Bundle
848
849 With the container exported to a bundle like this, you can start it as:
850
851 ```
852 $ cd /path/to/bundle
853 $ c=fossil-runc ← …or anything else you prefer
854 $ sudo runc create $c
855 $ sudo runc start $c
856 $ sudo runc exec $c -t sh -l
857 ~ $ ls museum
858 repo.fossil
859 ~ $ ps -eaf
860 PID USER TIME COMMAND
861 1 fossil 0:00 bin/fossil server --create …
862 ~ $ exit
863 $ sudo runc kill $c
864 $ sudo runc delete $c
865 ```
866
867 If you’re doing this on the export host and using the variables from
868 the shell script above, the first command is “`cd $b`”. Alternately,
869 the `runc` subcommands that need to read the bundle files take a
870 `--bundle/-b` flag to let you avoid switching directories.
871
872 The rest should be straightforward: create and start the container as
873 root so the `chroot(2)` call inside the container will succeed, then get
874 into it with a login shell and poke around to prove to ourselves that
875 everything is working properly. It is. Yay!
876
877 The remaining commands show shutting the container down and destroying
878 it, simply to show how these commands change relative to using the
879 Docker Engine commands. It’s “kill,” not “stop,” and it’s “delete,” not
880 “rm.”
881
882 [ecg]: https://github.com/opencontainers/runc/pull/3131
883 [jq]: https://stedolan.github.io/jq/
884
885
886 ##### Lack of Layer Sharing
887
888 The bundle export process collapses Docker’s union filesystem down to a
889 single layer. Atop that, it makes all files mutable.
890
891 All of this is fine for tiny remote hosts with a single container, or
892 at least ones where none of the containers share base layers. Where it
893 becomes a problem is when you have multiple Fossil containers on a
894 single host, since they all derive from the same base image.
895
896 The full-featured container runtimes above will intelligently share
897 these immutable base layers among the containers, storing only the
898 differences in each individual container. More, when pulling images from
899 a registry host, they’ll transfer only the layers you don’t have copies
900 of locally, so you don’t have to burn bandwidth sending copies of Alpine
901 and BusyBox each time, even though they’re unlikely to change from one
902 build to the next.
903
904
905 #### 6.3.2 <a id="crun"></a>`crun`
906
907 In the same way that [Docker Engine is based on `runc`](#runc), Podman’s
908 engine is based on [`crun`][crun], a lighter-weight alternative to
909 `runc`. It’s only 1.4 MiB on the system I tested it on, yet it will run
910 the same container bundles as in my `runc` examples above. We saved
911 more than that by compressing the container’s Fossil executable with
912 UPX, making the runtime virtually free in this case. The only question
913 is whether you can put up with its limitations, which are the same as
914 for `runc`.
915
916 [crun]: https://github.com/containers/crun
917
918
919 #### 6.3.3 <a id="nspawn"></a>`systemd-nspawn`
920
921 As of `systemd` version 242, its optional `nspawn` piece
922 [reportedly](https://www.phoronix.com/news/Systemd-Nspawn-OCI-Runtime)
923 got the ability to run OCI bundles directly. You might
924 have it installed already, but if not, it’s only about 2 MiB. It’s
925 in the `systemd-containers` package as of Ubuntu 22.04 LTS:
926
927 ```
928 $ sudo apt install systemd-containers
@@ -963,12 +942,12 @@
942 --port=127.0.0.1:127.0.0.1:9999:8080
943 $ sudo machinectl list
944 No machines.
945 ```
946
947 This is why I wrote “reportedly” above: I couldn’t get it to work on two different
948 Linux distributions, and I can’t see why. I’m leaving this here to give
949 someone else a leg up, with the hope that they will work out what’s
950 needed to get the container running and registered with `machinectl`.
951
952 As of this writing, the tool expects an OCI container version of
953 “1.0.0”. I had to edit this at the top of my `config.json` file to get
954