Fossil SCM
Merged two redundant discussions of the consequences of disabling private network virtualization under systemd-container infrastructure, then added better reasons why the reader might care.
Commit
70554336950bc5bc6e7b6b0582515efb6fa29c2850f714e02340935fbaeb0023
Parent
ad09d3eee0f45b7…
1 file changed
+68
-59
| --- www/containers.md | ||
| +++ www/containers.md | ||
| @@ -942,84 +942,93 @@ | ||
| 942 | 942 | files sucks compared to “`podman container create ...`” This |
| 943 | 943 | is but one of many affordances you will find in the runtimes |
| 944 | 944 | aimed at daily-use devops warriors. |
| 945 | 945 | |
| 946 | 946 | 5. **Network virtualization.** In the scheme above, we turn off the |
| 947 | - `systemd` virtual netorking support because in its default mode, | |
| 948 | - it wants to hide the service entirely. | |
| 949 | - | |
| 950 | - Another way to put this is that `systemd-nspawn --port` does | |
| 951 | - approximately *nothing* of what `docker create --publish` does | |
| 952 | - despite their superficial similarities. | |
| 953 | - | |
| 954 | - For this container, it doesn’t much matter, since it exposes | |
| 955 | - only a single port, and we do want that one port exposed, one way | |
| 956 | - or another. Beyond that, we get all the control we need using | |
| 957 | - Fossil options like `--localhost`. I point this out because in | |
| 958 | - more complex situations, the automatic network setup features of | |
| 959 | - the more featureful runtimes can save a lot of time and hassle. | |
| 960 | - They aren’t doing anything you couldn’t do by hand, but why | |
| 961 | - would you want to, given the choice? | |
| 962 | - | |
| 963 | -I expect there’s a lot more I neglected to think of when creating | |
| 964 | -this list, but I think it suffices to make my case as it is. If you | |
| 965 | -can afford the space of Podman or Docker, I strongly recommend using | |
| 947 | + `systemd` private networking support because in its default mode, it | |
| 948 | + wants to hide containerized services entirely. While there are | |
| 949 | + [ways][ndcmp] to expose Fossil’s single network service port under | |
| 950 | + that scheme, it adds a lot of administration complexity. In the | |
| 951 | + big-boy container runtimes, `docker create --publish` fixes all this | |
| 952 | + up in a single option, whereas `systemd-nspawn --port` does | |
| 953 | + approximately *none* of that despite the command’s superficial | |
| 954 | + similarity. | |
| 955 | + | |
| 956 | + From a purely functional point of view, this isn’t a huge problem if | |
| 957 | + you consider the “inbound” service direction only, being external | |
| 958 | + connections to the Fossil service we’re providing. Since we do want | |
| 959 | + this Fossil service to be exposed — else why are we running it? — we | |
| 960 | + get all the control we need via `fossil server --localhost` and | |
| 961 | + similar options. | |
| 962 | + | |
| 963 | + The complexity of the `systemd` networking infrastructure’s | |
| 964 | + interactions with containers make more sense when you consider the | |
| 965 | + “outbound” path. Consider what happens if you enable Fossil’s | |
| 966 | + optional TH1 docs feature plus its Tcl evaluation feature. That | |
| 967 | + would enable anyone with the rights to commit to your repository the | |
| 968 | + ability to make arbitrary network connections on the Fossil host. | |
| 969 | + Then, let us say you have a client-server DBMS server on that same | |
| 970 | + host, bound to localhost for private use by other services on the | |
| 971 | + machine. Now that DBMS is open to access by a rogue Fossil committer | |
| 972 | + because the host’s loopback interface is mapped directly into the | |
| 973 | + container’s network namespace. | |
| 974 | + | |
| 975 | + Proper network virtualization would protect you in this instance. | |
| 976 | + | |
| 977 | +This author expects that the set of considerations is broader than | |
| 978 | +presented here, but that it suffices to make our case as it is: if you | |
| 979 | +can afford the space of Podman or Docker, we strongly recommend using | |
| 966 | 980 | either of them over the much lower-level `systemd-container` |
| 967 | -infrastructure. | |
| 981 | +infrastructure. You’re getting a considerable amount of value for the | |
| 982 | +higher runtime cost; it isn’t simply overhead for little return. | |
| 968 | 983 | |
| 969 | 984 | (Incidentally, these are essentially the same reasons why we no longer |
| 970 | 985 | talk about the `crun` tool underpinning Podman in this document. It’s |
| 971 | 986 | even more limited, making it even more difficult to administer while |
| 972 | 987 | providing no runtime size advantage. The `runc` tool underpinning |
| 973 | 988 | Docker is even worse on this score, being scarcely easier to use than |
| 974 | 989 | `crun` while having a much larger footprint.) |
| 990 | + | |
| 991 | +[ndcmp]: https://wiki.archlinux.org/title/systemd-networkd#Usage_with_containers | |
| 975 | 992 | |
| 976 | 993 | |
| 977 | 994 | ### 6.3.3 <a id="nspawn-assumptions"></a>Violated Assumptions |
| 978 | 995 | |
| 979 | 996 | The `systemd-container` infrastructure has a bunch of hard-coded |
| 980 | 997 | assumptions baked into it. We papered over these problems above, |
| 981 | 998 | but if you’re using these tools for other purposes on the machine |
| 982 | 999 | you’re serving Fossil from, you may need to know which assumptions |
| 983 | -our container violates and the resulting consequences: | |
| 984 | - | |
| 985 | -1. `systemd-nspawn` works best with `machinectl`, but if you haven’t | |
| 986 | - got `btrfs` available, you run into [trouble](#nspawn-rhel). | |
| 987 | - | |
| 988 | -2. Our stock container starts a single static executable inside | |
| 989 | - a stripped-to-the-bones container rather than “boot” an OS | |
| 990 | - image, causing a bunch of commands to fail: | |
| 991 | - | |
| 992 | - * **`machinectl poweroff`** will fail because the container | |
| 993 | - isn’t running dbus. | |
| 994 | - * **`machinectl start`** will try to find an `/sbin/init` | |
| 995 | - program in the rootfs, which we haven’t got. We could | |
| 996 | - rename `/jail/bin/fossil` to `/sbin/init` and then hack | |
| 997 | - the chroot scheme to match, but ick. (This, incidentally, | |
| 998 | - is why we set `ProcessTwo=yes` above even though Fossil is | |
| 999 | - perfectly capable of running as PID 1, a fact we depend on | |
| 1000 | - in the other methods above.) | |
| 1001 | - * **`machinectl shell`** will fail because there is no login | |
| 1002 | - daemon running, which we purposefully avoided adding by | |
| 1003 | - creating a “`FROM scratch`” container. (If you need a | |
| 1004 | - shell, say: `sudo systemd-nspawn --machine=myproject /bin/sh`) | |
| 1005 | - * **`machinectl status`** won’t give you the container logs | |
| 1006 | - because we disabled the shared journal, which was in turn | |
| 1007 | - necessary because we don’t run `systemd` *inside* the | |
| 1008 | - container, just outside. | |
| 1009 | - | |
| 1010 | - If these are problems for you, you may wish to build a | |
| 1011 | - fatter container using `debootstrap` or similar. ([External | |
| 1012 | - tutorial][medtut].) | |
| 1013 | - | |
| 1014 | -3. We disable the “private networking” feature since the whole | |
| 1015 | - point of this container is to expose a network service to the | |
| 1016 | - public, one way or another. If you do things the way the defaults | |
| 1017 | - (and thus the official docs) expect, you must push through | |
| 1018 | - [a whole lot of complexity][ndcmp] to re-expose this single | |
| 1019 | - network port. That complexity is justified only if your service | |
| 1020 | - is itself complex, having both private and public service ports. | |
| 1000 | +our container violates and the resulting consequences. | |
| 1001 | + | |
| 1002 | +Some of it we discussed above already, but there’s one big class of | |
| 1003 | +problems we haven’t covered yet. It stems from the fact that our stock | |
| 1004 | +container starts a single static executable inside a barebones container | |
| 1005 | +rather than “boot” an OS image. That causes a bunch of commands to fail: | |
| 1006 | + | |
| 1007 | +* **`machinectl poweroff`** will fail because the container | |
| 1008 | + isn’t running dbus. | |
| 1009 | + | |
| 1010 | +* **`machinectl start`** will try to find an `/sbin/init` | |
| 1011 | + program in the rootfs, which we haven’t got. We could | |
| 1012 | + rename `/jail/bin/fossil` to `/sbin/init` and then hack | |
| 1013 | + the chroot scheme to match, but ick. (This, incidentally, | |
| 1014 | + is why we set `ProcessTwo=yes` above even though Fossil is | |
| 1015 | + perfectly capable of running as PID 1, a fact we depend on | |
| 1016 | + in the other methods above.) | |
| 1017 | + | |
| 1018 | +* **`machinectl shell`** will fail because there is no login | |
| 1019 | + daemon running, which we purposefully avoided adding by | |
| 1020 | + creating a “`FROM scratch`” container. (If you need a | |
| 1021 | + shell, say: `sudo systemd-nspawn --machine=myproject /bin/sh`) | |
| 1022 | + | |
| 1023 | +* **`machinectl status`** won’t give you the container logs | |
| 1024 | + because we disabled the shared journal, which was in turn | |
| 1025 | + necessary because we don’t run `systemd` *inside* the | |
| 1026 | + container, just outside. | |
| 1027 | + | |
| 1028 | +If these are problems for you, you may wish to build a | |
| 1029 | +fatter container using `debootstrap` or similar. ([External | |
| 1030 | +tutorial][medtut].) | |
| 1021 | 1031 | |
| 1022 | 1032 | [medtut]: https://medium.com/@huljar/setting-up-containers-with-systemd-nspawn-b719cff0fb8d |
| 1023 | -[ndcmp]: https://wiki.archlinux.org/title/systemd-networkd#Usage_with_containers | |
| 1024 | 1033 | |
| 1025 | 1034 | <div style="height:50em" id="this-space-intentionally-left-blank"></div> |
| 1026 | 1035 |
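
The "outbound" concern in the new text of item 5 can be checked empirically. What follows is a minimal Python sketch, not part of this commit, that probes whether a loopback-bound TCP port is reachable from the current network namespace. The function name `loopback_port_open` and the example port 5432 are assumptions chosen for illustration; neither comes from the Fossil documentation.

```python
import socket


def loopback_port_open(port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on 127.0.0.1:port."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is listening on that address in this namespace.
        with socket.create_connection(("127.0.0.1", port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 5432 is only an example (PostgreSQL's default port); substitute the
    # port of whatever localhost-only service runs on your host.
    port = 5432
    state = "reachable" if loopback_port_open(port) else "not reachable"
    print(f"127.0.0.1:{port} is {state} from this network namespace")
```

If this reports a host-only service as reachable when run from inside the container, the container is sharing the host's loopback interface, which is the exposure the new text describes. With a private network namespace the container gets its own loopback interface, so the same probe would be expected to fail.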