Fossil SCM
Expanded the paragraph on WAL mode interactions in the container doc into a full section, placed higher up, immediately after the first use of Docker's "--volume" flag, to explain why we don't map just the repo DB file, but the whole directory it sits in. Even if we later convince ourselves WAL is safe under this scenario, it'll be conditional at best, so some remnant of this section must remain, no matter which way the experiments go.
Commit
698587d41d0110b08b23bd007d510ab624a603d0b7fefae160d391cc70bb011d
Parent
b0c9c26a9c7fadb…
1 file changed
+80
-18
--- www/containers.md
+++ www/containers.md
@@ -120,33 +120,104 @@
 destroyed, too. The solution is to replace the “run” command above with
 the following:
 
 ```
 $ docker run \
-    --name fossil-bind-mount -p 9999:8080 \
-    -v ~/museum:/jail/museum fossil
+    --publish 9999:8080 \
+    --name fossil-bind-mount \
+    --volume ~/museum:/jail/museum \
+    fossil
 ```
 
 Because this bind mount maps a host-side directory (`~/museum`) into the
 container, you don’t need to `docker cp` the repo into the container at
 all. It still expects to find the repository as `repo.fossil` under that
-directory, but now both the host and the container can see that file.
-(Beware: This may create a [risk of data corruption][dbcorr] due to
-SQLite locking issues if you try to modify the DB from both sides at
-once.)
+directory, but now both the host and the container can see that repo DB.
 
 Instead of a bind mount, you could instead set up a separate [Docker
 volume](https://docs.docker.com/storage/volumes/), at which point you
 _would_ need to `docker cp` the repo file into the container.
 
 Either way, files in these mounted directories have a lifetime
 independent of the container(s) they’re mounted into. When you need to
 rebuild the container or its underlying image — such as to upgrade to a
 newer version of Fossil — the external directory remains behind and gets
-remapped into the new container when you recreate it with `-v`.
+remapped into the new container when you recreate it with `--volume/-v`.
 
-[dbcorr]: https://www.sqlite.org/howtocorrupt.html
+
+#### 2.2.1 <a id="wal-mode"></a>WAL Mode Interactions
+
+You might be aware that OCI containers allow mapping a single file into
+the container rather than a whole directory. Since Fossil repositories
+are specially-formatted SQLite databases, you might be wondering why we
+don’t say things like:
+
+```
+    --volume ~/museum/my-project.fossil:/jail/museum/repo.fossil
+```
+
+That lets us have a convenient file name for the project outside the
+container while letting the configuration inside the container refer to
+the generic “`/museum/repo.fossil`” name. Why should we have to name
+the repo file generically on the outside just to placate the container?
+
+The reason is that you might be serving that repo with [WAL mode][wal]
+enabled. If you map the repo DB alone into the container, the Fossil
+instance inside the container will write the `-journal` and `-wal` files
+alongside the mapped-in repository inside the container. That’s fine as
+far as it goes, but if you then try using the same repo DB from outside
+the container while there’s an active WAL, the Fossil instance outside
+won’t know about it. It will think it needs to write *its own*
+`-journal` and `-wal` files *outside* the container, creating a high
+risk of [database corruption][dbcorr].
+
+If we map a whole directory, both sides see the same set of WAL files,
+so there is at least a *hope* that WAL will work properly across that
+boundary. The success of the scheme depends on the `mmap()` and shared
+memory system calls being coordinated properly by the OS kernel the two
+worlds share.
+
+At some point, someone should perform tests in the hopes of *failing* to
+create database corruption in this scenario.
+
+Why the tortured grammar? Because you cannot prove a negative, being in
+this case “SQLite will not corrupt the database in WAL mode if there’s a
+container barrier in the way.” All you can prove is that a given test
+didn’t cause corruption. With enough tests of sufficient power, you can
+begin to make definitive statements, but even then, science is always
+provisional, awaiting a single disproving experiment. Atop that, OCI
+container runtimes give the sysadmin freedom to impose barriers between
+the two worlds, so even if you convince yourself that WAL mode is safe
+in a given setup, it’s possible to configure it to fail. As if that
+weren’t enough, different container runtimes have different defaults,
+including details like whether shared memory is truly shared between
+the host and its containers.
+
+Until someone gets around to establishing this ground truth and scoping
+its applicable range, my advice to those who want to use WAL mode on
+containerized servers is to map the whole directory as shown in these
+examples, but then isolate the two sides with a secondary clone. On the
+outside, you say something like this:
+
+```
+    $ fossil clone https://[email protected]/myproject ~/museum/myproject.fossil
+```
+
+That lands you with two side-by-side clones of the repository on the
+server:
+
+```
+    ~/museum/myproject.fossil       ← local-use clone
+    ~/museum/myproject/repo.fossil  ← served by container only
+```
+
+You open the secondary clone for local use, not the one being served by
+the container. When you commit, Fossil’s autosync feature pushes the
+change up through the HTTPS link to land safely inside the container.
+
+[dbcorr]: https://www.sqlite.org/howtocorrupt.html#_deleting_a_hot_journal
+[wal]: https://www.sqlite.org/wal.html
 
 
 ## 3. <a id="security"></a>Security
 
 ### 3.1 <a id="chroot"></a>Why Chroot?
@@ -734,23 +805,14 @@
 $ make reconfig     # re-generate Dockerfile from the changed .in file
 $ docker build -t fossil:nojail .
 $ docker create \
     --name fossil-nojail \
     --publish 127.0.0.1:9999:8080 \
-    --volume ~/museum/my-project.fossil:/museum/repo.fossil \
+    --volume ~/museum:/museum \
     fossil:nojail
 ```
 
-This shows a new trick: mapping a single file into the container, rather
-than mapping a whole directory. That’s only suitable if you aren’t using
-WAL mode on that repository, or you aren’t going to use that repository
-outside the container. It isn’t yet clear to me if WAL can work safely
-across the container boundary, so for now, I advise that you either do
-not use WAL mode with these containers, or that you clone the repository
-locally for use outside the container and rely on Fossil’s autosync
-feature to keep the two copies synchronized.
-
 Do realize that by doing this, if an attacker ever managed to get shell
 access on your container, they’d have a BusyBox installation to play
 around in. That shouldn’t be enough to let them break out of the
 container entirely, but they’ll have powerful tools like `wget`, and
 they’ll be connected to the network the container runs on. Once the bad
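The hazard this change documents is easy to demonstrate without a container at all: in WAL mode, SQLite keeps its write-ahead log and shared-memory index as sidecar files right next to the database file, so a single-file mapping would leave them outside the mapping. A minimal sketch using plain Python and its bundled `sqlite3` module (a throwaway DB stands in for the Fossil repo; this is not Fossil itself):

```python
import os
import sqlite3
import tempfile

# A throwaway directory holding a SQLite DB, standing in for ~/museum/repo.fossil.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "repo.fossil")

conn = sqlite3.connect(db_path)
conn.execute("PRAGMA journal_mode=WAL")   # switch the DB to WAL mode
conn.execute("CREATE TABLE t (x)")        # any write creates the sidecar files
conn.execute("INSERT INTO t VALUES (1)")

# While a WAL-mode connection is open, the log (-wal) and shared-memory
# index (-shm) sit beside the DB file. A --volume mapping of repo.fossil
# alone would not carry these into (or out of) the container.
print(sorted(os.listdir(workdir)))
# → ['repo.fossil', 'repo.fossil-shm', 'repo.fossil-wal']

conn.close()
```

Mapping the enclosing directory, as the revised examples do, at least lets both sides of the container boundary see the same trio of files; whether the locking and shared memory behind them coordinate correctly across that boundary is exactly the open question the new section leaves to experiment.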