893fca3… drh 1 <title>CGI Server Extensions</title>
893fca3… drh 2
4130a22… drh 3 <h2>1.0 Introduction</h2>
4130a22… drh 4
f146e21… drh 5 If you have a [./server/|Fossil server] for your project,
43fb402… drh 6 you can add [./aboutcgi.wiki|CGI]
4130a22… drh 7 extensions to that server. These extensions work like
4130a22… drh 8 any other CGI program, except that they also have access to the Fossil
71499f1… drh 9 login information and can (optionally) leverage the "[./customskin.md|skins]"
71499f1… drh 10 of Fossil so that they appear to be more tightly integrated into the project.
893fca3… drh 11
fd1282e… drh 12 An example of where this is useful is the
4130a22… drh 13 [https://sqlite.org/src/ext/checklist|checklist application] on
4130a22… drh 14 the [https://sqlite.org/|SQLite] project. The checklist
893fca3… drh 15 helps the SQLite developers track which release tests have passed,
4130a22… drh 16 or failed, or are still to be done. The checklist program began as a
893fca3… drh 17 stand-alone CGI which kept its own private user database and implemented
4130a22… drh 18 its own permissions and login system and provided its own CSS. By
fd1282e… drh 19 converting checklist into a Fossil extension, the same login that works
4130a22… drh 20 for the [https://sqlite.org/src|main SQLite source repository] also works
4130a22… drh 21 for the checklist. Permission to change elements of the checklist
4130a22… drh 22 is tied to permission to check in to the main source repository. And
4130a22… drh 23 the standard Fossil header menu and footer appear on each page of
4130a22… drh 24 the checklist.
4130a22… drh 25
4130a22… drh 26 <h2>2.0 How It Works</h2>
4130a22… drh 27
4130a22… drh 28 CGI Extensions are disabled by default.
893fca3… drh 29 An administrator activates the CGI extension mechanism by specifying
71499f1… drh 30 an "Extension Root Directory" or "extroot" as part of the
71499f1… drh 31 [./server/index.html|server setup].
fd1282e… drh 32 If the Fossil server is itself run as
fd1282e… drh 33 [./server/any/cgi.md|CGI], then add a line to the
fbc3b2f… drh 34 [./cgi.wiki#extroot|CGI script file] that says:
fbc3b2f… drh 35
8a1ba49… wyoung 36 <pre>
7f82b38… drh 37 extroot: <i>DIRECTORY</i>
8a1ba49… wyoung 38 </pre>
fd1282e… drh 39
fd1282e… drh 40 Or, if the Fossil server is being run using the
fd1282e… drh 41 "[./server/any/none.md|fossil server]" or
fd1282e… drh 42 "[./server/any/none.md|fossil ui]" or
fd1282e… drh 43 "[./server/any/inetd.md|fossil http]" commands, then add an extra
893fca3… drh 44 "--extroot <i>DIRECTORY</i>" option to that command.
893fca3… drh 45
4130a22… drh 46 The <i>DIRECTORY</i> is the DOCUMENT_ROOT for the CGI.
4130a22… drh 47 Files in the DOCUMENT_ROOT are accessed via URLs like this:
893fca3… drh 48
8a1ba49… wyoung 49 <pre>
7f82b38… drh 50 https://example-project.org/ext/<i>FILENAME</i>
8a1ba49… wyoung 51 </pre>
893fca3… drh 52
893fca3… drh 53 In other words, access files in DOCUMENT_ROOT by appending the filename
c64f28d… drh 54 relative to DOCUMENT_ROOT to the [/help/www/ext|/ext]
893fca3… drh 55 page of the Fossil server.
7f82b38… drh 56
7f82b38… drh 57 * Files that are readable but not executable are returned as static
7f82b38… drh 58 content.
7f82b38… drh 59
7f82b38… drh 60 * Files that are executable are run as CGI.
4130a22… drh 61
4130a22… drh 62 <h3>2.1 Example #1</h3>
4130a22… drh 63
4130a22… drh 64 The source code repository for SQLite is a Fossil server that is run
4130a22… drh 65 as CGI. The URL for the source code repository is [https://sqlite.org/src].
4130a22… drh 66 The CGI script looks like this:
4130a22… drh 67
8a1ba49… wyoung 68 <verbatim>
4130a22… drh 69 #!/usr/bin/fossil
4130a22… drh 70 repository: /fossil/sqlite.fossil
4130a22… drh 71 errorlog: /logs/errors.txt
4130a22… drh 72 extroot: /sqlite-src-ext
8a1ba49… wyoung 73 </verbatim>
4130a22… drh 74
4130a22… drh 75 The "extroot: /sqlite-src-ext" line tells Fossil that it should look for
4130a22… drh 76 extension CGIs in the /sqlite-src-ext directory. (All of this is happening
4130a22… drh 77 inside of a chroot jail, so putting the document root in a top-level
4130a22… drh 78 directory is a reasonable thing to do.)
4130a22… drh 79
4130a22… drh 80 When a URL like "https://sqlite.org/src/ext/checklist" is received by the
4130a22… drh 81 main webserver, it figures out that the /src part refers to the main
4130a22… drh 82 Fossil CGI script and so it runs that script. Fossil gets the remainder
4130a22… drh 83 of the URL to work with: "/ext/checklist". Fossil extracts the "/ext"
4130a22… drh 84 prefix and uses that to determine that this is a CGI extension request.
4130a22… drh 85 Then it takes the leftover "/checklist" part and appends it to the
4130a22… drh 86 "extroot" to get the filename "/sqlite-src-ext/checklist". Fossil finds
4130a22… drh 87 that file to be executable, so it runs it as CGI and returns the result.
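The lookup described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not Fossil's actual implementation. The function name <tt>resolve_ext_path</tt> and its return values are hypothetical, and the sketch ignores any extra path components that follow the script name.

```python
import os

def resolve_ext_path(path_info, extroot):
    """Map the URL remainder that Fossil sees to a file under extroot.

    Returns (filename, kind), where kind is "cgi", "static", or None.
    Hypothetical helper for illustration only.
    """
    prefix = "/ext"
    # Only URLs that begin with the /ext prefix are extension requests.
    if not (path_info == prefix or path_info.startswith(prefix + "/")):
        return None, None
    # Append the leftover part of the URL to the extension root.
    leftover = path_info[len(prefix):]
    filename = extroot.rstrip("/") + leftover
    if not os.path.isfile(filename):
        return filename, None
    # Executable files are run as CGI; readable ones are static content.
    if os.access(filename, os.X_OK):
        return filename, "cgi"
    if os.access(filename, os.R_OK):
        return filename, "static"
    return filename, None
```

With the settings from this example, <tt>resolve_ext_path("/ext/checklist", "/sqlite-src-ext")</tt> yields the filename "/sqlite-src-ext/checklist".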
4130a22… drh 88
4130a22… drh 89 The /sqlite-src-ext/checklist file is a
4130a22… drh 90 [https://wapp.tcl.tk|Wapp program]. The current source code to
4130a22… drh 91 this program can be seen at
b6c36e8… drh 92 [https://www.sqlite.org/src/ext/checklist/3070700/self] and
4130a22… drh 93 recent historical versions are available at
4130a22… drh 94 [https://sqlite.org/docsrc/finfo/misc/checklist.tcl] with
78819fd… drh 95 older legacy at [https://sqlite.org/checklistapp/timeline?n1=all].
4130a22… drh 96
0996347… wyoung 97 There is a cascade of CGIs happening here. The web server that receives
4130a22… drh 98 the initial HTTP request runs Fossil as a CGI based on the
4130a22… drh 99 "https://sqlite.org/src" portion of the URL. The Fossil instance then
4130a22… drh 100 runs the checklist sub-CGI based on the "/ext/checklist" suffix. The
4130a22… drh 101 output of the sub-CGI is read by Fossil and then relayed on to the
0996347… wyoung 102 main web server which in turn relays the result back to the original client.
4130a22… drh 103
4130a22… drh 104 <h3>2.2 Example #2</h3>
4130a22… drh 105
4130a22… drh 106 The [https://fossil-scm.org/home|Fossil self-hosting repository] is also
4130a22… drh 107 a CGI that looks like this:
4130a22… drh 108
8a1ba49… wyoung 109 <verbatim>
4130a22… drh 110 #!/usr/bin/fossil
4130a22… drh 111 repository: /fossil/fossil.fossil
4130a22… drh 112 errorlog: /logs/errors.txt
4130a22… drh 113 extroot: /fossil-extroot
8a1ba49… wyoung 114 </verbatim>
4130a22… drh 115
4130a22… drh 116 The extroot for this Fossil server is /fossil-extroot and in that directory
4130a22… drh 117 is an executable file named "fileup1" - another [https://wapp.tcl.tk|Wapp]
fbc3b2f… drh 118 script. (The extension mechanism is not required to use Wapp. You can use
4130a22… drh 119 any kind of program you like. But the creator of SQLite and Fossil is fond
4130a22… drh 120 of [https://www.tcl.tk|Tcl/Tk] and so he tends to gravitate toward Tcl-based
4130a22… drh 121 technologies like Wapp.) The fileup1 script is a demo program that lets
4130a22… drh 122 the user upload a file using a form, and then displays that file in the reply.
4130a22… drh 123 There is a link on the page that causes the fileup1 script to return a copy
4130a22… drh 124 of its own source-code, so you can see how it works.
4130a22… drh 125
7f82b38… drh 126 <h3>2.3 Example #3</h3>
7f82b38… drh 127
7f82b38… drh 128 For Fossil versions dated 2025-03-23 and later, the "--extpage FILENAME"
c64f28d… drh 129 option to the [/help/ui|fossil ui] command is a short cut that treats
7f82b38… drh 130 FILENAME as a CGI extension. When the ui command starts up a new web browser
7f82b38… drh 131 page, it points that page to the FILENAME extension. So if FILENAME is
7f82b38… drh 132 a static content file (such as an HTML file or
7f82b38… drh 133 [/md_rules|Markdown] or [/wiki_rules|Wiki] document), then the
7f82b38… drh 134 rendered content of the file is displayed. Meanwhile, the user can be
7f82b38… drh 135 editing the source text for that document in a separate window, and
7f82b38… drh 136 periodically pressing "Reload" on the web browser to instantly view the
7f82b38… drh 137 rendered results.
7f82b38… drh 138
7f82b38… drh 139 For example, the author of this documentation page is running
7f82b38… drh 140 "<tt>fossil ui --extpage www/serverext.wiki</tt>" while editing this
7f82b38… drh 141 very paragraph, and presses Reload from time to time to view his
7f82b38… drh 142 edits.
7f82b38… drh 143
bd70ec5… drh 144 The same idea applies when developing new CGI applications using a script
7f82b38… drh 145 language (for example using [https://wapp.tcl.tk|Wapp]). Run the
7f82b38… drh 146 command "<tt>fossil ui --extpage SCRIPT</tt>" where SCRIPT is the name
7f82b38… drh 147 of the application script, while editing that script in a separate
7f82b38… drh 148 window, then press Reload periodically on the web browser to test the
7f82b38… drh 149 script.
7f82b38… drh 150
6eeb7ec… florian 151 <h2 id="cgi-inputs">3.0 CGI Inputs</h2>
4130a22… drh 152
4130a22… drh 153 The /ext extension mechanism is an ordinary CGI interface. Parameters
4130a22… drh 154 are passed to the CGI program using environment variables. The following
7f82b38… drh 155 standard CGI environment variables are supplied:
4130a22… drh 156
4130a22… drh 157 * AUTH_TYPE
4130a22… drh 158 * AUTH_CONTENT
4130a22… drh 159 * CONTENT_LENGTH
4130a22… drh 160 * CONTENT_TYPE
4130a22… drh 161 * DOCUMENT_ROOT
4130a22… drh 162 * GATEWAY_INTERFACE
f101e94… drh 163 * HTTPS
4130a22… drh 164 * HTTP_ACCEPT
4130a22… drh 165 * HTTP_ACCEPT_ENCODING
4130a22… drh 166 * HTTP_COOKIE
4130a22… drh 167 * HTTP_HOST
4130a22… drh 168 * HTTP_IF_MODIFIED_SINCE
4130a22… drh 169 * HTTP_IF_NONE_MATCH
4130a22… drh 170 * HTTP_REFERER
4130a22… drh 171 * HTTP_USER_AGENT
4130a22… drh 172 * PATH_INFO
4130a22… drh 173 * QUERY_STRING
4130a22… drh 174 * REMOTE_ADDR
4130a22… drh 175 * REMOTE_USER
4130a22… drh 176 * REQUEST_METHOD
282bdf0… drh 177 * REQUEST_SCHEME
4130a22… drh 178 * REQUEST_URI
4130a22… drh 179 * SCRIPT_DIRECTORY
4130a22… drh 180 * SCRIPT_FILENAME
4130a22… drh 181 * SCRIPT_NAME
4130a22… drh 182 * SERVER_NAME
4130a22… drh 183 * SERVER_PORT
4130a22… drh 184 * SERVER_PROTOCOL
6eeb7ec… florian 185 * SERVER_SOFTWARE
4130a22… drh 186
4130a22… drh 187 Do a web search for
4130a22… drh 188 "[https://duckduckgo.com/?q=cgi+environment_variables|cgi environment variables]"
4130a22… drh 189 to find more detail about what each of the above variables means and how
4130a22… drh 190 they are used.
4130a22… drh 191 Live listings of the values of some or all of these environment variables
4130a22… drh 192 can be found at links like these:
4130a22… drh 193
5df726a… drh 194 * [https://fossil-scm.org/home/test-env]
fbc3b2f… drh 195 * [https://sqlite.org/src/ext/checklist/top/env]
4130a22… drh 196
fd1282e… drh 197 In addition to the standard CGI environment variables listed above,
4130a22… drh 198 Fossil adds the following:
4130a22… drh 199
4130a22… drh 200 * FOSSIL_CAPABILITIES
fbc3b2f… drh 201 * FOSSIL_NONCE
4130a22… drh 202 * FOSSIL_REPOSITORY
7b2b9d6… drh 203 * FOSSIL_URI
4130a22… drh 204 * FOSSIL_USER
4130a22… drh 205
4130a22… drh 206 The FOSSIL_USER string is the name of the logged-in user. This variable
4130a22… drh 207 is missing or is an empty string if the user is not logged in. The
fd1282e… drh 208 FOSSIL_CAPABILITIES string is a list of
779ddef… wyoung 209 [./caps/ref.html|Fossil capabilities] that
4130a22… drh 210 indicate what permissions the user has on the Fossil repository.
4130a22… drh 211 The FOSSIL_REPOSITORY environment variable gives the filename of the
7b2b9d6… drh 212 Fossil repository that is running. The FOSSIL_URI variable shows the
7b2b9d6… drh 213 prefix of the REQUEST_URI that is the Fossil CGI script, or is an
7b2b9d6… drh 214 empty string if Fossil is being run by some method other than CGI.
4130a22… drh 215
4130a22… drh 216 The [https://sqlite.org/src/ext/checklist|checklist application] uses the
4130a22… drh 217 FOSSIL_USER environment variable to determine the name of the user and
4130a22… drh 218 the FOSSIL_CAPABILITIES variable to determine if the user is allowed to
4130a22… drh 219 mark off changes to the checklist. Only users with check-in permission
4130a22… drh 220 to the Fossil repository are allowed to mark off checklist items. That
4130a22… drh 221 means that the FOSSIL_CAPABILITIES string must contain the letter "i".
4130a22… drh 222 Search for "FOSSIL_CAPABILITIES" in the
fbc3b2f… drh 223 [https://sqlite.org/src/ext/checklist/top/self|source listing] to see how
4130a22… drh 224 this happens.
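As a rough sketch of the same check in Python (the checklist itself is a Wapp/Tcl program, so this is not its actual code), an extension could test for the "i" capability as follows. The helper names <tt>may_check_in</tt> and <tt>current_user</tt> are hypothetical.

```python
import os

def may_check_in(environ=os.environ):
    """True if FOSSIL_CAPABILITIES contains the check-in letter "i"."""
    return "i" in environ.get("FOSSIL_CAPABILITIES", "")

def current_user(environ=os.environ):
    """Name of the logged-in user, or None if nobody is logged in.

    FOSSIL_USER is missing or empty when the user is not logged in.
    """
    return environ.get("FOSSIL_USER") or None
```

Passing the environment as a parameter, rather than reading <tt>os.environ</tt> directly, makes the helpers easy to exercise outside of a CGI request.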
fbc3b2f… drh 225
fbc3b2f… drh 226 If the CGI output is one of the forms for which Fossil inserts its own
fbc3b2f… drh 227 header and footer, then the inserted header will include a
fbc3b2f… drh 228 Content Security Policy (CSP) restriction on the use of javascript within
fd1282e… drh 229 the webpage. Any &lt;script&gt;...&lt;/script&gt; elements within the
fbc3b2f… drh 230 CGI output must include a nonce or else they will be suppressed by the
fbc3b2f… drh 231 web browser. The FOSSIL_NONCE variable contains the value of that nonce.
fbc3b2f… drh 232 So, in other words, to get javascript to work, it must be enclosed in:
fbc3b2f… drh 233
8a1ba49… wyoung 234 <verbatim>
fbc3b2f… drh 235 <script nonce='$FOSSIL_NONCE'>...</script>
8a1ba49… wyoung 236 </verbatim>
fbc3b2f… drh 237
5590fb9… stephan 238 Except, of course, the $FOSSIL_NONCE is replaced by the value of the
fbc3b2f… drh 239 FOSSIL_NONCE environment variable.
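A minimal Python sketch of that substitution follows. The function name <tt>script_tag</tt> is hypothetical, and escaping the nonce before placing it in the attribute is a defensive assumption rather than something this document requires.

```python
import html
import os

def script_tag(js, environ=os.environ):
    """Wrap JavaScript source in a <script> element carrying the nonce."""
    nonce = environ.get("FOSSIL_NONCE", "")
    # Escape the nonce so it is always safe inside a quoted attribute.
    return "<script nonce='%s'>%s</script>" % (
        html.escape(nonce, quote=True), js)
```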
5590fb9… stephan 240
7874664… drh 241 <h3>3.1 Input Content</h3>
7874664… drh 242
4130a22… drh 243 If the HTTP request includes content (for example if this is a POST request)
4130a22… drh 244 then the CONTENT_LENGTH value will be positive and the data for the content
4130a22… drh 245 will be readable on standard input.
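For instance, a Python extension might read the request body along these lines. This is a sketch only. The function name <tt>read_request_body</tt> is hypothetical, and treating a missing or malformed CONTENT_LENGTH as "no content" is an assumption.

```python
import os
import sys

def read_request_body(environ=os.environ, stream=None):
    """Return the request content as bytes, or b"" if there is none."""
    try:
        length = int(environ.get("CONTENT_LENGTH", "0") or 0)
    except ValueError:
        length = 0  # assumption: treat a malformed value as no content
    if length <= 0:
        return b""
    if stream is None:
        stream = sys.stdin.buffer  # raw bytes from standard input
    # Read exactly CONTENT_LENGTH bytes, no more.
    return stream.read(length)
```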
7874664… drh 246
e3fbbdc… stephan 247
4130a22… drh 248 <h2>4.0 CGI Outputs</h2>
4130a22… drh 249
4130a22… drh 250 CGI programs construct a reply by writing to standard output. The first
0996347… wyoung 251 few lines of output are parameters intended for the web server that invoked
4130a22… drh 252 the CGI. These are followed by a blank line and then the content.
4130a22… drh 253
4130a22… drh 254 Typical parameter output looks like this:
4130a22… drh 255
8a1ba49… wyoung 256 <verbatim>
b92e460… wyoung 257 Status: 200 OK
4130a22… drh 258 Content-Type: text/html
8a1ba49… wyoung 259 </verbatim>
4130a22… drh 260
4130a22… drh 261 CGI programs can return any content type they want - they are not restricted
4130a22… drh 262 to text replies. It is OK for a CGI program to return (for example)
4130a22… drh 263 image/png.
4130a22… drh 264
4130a22… drh 265 The fields of the CGI response header can be any valid HTTP header fields.
4130a22… drh 266 Those that Fossil does not understand are simply relayed back up the
4130a22… drh 267 line to the requester.
4130a22… drh 268
4130a22… drh 269 Fossil takes special action with some content types. If the Content-Type
85c58af… drh 270 is "text/x-fossil-wiki" or "text/x-markdown" then Fossil
fd1282e… drh 271 converts the content from [/wiki_rules|Fossil-Wiki] or
4130a22… drh 272 [/md_rules|Markdown] into HTML, adding its
4130a22… drh 273 own header and footer text according to the repository skin. Content
4130a22… drh 274 of type "text/html" is normally passed straight through
4130a22… drh 275 unchanged. However, if the text/html content is of the form:
4130a22… drh 276
8a1ba49… wyoung 277 <verbatim>
4130a22… drh 278 <div class='fossil-doc' data-title='DOCUMENT TITLE'>
4130a22… drh 279 ... HTML content there ...
4130a22… drh 280 </div>
8a1ba49… wyoung 281 </verbatim>
4130a22… drh 282
4130a22… drh 283 In other words, if the outer-most markup of the HTML is a &lt;div&gt;
fd1282e… drh 284 element with a single class of "fossil-doc",
4130a22… drh 285 then Fossil will add its own header and footer to the HTML. The
4130a22… drh 286 page title contained in the added header will be extracted from the
4130a22… drh 287 "data-title" attribute.
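Putting the response header and the "fossil-doc" wrapper together, a Python extension could build its reply roughly as follows. The function name <tt>fossil_doc_response</tt> is hypothetical. The caller is assumed to supply body HTML that is already safe to emit.

```python
import html

def fossil_doc_response(title, body_html):
    """Build a CGI reply that asks Fossil to add its header and footer."""
    return (
        "Status: 200 OK\n"
        "Content-Type: text/html\n"
        "\n"  # a blank line separates the parameters from the content
        "<div class='fossil-doc' data-title='%s'>\n%s\n</div>\n"
        % (html.escape(title, quote=True), body_html)
    )
```

The script would then write the returned string to standard output, for example with <tt>sys.stdout.write(fossil_doc_response("Demo", "&lt;p&gt;Hello&lt;/p&gt;"))</tt>.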
4130a22… drh 288
4130a22… drh 289 Except for the three cases noted above, Fossil makes no changes or
4130a22… drh 290 additions to the CGI-generated content. Fossil just passes the verbatim
4130a22… drh 291 content back up the stack towards the requester.
e3fbbdc… stephan 292
e3fbbdc… stephan 293 <h3>4.1 <tt>GATEWAY_INTERFACE</tt> and Recursive Calls to fossil</h3>
e3fbbdc… stephan 294
e3fbbdc… stephan 295 Like many CGI-aware applications, if fossil sees the environment
e3fbbdc… stephan 296 variable <tt>GATEWAY_INTERFACE</tt> when it starts up, it assumes it
e3fbbdc… stephan 297 is running in a CGI environment and behaves differently than when it
e3fbbdc… stephan 298 is run in a non-CGI interactive session. If you intend to run fossil
e3fbbdc… stephan 299 itself from within an extension CGI script, e.g. to run a query
e3fbbdc… stephan 300 against the repository or simply fetch the fossil binary version, make
e3fbbdc… stephan 301 sure to <em>unset</em> the <tt>GATEWAY_INTERFACE</tt> environment
e3fbbdc… stephan 302 variable before doing so, otherwise the invocation will behave as if
e3fbbdc… stephan 303 it's being run in CGI mode.
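In Python, one way to do this is to pass the child process a copy of the environment with the variable removed. The sketch below assumes that a <tt>fossil</tt> binary is reachable on the PATH (which may not hold inside a chroot jail). The helper names are hypothetical.

```python
import os
import subprocess

def clean_env_for_fossil(environ=os.environ):
    """Copy the environment, dropping GATEWAY_INTERFACE."""
    env = dict(environ)
    env.pop("GATEWAY_INTERFACE", None)
    return env

def fossil_version(fossil="fossil"):
    """Run "fossil version" outside of CGI mode and return its output."""
    result = subprocess.run(
        [fossil, "version"],
        env=clean_env_for_fossil(),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Copying the environment, instead of deleting the variable from <tt>os.environ</tt>, leaves the extension's own CGI variables untouched.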
4130a22… drh 304
4130a22… drh 305 <h2>5.0 Filename Restrictions</h2>
4130a22… drh 306
4130a22… drh 307 For security reasons, Fossil places restrictions on the names of files
4130a22… drh 308 in the extroot directory that can participate in the extension CGI
4130a22… drh 309 mechanism:
4130a22… drh 310
4130a22… drh 311 1. Filenames must consist of only ASCII alphanumeric characters,
4130a22… drh 312 ".", "_", and "-", and of course "/" as the file separator.
4130a22… drh 313 Files with names that include spaces or
4130a22… drh 314 other punctuation or special characters are ignored.
4130a22… drh 315
4130a22… drh 316 2. No element of the pathname can begin with "." or "-". Files or
4130a22… drh 317 directories whose names begin with "." or "-" are ignored.
4130a22… drh 318
4130a22… drh 319 If a CGI program requires separate data files, it is safe to put those
4130a22… drh 320 files in the same directory as the CGI program itself as long as the names
4130a22… drh 321 of the data files contain special characters that cause them to be ignored
4130a22… drh 322 by Fossil.
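The two rules can be expressed as a short Python check, which may be handy for confirming that a data file's name will indeed be ignored. This is a sketch of the rules as stated above, not Fossil's own code. The function name <tt>is_servable_name</tt> is hypothetical.

```python
import re

# Rule 1: only ASCII alphanumerics, ".", "_", "-", and the "/" separator.
_ALLOWED = re.compile(r"[A-Za-z0-9._/-]+")

def is_servable_name(relpath):
    """True if relpath (relative to extroot) satisfies both filename rules."""
    if not _ALLOWED.fullmatch(relpath):
        return False
    # Rule 2: no element of the pathname may begin with "." or "-".
    for element in relpath.split("/"):
        if element.startswith((".", "-")):
            return False
    return True
```

Under these rules a data file named "data,v1.db" or ".checklist.db" would be ignored, while "checklist" and "sub/tool.tcl" would be eligible.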
4130a22… drh 323
3089408… drh 324 <h2>6.0 Access Permissions</h2>
3089408… drh 325
3089408… drh 326 CGI extension files and programs are accessible to everyone.
3089408… drh 327
3089408… drh 328 When CGI extensions have been enabled (using either "extroot:" in the
3089408… drh 329 CGI file or the --extroot option for other server methods) all files
3089408… drh 330 in the extension root directory hierarchy, except special filenames
3089408… drh 331 identified previously, are accessible to all users. Users do not
3089408… drh 332 have to have "Read" privilege, or any other privilege, in order to
3089408… drh 333 access the extensions.
3089408… drh 334
3089408… drh 335 This is by design. The CGI extension mechanism is intended to operate
3089408… drh 336 in the same way as a traditional web-server.
3089408… drh 337
3089408… drh 338 CGI programs that want to restrict access
3089408… drh 339 can examine the FOSSIL_CAPABILITIES and/or FOSSIL_USER environment variables.
3089408… drh 340 In other words, access control is the responsibility of the individual
3089408… drh 341 extension programs.
3089408… drh 342
3481761… drh 343 <h3>6.1 Restricting Robot Access To Extensions</h3>
3481761… drh 344
3481761… drh 345 If the "ext" tag is found in the [/help/robot-restrict|robot-restrict setting]
3481761… drh 346 then clients are tested to see if they are robots before granting
3481761… drh 347 access to any extension. If the "ext" tag is omitted but a tag
3481761… drh 348 of the form "ext/PATH" is found on the robot-restrict setting, then
3481761… drh 349 robots are restricted from the particular extension at PATH.
3089408… drh 350
3089408… drh 351 <h2>7.0 Trouble-Shooting Hints</h2>
4130a22… drh 352
4130a22… drh 353 Remember that the /ext page will return any file in the extroot directory
4130a22… drh 354 hierarchy as static content if the file is readable but not executable.
4130a22… drh 355 When initially setting up the /ext mechanism, it is sometimes helpful
7874664… drh 356 to verify that you are able to receive static content prior to starting
7874664… drh 357 work on your CGIs. Also remember that CGIs must be
4130a22… drh 358 executable files.
4130a22… drh 359
4130a22… drh 360 Fossil likes to run inside a chroot jail, and will automatically put
4130a22… drh 361 itself inside a chroot jail if it can. The sub-CGI program will also
4130a22… drh 362 run inside this same chroot jail. Make sure all embedded pathnames
4130a22… drh 363 have been adjusted accordingly and that all resources needed by the
4130a22… drh 364 CGI program are available within the chroot jail.
4130a22… drh 365
4130a22… drh 366 If anything goes wrong while trying to process an /ext page, Fossil
4130a22… drh 367 returns a 404 Not Found error with no details. However, if the requester
fd1282e… drh 368 is logged in as a user that has <b>[./caps/ref.html#D | Debug]</b> capability
4130a22… drh 369 then additional diagnostic information may be included in the output.
4130a22… drh 370
4130a22… drh 371 If the /ext page has a "fossil-ext-debug=1" query parameter and if
4130a22… drh 372 the requester is logged in as a user with Debug privilege, then the
4130a22… drh 373 CGI output is returned verbatim, as text/plain and with the original
b4ac00d… drh 374 header intact. This is useful for diagnosing problems with the
4130a22… drh 375 CGI script.
