Fossil SCM

fossil-scm / www / aboutcgi.wiki

1	`<title>How CGI Works In Fossil</title>`
2
3	`<h2>Introduction</h2>`
4
5	`CGI or "Common Gateway Interface" is a venerable yet reliable technique for`
6	`generating dynamic web content. This article gives a quick background on how`
7	`CGI works and describes how Fossil can act as a CGI service.`
8
9	`This is a "how it works" guide. It provides background`
10	`information on the CGI protocol so that you can better understand what`
11	`is going on behind the scenes. If you just want to set up Fossil`
12	`as a CGI server, see the [./server/ \| Fossil Server Setup] page. Or`
13	`if you want to develop CGI-based extensions to Fossil, see`
14	`the [./serverext.wiki\|CGI Server Extensions] page.`
15
16	`<h2>A Quick Review Of CGI</h2>`
17
18	`An HTTP request is a block of text that is sent by a client application`
19	`(usually a web browser) and arrives at the web server over a network`
20	`connection. The HTTP request contains a URL that describes the information`
21	`being requested. The URL in the HTTP request is typically the same URL`
22	`that appears in the URL bar at the top of the web browser that is making`
23	`the request. The URL might contain a "?" character followed by`
24	`query parameters. The HTTP request will usually also contain other information`
25	`such as the name of the application that made the request, whether or`
26	`not the requesting application can accept a compressed reply, POST`
27	`parameters from forms, and so forth.`
28
29	`The job of the web server is to interpret the HTTP request and formulate`
30	`an appropriate reply.`
31	`The web server is free to interpret the HTTP request in any way it wants,`
32	`but most web servers follow a similar pattern, described below.`
33	`(Note: details may vary from one web server to another.)`
34
35	`Suppose the filename component of the URL in the HTTP request looks like this:`
36
37	`<pre>/one/two/timeline/four</pre>`
38
39	`Most web servers will search their content area for files that match`
40	`some prefix of the URL. The search starts with <b>/one</b>, then goes to`
41	`<b>/one/two</b>, then <b>/one/two/timeline</b>, and finally`
42	`<b>/one/two/timeline/four</b> is checked. The search stops at the first`
43	`match.`
44
45	`Suppose the first match is <b>/one/two</b>. If <b>/one/two</b> is an`
46	`ordinary file in the content area, then that file is returned as static`
47	`content. The "<b>/timeline/four</b>" suffix is silently ignored.`
48
49	`If <b>/one/two</b> is a CGI script (or program), then the web server`
50	`executes the <b>/one/two</b> script. The output generated by`
51	`the script is collected and repackaged as the HTTP reply.`
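The prefix search described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular web server: the function name `find_first_match` and the `content` set (a stand-in for the server's content area) are hypothetical.

```python
def find_first_match(url_path, content):
    """Return (matched_prefix, remaining_suffix) for the shortest prefix
    of url_path that exists in `content`, or None if no prefix matches.

    `content` is a hypothetical stand-in for the web server's content
    area: a set of paths naming static files or CGI scripts.
    """
    terms = [t for t in url_path.split("/") if t]
    # Try /one, then /one/two, then /one/two/timeline, and so on.
    for i in range(1, len(terms) + 1):
        prefix = "/" + "/".join(terms[:i])
        if prefix in content:
            rest = terms[i:]
            suffix = "/" + "/".join(rest) if rest else ""
            return prefix, suffix
    return None
```

With a content area that holds only "/one/two", the URL "/one/two/timeline/four" splits into the match "/one/two" and the suffix "/timeline/four". Whether that suffix is ignored or handed to a script depends on whether the match is an ordinary file or a CGI script.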
52
53	`Before executing the CGI script, the web server will set up various`
54	`environment variables with information useful to the CGI script:`
55	`<table>`
56	`<tr><th>Variable<th>Meaning`
57	`<tr><td>GATEWAY_INTERFACE<td>Always set to "CGI/1.0"`
58	`<tr><td>REQUEST_URI`
59	`<td>The input URL from the HTTP request.`
60	`<tr><td>SCRIPT_NAME`
61	`<td>The prefix of the input URL that matches the CGI script name.`
62	`In this example: "/one/two".`
63	`<tr><td>PATH_INFO`
64	`<td>The suffix of the URL beyond the name of the CGI script.`
65	`In this example: "timeline/four".`
66	`<tr><td>QUERY_STRING`
67	`<td>The query string that follows the "?" in the URL, if there is one.`
68	`</table>`
69
70	`There are other CGI environment variables beyond those listed above.`
71	`Many Fossil servers implement the`
72	`[https://fossil-scm.org/home/test-env/two/three?abc=xyz\|test-env]`
73	`webpage that shows some of the CGI environment`
74	`variables that Fossil pays attention to.`
75
76	`In addition to setting various CGI environment variables, if the HTTP`
77	`request contains POST content, then the web server relays the POST content`
78	`to standard input of the CGI script.`
79
80	`In summary, the task of the`
81	`CGI script is to read the various CGI environment variables and`
82	`the POST content on standard input (if any), figure out an appropriate`
83	`reply, then write that reply on standard output.`
84	`The web server will read the output from the CGI script, reformat it`
85	`into an appropriate HTTP reply, and relay the result back to the`
86	`requesting application.`
87	`The CGI script exits as soon as it generates a single reply.`
88	`The web server will (usually) persist and handle multiple HTTP requests,`
89	`but a CGI script handles just one HTTP request and then exits.`
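As a rough illustration of that summary, here is a minimal CGI program in Python. It is a sketch under stated assumptions, not a production script: the helper name `build_reply` is made up for this example, and the POST content length is taken from the standard CONTENT_LENGTH variable, which is one of the "other" CGI variables not listed in the table above.

```python
import os
import sys


def build_reply(environ, body):
    """Build a CGI reply: a header, a blank line, then the content.

    `environ` is the mapping of CGI environment variables and `body`
    is the POST content (bytes) read from standard input.
    """
    lines = [
        "SCRIPT_NAME=" + environ.get("SCRIPT_NAME", ""),
        "PATH_INFO=" + environ.get("PATH_INFO", ""),
        "QUERY_STRING=" + environ.get("QUERY_STRING", ""),
        "POST bytes=" + str(len(body)),
    ]
    return "Content-Type: text/plain\r\n\r\n" + "\n".join(lines) + "\n"


def main():
    # Read exactly CONTENT_LENGTH bytes of POST content, if there is any.
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    body = sys.stdin.buffer.read(length) if length > 0 else b""
    # Write the single reply on standard output, then fall off the end
    # and exit: one process handles exactly one request.
    sys.stdout.write(build_reply(os.environ, body))


# Only act as CGI when a web server has set GATEWAY_INTERFACE.
if __name__ == "__main__" and os.environ.get("GATEWAY_INTERFACE"):
    main()
```

The web server, not the script, turns this output into a complete HTTP reply.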
90
91	`The above is a rough outline of how CGI works.`
92	`There are many details omitted from this brief discussion.`
93	`See other on-line CGI tutorials for further information.`
94
95	`<h2>How Fossil Acts As A CGI Program</h2>`
96
97	`An appropriate CGI script for running Fossil will look something`
98	`like the following:`
99
100	`<pre>`
101	`#!/usr/bin/fossil`
102	`repository: /home/www/repos/project.fossil`
103	`</pre>`
104
105	`The first line of the script is a`
106	`"[https://en.wikipedia.org/wiki/Shebang_%28Unix%29\|shebang]"`
107	`that tells the operating system what program to use as the interpreter`
108	`for this script. On unix, when you execute a script that starts with`
109	`a shebang, the operating system runs the program identified by the`
110	`shebang with a single argument that is the full pathname of the script`
111	`itself.`
112	`In our example, the interpreter is Fossil, and the argument might`
113	`be something like "/var/www/cgi-bin/one/two" (depending on how your`
114	`particular web server is configured).`
115
116	`The Fossil program that is run as the script interpreter`
117	`is the same Fossil that runs when`
118	`you type ordinary Fossil commands like "fossil sync" or "fossil commit".`
119	`But in this case, as soon as it launches, the Fossil program`
120	`recognizes that the GATEWAY_INTERFACE environment variable is`
121	`set to "CGI/1.0" and it therefore knows that it is being used as`
122	`CGI rather than as an ordinary command-line tool, and behaves accordingly.`
123
124	`When Fossil recognizes that it is being run as CGI, it opens and reads`
125	`the file identified by its sole argument (the file named by`
126	`<code>argv[1]</code>). In our example, the second line of that file`
127	`tells Fossil the location of the repository it will be serving.`
128	`Fossil then starts looking at the CGI environment variables to figure`
129	`out what web page is being requested, generates that one web page,`
130	`then exits.`
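The step of reading the script file can be pictured with the following Python sketch. It is not Fossil's actual implementation (Fossil is written in C and supports more options than shown here). The function name `parse_cgi_script` is hypothetical, and skipping every line that starts with "#" is an assumption made to step over the shebang line.

```python
def parse_cgi_script(text):
    """Collect "key: value" lines from a Fossil-style CGI script.

    Assumption for this sketch: blank lines and lines starting with
    "#" (which includes the shebang) carry no settings and are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings
```

Applied to the two-line example script above, this yields a single setting, "repository", whose value is "/home/www/repos/project.fossil".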
131
132	`Usually, the webpage being requested is the first term of the`
133	`PATH_INFO environment variable. (Exceptions to this rule are noted`
134	`in the sequel.) For our example, the first term of PATH_INFO`
135	`is "timeline", which means that Fossil will generate`
136	`the [/help/www/timeline\|/timeline] webpage.`
137
138	`With Fossil, terms of PATH_INFO beyond the webpage name are converted into`
139	`the "name" query parameter. Hence, the following two URLs mean`
140	`exactly the same thing to Fossil:`
141	`<ol type='A'>`
142	`<li> [https://fossil-scm.org/home/info/c14ecc43]`
143	`<li> [https://fossil-scm.org/home/info?name=c14ecc43]`
144	`</ol>`
145
146	`In both cases, the CGI script is called "/home". For case (A),`
147	`the PATH_INFO variable will be "info/c14ecc43" and so the`
148	`"[/help/www/info\|/info]" webpage will be generated and the suffix of`
149	`PATH_INFO will be converted into the "name" query parameter, which`
150	`identifies the artifact about which information is requested.`
151	`In case (B), the PATH_INFO is just "info", but the same "name"`
152	`query parameter is set explicitly by the URL itself.`
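The equivalence of the two URL forms can be shown with a short Python sketch. The function name `page_and_params` is hypothetical. Two details are assumptions of this sketch rather than documented Fossil behavior: extra PATH_INFO terms are joined with "/", and an explicit "name" query parameter takes precedence over the PATH_INFO suffix.

```python
from urllib.parse import parse_qsl


def page_and_params(path_info, query_string=""):
    """Split PATH_INFO into a webpage name plus query parameters.

    The first term of PATH_INFO is the webpage name. Any remaining
    terms become the "name" parameter unless the query string already
    supplies one (that precedence is an assumption of this sketch).
    """
    terms = [t for t in path_info.split("/") if t]
    page = terms[0] if terms else ""
    params = dict(parse_qsl(query_string))
    if len(terms) > 1:
        params.setdefault("name", "/".join(terms[1:]))
    return page, params
```

Calling `page_and_params("info/c14ecc43")` and `page_and_params("info", "name=c14ecc43")` gives the same result, which mirrors cases (A) and (B).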
153
154	`<h2>Serving Multiple Fossil Repositories From One CGI Script</h2>`
155
156	`The previous example showed how to serve a single Fossil repository`
157	`using a single CGI script.`
158	`On a website that wants to serve multiple repositories, one could`
159	`simply create multiple CGI scripts, one script for each repository.`
160	`But it is also possible to serve multiple Fossil repositories from`
161	`a single CGI script.`
162
163	`If the CGI script for Fossil contains a "directory:" line instead of`
164	`a "repository:" line, then the argument to "directory:" is the name`
165	`of a directory that contains multiple repository files, each ending`
166	`with ".fossil". For example:`
167
168	`<pre>`
169	`#!/usr/bin/fossil`
170	`directory: /home/www/repos`
171	`</pre>`
172
173	`Suppose the /home/www/repos directory contains files named`
174	`<b>one.fossil</b>, <b>two.fossil</b>, and <b>subdir/three.fossil</b>.`
175	`Further suppose that the name of the CGI script (relative to the root`
176	`of the webserver document area) is "cgis/example2". Then to`
177	`see the timeline for the "three.fossil" repository, the URL would be:`
178
179	`<pre>`
180	`http://example.com/cgis/example2/subdir/three/timeline`
181	`</pre>`
182
183	`Here is what happens:`
184	`<ol>`
185	`<li> The input URI on the HTTP request is`
186	`<b>/cgis/example2/subdir/three/timeline</b>`
187	`<li> The web server searches prefixes of the input URI until it finds`
188	`the "cgis/example2" script. The web server then sets`
189	`PATH_INFO to the "subdir/three/timeline" suffix and invokes the`
190	`"cgis/example2" script.`
191	`<li> Fossil runs and sees the "directory:" line pointing to`
192	`"/home/www/repos". Fossil then starts pulling terms off the`
193	`front of the PATH_INFO looking for a repository. It first looks`
194	`at "/home/www/repos/subdir.fossil" but there is no such repository.`
195	`So then it looks at "/home/www/repos/subdir/three.fossil" and finds`
196	`a repository. The PATH_INFO is shortened by removing`
197	`"subdir/three/" leaving it at just "timeline".`
198	`<li> Fossil looks at the rest of PATH_INFO to see that the webpage`
199	`requested is "timeline".`
200	`</ol>`
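Steps 3 and 4 above can be sketched in Python as follows. This illustrates the lookup as described here and is not Fossil's own code. The function name `find_repository` and the injectable `exists` predicate are hypothetical, the latter added so the sketch can be tried without real files.

```python
import os


def find_repository(directory, path_info, exists=os.path.isfile):
    """Pull terms off the front of PATH_INFO until DIRECTORY/terms.fossil
    names an existing repository.

    Returns (repository_path, remaining_path_info), or (None, path_info)
    if no repository is found.
    """
    terms = [t for t in path_info.split("/") if t]
    base = directory.rstrip("/")
    for i in range(1, len(terms) + 1):
        candidate = base + "/" + "/".join(terms[:i]) + ".fossil"
        if exists(candidate):
            return candidate, "/".join(terms[i:])
    return None, path_info
```

For the example above, the candidates tried are "/home/www/repos/subdir.fossil" and then "/home/www/repos/subdir/three.fossil". The second one exists, so the remaining PATH_INFO is "timeline".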
201	`<a id="cgivar"></a>`
202
203	`The web server sets many environment variables in step 2 in addition`
204	`to just PATH_INFO. The following diagram shows a few of these variables`
205	`and their relationship to the request URL:`
206
207	`<verbatim type="pikchr">`
208	`charwid = 0.075`
209	`thickness = 0`
210
211	`SCHEME: box "https://" mono fit`
212	`DOMAIN: box "example.com" mono fit`
213	`SCRIPT: box "/cgis/example2" mono fit`
214	`PATH: box "/subdir/three/timeline" mono fit`
215	`QUERY: box "?c=55d7e1" mono fit`
216
217	`thickness = 0.01`
218
219	`DB: box at 0.3 below DOMAIN "HTTP_HOST" mono fit invis`
220	`SB: box at 0.3 below SCRIPT "SCRIPT_NAME" mono fit invis`
221	`PB: box at 0.3 below PATH "PATH_INFO" mono fit invis`
222	`QB: box at 0.3 below QUERY "QUERY_STRING" mono fit invis`
223	`RB: box at 0.5 above PATH "REQUEST_URI" mono fit invis`
224
225	`color = lightgray`
226
227	`box at SCHEME width SCHEME.width height SCHEME.height`
228	`line fill 0x7799CC behind QUERY \`
229	`from SCRIPT.nw \`
230	`to RB.sw \`
231	`to RB.se \`
232	`to QUERY.ne \`
233	`close`
234	`line fill 0x99CCFF behind DOMAIN \`
235	`from DOMAIN.nw \`
236	`to DOMAIN.sw \`
237	`to DB.n \`
238	`to DOMAIN.se \`
239	`to DOMAIN.ne \`
240	`close`
241	`line fill 0xCCEEFF behind SCRIPT \`
242	`from SCRIPT.nw \`
243	`to SCRIPT.sw \`
244	`to SB.n \`
245	`to SCRIPT.se \`
246	`to SCRIPT.ne \`
247	`close`
248	`line fill 0x99CCFF behind PATH \`
249	`from PATH.nw \`
250	`to PATH.sw \`
251	`to PB.n \`
252	`to PATH.se \`
253	`to PATH.ne \`
254	`close`
255	`line fill 0xCCEEFF behind QUERY \`
256	`from QUERY.nw \`
257	`to QUERY.sw \`
258	`to QB.n \`
259	`to QUERY.se \`
260	`to QUERY.ne \`
261	`close`
262	`</verbatim>`
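The same decomposition can be written out in Python. This is a sketch only: the function name `cgi_variables` is hypothetical, and the script name is passed in by the caller because only the web server knows which URL prefix matched a script. As in the diagram, PATH_INFO keeps its leading "/" here, and QUERY_STRING holds the text after the "?".

```python
from urllib.parse import urlsplit


def cgi_variables(url, script_name):
    """Derive a few CGI variables from a request URL, given the URL
    prefix that the web server matched to a CGI script."""
    parts = urlsplit(url)
    path = parts.path
    if not (path == script_name or path.startswith(script_name + "/")):
        raise ValueError("URL path does not start with the script name")
    request_uri = path + ("?" + parts.query if parts.query else "")
    return {
        "HTTP_HOST": parts.netloc,
        "SCRIPT_NAME": script_name,
        "PATH_INFO": path[len(script_name):],
        "QUERY_STRING": parts.query,
        "REQUEST_URI": request_uri,
    }
```

For the URL in the diagram and the script name "/cgis/example2", this gives HTTP_HOST "example.com", PATH_INFO "/subdir/three/timeline", and QUERY_STRING "c=55d7e1".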
263
264	`<h2>Additional CGI Script Options</h2>`
265
266	`The CGI script can have additional options used to fine-tune`
267	`Fossil's behavior. See the [./cgi.wiki\|CGI script documentation]`
268	`for details.`
269
270	`<h2>Additional Observations</h2>`
271	`<ol type="I">`
272	`<li><p>`
273	`Fossil does not distinguish between the various HTTP methods (GET, PUT,`
274	`DELETE, etc). Fossil figures out what it needs to do purely from the`
275	`webpage term of the URI.</p></li>`
276	`<li><p>`
277	`Fossil does not distinguish between query parameters that are part of the`
278	`URI, application/x-www-form-urlencoded or multipart/form-data encoded`
279	`parameters that are part of the POST content, and cookies. Each information`
280	`source is seen as a space of key/value pairs which are loaded into an`
281	`internal property hash table. The code that runs to generate the reply`
282	`can then reference various property values.`
283	`Fossil does not care where the value of each property comes from (POST`
284	`content, cookies, or query parameters) only that the property exists`
285	`and has a value.</p></li>`
286	`<li><p>`
287	`The "[/help/ui\|fossil ui]" and "[/help/server\|fossil server]" commands`
288	`are implemented using a simple built-in web server that accepts incoming HTTP`
289	`requests, translates each request into a CGI invocation, then creates a`
290	`separate child Fossil process to handle each request. In other words, CGI`
291	`is used internally to implement "fossil ui/server".`
292	`<br><br>`
293	`SCGI is processed using the same built-in web server, just modified`
294	`to parse SCGI requests instead of HTTP requests. Each SCGI request is`
295	`converted into CGI, then Fossil creates a separate child Fossil`
296	`process to handle each CGI request.</p></li>`
297	`<li><p>`
298	`Fossil is itself often launched using CGI. But Fossil can also then`
299	`turn around and launch [./serverext.wiki\|sub-CGI scripts to implement`
300	`extensions].</p></li>`
301	`</ol>`
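Observation II, the single property table, can be illustrated with a small Python sketch. It is not Fossil's implementation. The function name `load_properties` is hypothetical, only application/x-www-form-urlencoded POST content is handled, and the order in which sources overwrite one another (cookies, then POST content, then query parameters) is an arbitrary choice for this example.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl


def load_properties(query_string="", post_body="", cookie_header=""):
    """Merge cookies, urlencoded POST content, and query parameters
    into one key/value table.

    The precedence used here (later sources overwrite earlier ones) is
    an assumption of this sketch, not documented Fossil behavior.
    """
    props = {}
    for key, morsel in SimpleCookie(cookie_header).items():
        props[key] = morsel.value
    props.update(parse_qsl(post_body))
    props.update(parse_qsl(query_string))
    return props
```

Code that generates a reply would then look up a property such as "name" in the merged table without knowing which source supplied it.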
302
