Fossil SCM

Substantial and minor changes to the file globs document taking advice from Warren Young's email.

rberteig 2017-04-18 01:29 glob-docs
Commit 1239b6c47041082e13f7784a7476975dbcae3310bddd0f6a5cf9081dd860e8c2
1 file changed +186 -101
--- www/globs.md
+++ www/globs.md
@@ -1,113 +1,189 @@
 File Name GLOB Patterns
 =======================

-A number of settings (and options to certain commands as well as query
-parameters to certain pages) are documented as one or more GLOB
-patterns that will match files either on the disk or in the active
-checkout.
-
-A GLOB pattern is described as a pattern that matches file names, and
-some of the individual commands show examples of simple GLOBs. The
-examples show use of `*` as a wild card, and hint that more is
-possible.
-
-In many cases more than one GLOB may be specified as a comma or
-white space separated list of GLOB patterns. Several spots in the
-command help mention that GLOB patterns may be quoted with single or
-double quotes so that spaces and commas may be included in the pattern
-if needed.
-
-Outside of this document, only the source code contains the exact
-specification of the complete syntax of a GLOB pattern.
+A [glob pattern][glob] is a text expression that matches one or more
+file names using wild cards familiar to most users of a command line.
+For example, `*` is a glob that matches any name at all and
+`Readme.txt` is a glob that matches exactly one file. Note that
+although they are related, glob patterns are not the same thing as a
+[regular expression or regexp][regexp].
+
+[glob]: https://en.wikipedia.org/wiki/Glob_(programming) (Wikipedia)
+[regexp]: https://en.wikipedia.org/wiki/Regular_expression
+
+
+A number of fossil setting values hold one or more file glob patterns
+that will match files either on the disk or in the active checkout.
+Glob patterns are also accepted in options to certain commands as well
+as query parameters to certain pages.
+
+In many cases more than one glob may be specified in a setting,
+option, or query parameter by listing multiple globs separated by a
+comma or white space. If a glob must contain commas or spaces,
+surround it with single or double quotation marks.
+
+Of course, many fossil commands also accept lists of files to act on,
+and those also may be specified with globs. Although those glob
+patterns are similar to what is described here, they are not defined
+by fossil, but rather by the conventions of the operating system in
+use.
+

 ## Syntax

- any Any character not mentioned matches exactly that character
+A list of glob patterns is simply one or more glob patterns separated
+by white space or commas. If a glob must contain white spaces or
+commas, it can be quoted with either single or double quotation marks.
+A list is said to match if any one (or more) globs in the list
+matches.
+
+A glob pattern is a collection of characters compared to a target
+text, usually a file name. The whole glob is said to match if it
+successfully consumes and matches the entire target text. Glob
+patterns are made up of ordinary characters and special characters.
+
+Ordinary characters consume a single character of the target and must
+match it exactly.
+
+Special characters (and special character sequences) consume zero or
+more characters from the target and describe what matches. The special
+characters (and sequences) are:
+
 * Matches any sequence of zero or more characters.
 ? Matches exactly one character.
 [...] Matches one character from the enclosed list of characters.
 [^...] Matches one character not in the enclosed list.

-Lists of characters have some additional features.
-
- * A range of characters may be specified with `-`, so `[a-d]` matches
- exactly the same characters as `[abcd]`.
- * Include `-` in a list by placing it last, just before the `]`.
- * Include `]` in a list by making the first character after the `[` or
- `[^`. At any other place, `]` ends the list.
- * Include `^` in a list by placing anywhere except first after the
- `[`.
-
-
-Some examples:
-
- [a-d] Matches any one of `a`, `b`, `c`, or `d`
- [a-] Matches either `a` or `-`
- [][] Matches either `]` or `[`
- [^]] Matches exactly one character other than `]`
- []^] Matches either `]` or `^`
-
-The glob is compared to the canonical name of the file in the checkout
-tree, and must match the entire name to be considered a match.
-
-Unlike typical Unix shell globs, wildcard sequences are allowed to
-match `/` directory separators as well as the initial `.` in the name
-of a hidden file or directory.
-
-A list of GLOBs is simply one or more GLOBs separated by whitespace or
-commas. If a GLOB must contain a space or comma, it can be quoted with
-either single or double quotation marks.
-
-Since a newline is considered to be whitespace, a list of GLOBs in a
-file (as for a versioned setting) may have one GLOB per line.
-
-
-## File names to match
-
-Before comparing to a GLOB pattern, each file name is transformed to a
-canonical form. Although the real process is more complicated, the
-canonical name of a file has all directory separators changed to `/`,
-and all `/./` and `/../` sequences removed. The goal is a name that is
-the simplest possible while still specific to each particular file.
-
-This has some consequences.
-
-The simplest GLOB pattern is just a bare name of a file named with the
-usual assortment of allowed file name characters. Such a pattern
-matches that one file: the GLOB `README` matches only a file named
-`README` in the root of the tree. The GLOB `*/README` would match a
-file named `README` anywhere except the root, since the glob requires
-that at least one `/` be in the name. (Recall that `/` matches the
-directory separator regardless of whether it is `/` or `\` on your
-system.)
-
-
-
-
-## Where are they used
-
-### Settings that use GLOBs
-
-These settings are all lists of GLOBs. All may be global, local, or
-versioned. Use `fossil settings` to manage global and local settings,
-or file in the repository's `.fossil-settings/` folder named for each
-for versioned setting.
+Special character sequences have some additional features:
58
+
59
+ * A range of characters may be specified with `-`, so `[a-d]` matches
60
+ exactly the same characters as `[abcd]`. Ranges reflect Unicode
61
+ code points without any locale-specific collation sequence.
62
+ * Include `-` in a list by placing it last, just before the `]`.
63
+ * Include `]` in a list by making the first character after the `[` or
64
+ `[^`. At any other place, `]` ends the list.
65
+ * Include `^` in a list by placing anywhere except first after the
66
+ `[`.
67
+ * Some examples of character lists:
68
+ `[a-d]` Matches any one of `a`, `b`, `c`, or `d` but not `ä`;
69
+ `[^a-d]` Matches exactly one character other than `a`, `b`, `c`,
70
+ or `d`;
71
+ `[0-9a-fA-F]` Matches exactly one hexadecimal digit;
72
+ `[a-]` Matches either `a` or `-`;
73
+ `[][]` Matches either `]` or `[`;
74
+ `[^]]` Matches exactly one character other than `]`;
75
+ `[]^]` Matches either `]` or `^`; and
76
+ `[^-]` Matches exactly one character other than `-`.
77
+ * Beware that ranges in lists may include more than you expect:
78
+ `[A-z]` Matches `A` and `Z`, but also matches `a` and some less
79
+ obvious characters such as `[`, `\`, and `]` with code point
80
+ values between `Z` and `a`.
81
+ * Beware that a range must be specified from low value to high
82
+ value: `[z-a]` does not match any character at all, preventing the
83
+ entire glob from matching.
84
+ * Note that unlike typical Unix shell globs, wildcards (`*`, `?`,
85
+ and character lists) are allowed to match `/` directory
86
+ separators as well as the initial `.` in the name of a hidden
87
+ file or directory.
88
+
89
+
90
+White space means the ASCII characters TAB, LF, VT, FF, CR, and SPACE.
91
+Note that this does not include any of the many additional spacing
92
+characters available in Unicode, and specifically does not include
93
+U+00A0 NO-BREAK SPACE.
94
+
95
+Because both LF and CR are white space and leading and trailing spaces
96
+are stripped from each glob in a list, a list of globs may be broken
97
+into lines between globs when the list is stored in a file (as for a
98
+versioned setting).
99
+
100
+Similarly 'single quotes' and "double quotes" are the ASCII straight
101
+quote characters, not any of the other quotation marks provided in
102
+Unicode and specifically not the "curly" quotes preferred by
103
+typesetters and word processors.
104
+
105
+
106
+## File Names to Match
107
+
108
+Before it is compared to a glob pattern, each file name is transformed
109
+to a canonical form. The glob must match the entire canonical file
110
+name to be considered a match.
111
+
112
+The canonical name of a file has all directory separators changed to
113
+`/`, redundant slashes are removed, all `.` path components are
114
+removed, and all `..` path components are resolved. (There are
115
+additional details we won’t go into here.)
116
+
117
+The goal is a name that is the simplest possible for each particular
118
+file, and will be the same on Windows, Unix, and any other platform
119
+where fossil is run.
120
+
121
+Beware, however, that all glob matching is case sensitive. This will
122
+not be a surprise on Unix where all file names are also case
123
+sensitive. However, most Windows file systems are case preserving and
124
+case insensitive. On Windows, the names `ReadMe` and `README` are
125
+names of the same file; on Unix they are different files.
126
+
127
+Some example cases:
128
+
129
+ * The glob `README` matches only a file named `README` in the root of
130
+ the tree. It does not match a file named `src/README` because it
131
+ does not include any characters that consumed the `src/` part.
132
+ * The glob `*/README` does match `src/README`. Unlike Unix file
133
+ globs, it also matches `src/library/README`. However it does not
134
+ match the file `README` in the root of the tree.
135
+ * The glob `src/README` does match the file named `src\README` on
136
+ Windows because all directory separators are rewritten as `/` in
137
+ the canonical name before the glob is matched. This makes it much
138
+ easier to write globs that work on both Unix and Windows.
139
+ * The glob `*.[ch]` matches every C source or header file in the
140
+ tree at the root or at any depth. Again, this is (deliberately)
141
+ different from Unix file globs and Windows wild cards.
142
+
143
+
144
+
145
+## Where Globs are Used
146
+
147
+### Settings that are Globs
148
+
149
+These settings are all lists of glob patterns:
95150
 * `binary-glob`
 * `clean-glob`
 * `crlf-glob`
 * `crnl-glob`
 * `encoding-glob`
 * `ignore-glob`
 * `keep-glob`

+All may be [versioned, local, or global][settings]. Use `fossil
+settings` to manage local and global settings, or a file in the
+repository's `.fossil-settings/` folder at the root of the tree named
+for each for versioned setting.
+
+ [settings]: /doc/trunk/www/settings.wiki
+
+Using versioned settings for these not only has the advantage that
+they are tracked in the repository just like the rest of your project,
+but you can more easily keep longer lists of more complicated glob
+patterns than would be practical in either local or global settings.
+
+The `ignore-glob` is an example of one setting that frequently grows
+to be an elaborate list of files that should be ignored by most
+commands. This is especially true when one (or more) IDEs are used in
+a project because each IDE has its own ideas of how and where to cache
+information that speeds up its browsing and building tasks but which
+need not be preserved in your project's history.
+

-### Commands that refer to GLOBs
+### Commands that Refer to Globs

-Many of the commands that respect the settings containing GLOBs have
-options to override some or all of the settings.
+Many of the commands that respect the settings containing globs have
+options to override some or all of the settings. These options are
+usually named to correspond to the setting they override, such as
+`--ignore` to override the `ignore-glob` setting. These commands are:

 * `add`
 * `addremove`
 * `changes`
 * `clean`
@@ -115,23 +191,24 @@
 * `merge`
 * `settings`
 * `status`
 * `unset`

-The commands `tarball` and `zip` produce compressed archives of a specific
-checkin. They may be further restricted by options that specify GLOBs
-that name files to include or exclude rather than taking the entire
-checkin.
-
-The commands `http`, `cgi`, `server`, and `ui` that implement or support with web servers
-provide a mechanism to name some files to serve with static content
-where a list of GLOBs specifies what content may be served.
+The commands `tarball` and `zip` produce compressed archives of a
+specific checkin. They may be further restricted by options that
+specify glob patterns that name files to include or exclude rather
+than archiving the entire checkin.
+
+The commands `http`, `cgi`, `server`, and `ui` that implement or
+support with web servers provide a mechanism to name some files to
+serve with static content where a list of GLOBs specifies what content
+may be served.


 ### Web pages that refer to GLOBs

-The /timeline page supports a query parameter that names a GLOB of
+The `/timeline` page supports a query parameter that names a GLOB of
 files to focus the timeline on. It also can use `GLOB`, `LIKE`, or
 `REGEXP` matching on tag names, where each is implemented by the
 corresponding operator in [SQLite][].

 The pages `/tarball` and `/zip` generate compressed archives of a
@@ -203,15 +280,23 @@
 all the files.


 ## Implementation

-Most of the implementation of GLOB handling is found in
-[`src/glob.c`][glob.c].
+Most of the implementation of glob pattern handling in fossil is found
+in [`src/glob.c`][glob.c]. The canonical name of a file is implemented
+in [`src/file.c`][file.c]. Each command that references a glob
+constructs the target text from information specific to that command.

-The actual matching is implemented in SQL, so the documentation for
-`GLOB` and the other string matching operators in [SQLite][] is
-useful.

 [glob.c]: https://www.fossil-scm.org/index.html/file/src/glob.c
-[SQLite]: https://sqlite.org/lang_expr.html#like
+[file.c]: https://www.fossil-scm.org/index.html/file/src/file.c
+
+The actual matching is implemented in SQL, so the documentation for
+`GLOB` and the other string matching operators in [SQLite]
+(https://sqlite.org/lang_expr.html#like) is useful. Of course, the
+SQLite source code and test harnesses also make entertaining reading:

+ * `src/func.c` [lines 570-768]
+ (https://www.sqlite.org/src/artifact?name=9d52522cc8ae7f5c&ln=570-768)
+ * `test/expr.test` [lines 586-673]
+ (https://www.sqlite.org/src/artifact?name=66a2c9ac34f74f03&ln=586-673)

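Reviewer's illustration (not part of the commit): the matching rules added in this change can be hard to picture from prose alone, so here is a rough Python sketch of them. It is not Fossil's implementation, which lives in C in `src/glob.c`. The function names `split_glob_list`, `glob_match`, and `glob_list_match` are hypothetical, and the handling of an unclosed quote or an unterminated `[` list is an assumption, since the revised text does not cover those cases.

```python
WHITE_SPACE = " \t\n\v\f\r"  # SPACE, TAB, LF, VT, FF, CR, as listed in the revised text


def split_glob_list(text):
    """Split a white-space- or comma-separated glob list, honoring ASCII quotes."""
    globs = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in WHITE_SPACE or ch == ",":
            i += 1
        elif ch in "'\"":
            end = text.find(ch, i + 1)
            if end < 0:  # assumption: an unclosed quote runs to the end of the text
                end = len(text)
            globs.append(text[i + 1:end])
            i = end + 1
        else:
            j = i
            while j < len(text) and text[j] not in WHITE_SPACE and text[j] != ",":
                j += 1
            globs.append(text[i:j])
            i = j
    return globs


def _parse_list(pat, i):
    """Parse the character list starting at pat[i] == '['.

    Returns (negate, ranges, next_index), or None if the list never closes.
    """
    j = i + 1
    negate = j < len(pat) and pat[j] == "^"
    if negate:
        j += 1
    ranges = []  # (low, high) code point pairs; a single character is (c, c)
    first = True
    while j < len(pat):
        c = pat[j]
        if c == "]" and not first:
            return negate, ranges, j + 1
        first = False  # a `]` right after `[` or `[^` is an ordinary member
        if j + 2 < len(pat) and pat[j + 1] == "-" and pat[j + 2] != "]":
            ranges.append((ord(c), ord(pat[j + 2])))  # `z-a` simply never matches
            j += 3
        else:
            ranges.append((ord(c), ord(c)))
            j += 1
    return None


def glob_match(pat, text):
    """True if pat consumes all of text; wildcards may match `/` and a leading `.`."""
    def match(pi, ti):
        while pi < len(pat):
            c = pat[pi]
            if c == "*":
                # Let `*` consume every possible amount of the remaining text.
                return any(match(pi + 1, k) for k in range(ti, len(text) + 1))
            if ti >= len(text):
                return False
            if c == "[":
                parsed = _parse_list(pat, pi)
                if parsed is None:
                    return False  # assumption: an unterminated list matches nothing
                negate, ranges, nxt = parsed
                hit = any(lo <= ord(text[ti]) <= hi for lo, hi in ranges)
                if hit == negate:
                    return False
                pi = nxt
            elif c == "?" or c == text[ti]:  # comparison is case sensitive
                pi += 1
            else:
                return False
            ti += 1
        return ti == len(text)
    return match(0, 0)


def glob_list_match(list_text, name):
    """A list matches if any one glob in it matches the canonical name."""
    return any(glob_match(g, name) for g in split_glob_list(list_text))
```

With this sketch, `glob_match("*/README", "src/library/README")` is true while `glob_match("*/README", "README")` is false, mirroring the example cases in the new "File Names to Match" section. The backtracking on `*` is deliberately naive and is meant for reading, not for performance.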
