Fossil SCM

Moved the documentation for Fossil's grep implementation out of src/regexp.c into a new document with greatly-expanded content, www/grep.md, which is now referenced both from the source code and in the output for "fossil help grep".

wyoung 2018-10-03 20:09 trunk
Commit 2e1775e23af45c4b108ef52b8ef7ed31d6d351dea33e54d73238ed44cf38fdbf
2 files changed +5 -38 +19
+5 -38
--- src/regexp.c
+++ src/regexp.c
@@ -16,46 +16,11 @@
 *******************************************************************************
 **
 ** This file was adapted from the test_regexp.c file in SQLite3. That
 ** file is in the public domain.
 **
-** The code in this file implements a compact but reasonably
-** efficient regular-expression matcher for posix extended regular
-** expressions against UTF8 text. The following syntax is supported:
-**
-**     X*      zero or more occurrences of X
-**     X+      one or more occurrences of X
-**     X?      zero or one occurrences of X
-**     X{p,q}  between p and q occurrences of X
-**     (X)     match X
-**     X|Y     X or Y
-**     ^X      X occurring at the beginning of the string
-**     X$      X occurring at the end of the string
-**     .       Match any single character
-**     \c      Character c where c is one of \{}()[]|*+?.
-**     \c      C-language escapes for c in afnrtv. ex: \t or \n
-**     \uXXXX  Where XXXX is exactly 4 hex digits, unicode value XXXX
-**     \xXX    Where XX is exactly 2 hex digits, unicode value XX
-**     [abc]   Any single character from the set abc
-**     [^abc]  Any single character not in the set abc
-**     [a-z]   Any single character in the range a-z
-**     [^a-z]  Any single character not in the range a-z
-**     \b      Word boundary
-**     \w      Word character. [A-Za-z0-9_]
-**     \W      Non-word character
-**     \d      Digit
-**     \D      Non-digit
-**     \s      Whitespace character
-**     \S      Non-whitespace character
-**
-** A nondeterministic finite automaton (NFA) is used for matching, so the
-** performance is bounded by O(N*M) where N is the size of the regular
-** expression and M is the size of the input string. The matcher never
-** exhibits exponential behavior. Note that the X{p,q} operator expands
-** to p copies of X following by q-p copies of X? and that the size of the
-** regular expression in the O(N*M) performance bound is computed after
-** this expansion.
+** See ../www/grep.md for details of the algorithm and RE dialect.
 */
 #include "config.h"
 #include "regexp.h"
 
 /* The end-of-input character */
@@ -827,16 +792,18 @@
 /*
 ** COMMAND: grep
 **
 ** Usage: %fossil grep [OPTIONS] PATTERN FILENAME
 **
-** Run grep over all historic version of FILENAME
+** Attempt to match the given POSIX extended regular expression PATTERN
+** over all historic versions of FILENAME. For details of the supported
+** RE dialect, see https://fossil-scm.org/doc/trunk/www/grep.md.
 **
 ** Options:
 **
 **   -i|--ignore-case           Ignore case
-**   -l|--files-with-matches    Print only filenames that match
+**   -l|--files-with-matches    List only checkin ID for versions that match
 **   -v|--verbose               Show each file as it is analyzed
 */
 void re_grep_cmd(void){
   u32 flags = 0;
   int bVerbose = 0;
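The comment removed from src/regexp.c notes that X{p,q} expands to p copies of X followed by q-p copies of X?, and that N in the O(N*M) bound is measured after that expansion. The following is a minimal sketch of that textual expansion only, not the code Fossil uses: `expand_counted` and its parameters are hypothetical names introduced for illustration, and the real matcher compiles patterns into NFA states rather than rewriting strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
** Hypothetical helper, not part of src/regexp.c.
**
** Write the textual expansion of X{p,q} into out: p copies of zX
** followed by (q-p) copies of zX with a trailing '?'.  zX must already
** be a single unit (one character, or a parenthesized group such as
** "(ab)"), because this function does not add grouping itself.
**
** Returns the expanded length, or -1 if the counts are invalid or the
** result (plus its terminating NUL) does not fit in nOut bytes.
*/
int expand_counted(const char *zX, int p, int q, char *out, size_t nOut){
  size_t nX = strlen(zX);
  size_t need;
  size_t n = 0;
  int i;

  if( p<0 || q<p ) return -1;
  need = (size_t)p*nX + (size_t)(q-p)*(nX+1);
  if( need+1>nOut ) return -1;

  /* Mandatory part: p copies of X */
  for(i=0; i<p; i++){
    memcpy(out+n, zX, nX);
    n += nX;
  }
  /* Optional part: q-p copies of X? */
  for(i=p; i<q; i++){
    memcpy(out+n, zX, nX);
    n += nX;
    out[n++] = '?';
  }
  out[n] = 0;
  assert( n==need );
  return (int)n;
}
```

Under these assumptions, expanding `a{2,4}` yields `aaa?a?`, so the pattern grows from 6 characters to 6 expanded ones here, while a pattern such as `a{1,1000}` grows to roughly two thousand. It is that expanded size that the O(N*M) bound refers to.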
ADDED www/grep.md
+19
--- a/www/grep.md
+++ b/www/grep.md
@@ -0,0 +1,19 @@
+# Fossil's Internal 'grep' Command
+
+As of Fossil 2.7, there is asomething likea
+| `\d` | No equivalent of other POSIXs of Fosoptions currently exist.
+R`, either
+implicitly or explicit
+POSIX
+`grep` Patches to
+ system's
+ Fossil also
+ currently.
+
+ Instead,
+ so it your repository are in
+ with ASCII, such as
+ X grep
+
+As of Fossil # Fossil grep vs POSIX gr their ASCII-compatible
+