Fossil SCM
Merged from trunk to verify fix in [62352847].
Commit
4077357a38ddb2f17c466960c473fb2e1e1408d10f774b28ef0104519fa4f580
Parent
5ee57d84b07781a…
29 files changed
~ VERSION  +1 -1
- compat/zlib/doc/algorithm.txt  -209
- compat/zlib/doc/rfc1950.txt  -630
- compat/zlib/doc/rfc1951.txt  -972
- compat/zlib/doc/rfc1952.txt  -687
- compat/zlib/doc/txtvsbin.txt  -103
~ src/clone.c  +1
~ src/configure.c  +1
~ src/content.c  +4
~ src/db.c  +17 -3
~ src/diffcmd.c  +21 -7
~ src/doc.c  +1 -1
~ src/encode.c  +90
~ src/hname.c  +137 -20
~ src/main.c  +4 -1
~ src/sha3.c  +3 -3
~ src/shun.c  +1
~ src/sqlcmd.c  +3
~ src/stash.c  +35 -32
~ src/stat.c  +6 -1
~ src/unversioned.c  +58 -1
~ src/wiki.c  +1 -1
~ src/xfer.c  +1
~ win/Makefile.mingw.mistachkin  +36
~ www/changes.wiki  +11
~ www/hashpolicy.wiki  +20
~ www/mkdownload.tcl  +2 -2
~ www/mkindex.tcl  +1
~ www/permutedindex.html  +6
M VERSION  +1 -1
| --- VERSION | ||
| +++ VERSION | ||
| @@ -1,1 +1,1 @@ | ||
| 1 | -2.0 | |
| 1 | +2.1 | |
D compat/zlib/doc/algorithm.txt  -209
| --- a/compat/zlib/doc/algorithm.txt | ||
| +++ b/compat/zlib/doc/algorithm.txt | ||
| @@ -1,209 +0,0 @@ | ||
| 1 | -1. Compression algorithm (deflate) | |
| 2 | - | |
| 3 | -The deflation algorithm used by gzip (also zip and zlib) is a variation of | |
| 4 | -LZ77 (Lempel-Ziv 1977, see reference below). It finds duplicated strings in | |
| 5 | -the input data. The second occurrence of a string is replaced by a | |
| 6 | -pointer to the previous string, in the form of a pair (distance, | |
| 7 | -length). Distances are limited to 32K bytes, and lengths are limited | |
| 8 | -to 258 bytes. When a string does not occur anywhere in the previous | |
| 9 | -32K bytes, it is emitted as a sequence of literal bytes. (In this | |
| 10 | -description, `string' must be taken as an arbitrary sequence of bytes, | |
| 11 | -and is not restricted to printable characters.) | |
| 12 | - | |
| 13 | -Literals or match lengths are compressed with one Huffman tree, and | |
| 14 | -match distances are compressed with another tree. The trees are stored | |
| 15 | -in a compact form at the start of each block. The blocks can have any | |
| 16 | -size (except that the compressed data for one block must fit in | |
| 17 | -available memory). A block is terminated when deflate() determines that | |
| 18 | -it would be useful to start another block with fresh trees. (This is | |
| 19 | -somewhat similar to the behavior of LZW-based _compress_.) | |
| 20 | - | |
| 21 | -Duplicated strings are found using a hash table. All input strings of | |
| 22 | -length 3 are inserted in the hash table. A hash index is computed for | |
| 23 | -the next 3 bytes. If the hash chain for this index is not empty, all | |
| 24 | -strings in the chain are compared with the current input string, and | |
| 25 | -the longest match is selected. | |
| 26 | - | |
| 27 | -The hash chains are searched starting with the most recent strings, to | |
| 28 | -favor small distances and thus take advantage of the Huffman encoding. | |
| 29 | -The hash chains are singly linked. There are no deletions from the | |
| 30 | -hash chains, the algorithm simply discards matches that are too old. | |
| 31 | - | |
| 32 | -To avoid a worst-case situation, very long hash chains are arbitrarily | |
| 33 | -truncated at a certain length, determined by a runtime option (level | |
| 34 | -parameter of deflateInit). So deflate() does not always find the longest | |
| 35 | -possible match but generally finds a match which is long enough. | |
| 36 | - | |
| 37 | -deflate() also defers the selection of matches with a lazy evaluation | |
| 38 | -mechanism. After a match of length N has been found, deflate() searches for | |
| 39 | -a longer match at the next input byte. If a longer match is found, the | |
| 40 | -previous match is truncated to a length of one (thus producing a single | |
| 41 | -literal byte) and the process of lazy evaluation begins again. Otherwise, | |
| 42 | -the original match is kept, and the next match search is attempted only N | |
| 43 | -steps later. | |
| 44 | - | |
| 45 | -The lazy match evaluation is also subject to a runtime parameter. If | |
| 46 | -the current match is long enough, deflate() reduces the search for a longer | |
| 47 | -match, thus speeding up the whole process. If compression ratio is more | |
| 48 | -important than speed, deflate() attempts a complete second search even if | |
| 49 | -the first match is already long enough. | |
| 50 | - | |
| 51 | -The lazy match evaluation is not performed for the fastest compression | |
| 52 | -modes (level parameter 1 to 3). For these fast modes, new strings | |
| 53 | -are inserted in the hash table only when no match was found, or | |
| 54 | -when the match is not too long. This degrades the compression ratio | |
| 55 | -but saves time since there are both fewer insertions and fewer searches. | |
| 56 | - | |
| 57 | - | |
| 58 | -2. Decompression algorithm (inflate) | |
| 59 | - | |
| 60 | -2.1 Introduction | |
| 61 | - | |
| 62 | -The key question is how to represent a Huffman code (or any prefix code) so | |
| 63 | -that you can decode fast. The most important characteristic is that shorter | |
| 64 | -codes are much more common than longer codes, so pay attention to decoding the | |
| 65 | -short codes fast, and let the long codes take longer to decode. | |
| 66 | - | |
| 67 | -inflate() sets up a first level table that covers some number of bits of | |
| 68 | -input less than the length of longest code. It gets that many bits from the | |
| 69 | -stream, and looks it up in the table. The table will tell if the next | |
| 70 | -code is that many bits or less and how many, and if it is, it will tell | |
| 71 | -the value, else it will point to the next level table for which inflate() | |
| 72 | -grabs more bits and tries to decode a longer code. | |
| 73 | - | |
| 74 | -How many bits to make the first lookup is a tradeoff between the time it | |
| 75 | -takes to decode and the time it takes to build the table. If building the | |
| 76 | -table took no time (and if you had infinite memory), then there would only | |
| 77 | -be a first level table to cover all the way to the longest code. However, | |
| 78 | -building the table ends up taking a lot longer for more bits since short | |
| 79 | -codes are replicated many times in such a table. What inflate() does is | |
| 80 | -simply to make the number of bits in the first table a variable, and then | |
| 81 | -to set that variable for the maximum speed. | |
| 82 | - | |
| 83 | -For inflate, which has 286 possible codes for the literal/length tree, the size | |
| 84 | -of the first table is nine bits. Also the distance trees have 30 possible | |
| 85 | -values, and the size of the first table is six bits. Note that for each of | |
| 86 | -those cases, the table ended up one bit longer than the ``average'' code | |
| 87 | -length, i.e. the code length of an approximately flat code which would be a | |
| 88 | -little more than eight bits for 286 symbols and a little less than five bits | |
| 89 | -for 30 symbols. | |
| 90 | - | |
| 91 | - | |
| 92 | -2.2 More details on the inflate table lookup | |
| 93 | - | |
| 94 | -Ok, you want to know what this cleverly obfuscated inflate tree actually | |
| 95 | -looks like. You are correct that it's not a Huffman tree. It is simply a | |
| 96 | -lookup table for the first, let's say, nine bits of a Huffman symbol. The | |
| 97 | -symbol could be as short as one bit or as long as 15 bits. If a particular | |
| 98 | -symbol is shorter than nine bits, then that symbol's translation is duplicated | |
| 99 | -in all those entries that start with that symbol's bits. For example, if the | |
| 100 | -symbol is four bits, then it's duplicated 32 times in a nine-bit table. If a | |
| 101 | -symbol is nine bits long, it appears in the table once. | |
| 102 | - | |
| 103 | -If the symbol is longer than nine bits, then that entry in the table points | |
| 104 | -to another similar table for the remaining bits. Again, there are duplicated | |
| 105 | -entries as needed. The idea is that most of the time the symbol will be short | |
| 106 | -and there will only be one table look up. (That's whole idea behind data | |
| 107 | -compression in the first place.) For the less frequent long symbols, there | |
| 108 | -will be two lookups. If you had a compression method with really long | |
| 109 | -symbols, you could have as many levels of lookups as is efficient. For | |
| 110 | -inflate, two is enough. | |
| 111 | - | |
| 112 | -So a table entry either points to another table (in which case nine bits in | |
| 113 | -the above example are gobbled), or it contains the translation for the symbol | |
| 114 | -and the number of bits to gobble. Then you start again with the next | |
| 115 | -ungobbled bit. | |
| 116 | - | |
| 117 | -You may wonder: why not just have one lookup table for how ever many bits the | |
| 118 | -longest symbol is? The reason is that if you do that, you end up spending | |
| 119 | -more time filling in duplicate symbol entries than you do actually decoding. | |
| 120 | -At least for deflate's output that generates new trees every several 10's of | |
| 121 | -kbytes. You can imagine that filling in a 2^15 entry table for a 15-bit code | |
| 122 | -would take too long if you're only decoding several thousand symbols. At the | |
| 123 | -other extreme, you could make a new table for every bit in the code. In fact, | |
| 124 | -that's essentially a Huffman tree. But then you spend too much time | |
| 125 | -traversing the tree while decoding, even for short symbols. | |
| 126 | - | |
| 127 | -So the number of bits for the first lookup table is a trade of the time to | |
| 128 | -fill out the table vs. the time spent looking at the second level and above of | |
| 129 | -the table. | |
| 130 | - | |
| 131 | -Here is an example, scaled down: | |
| 132 | - | |
| 133 | -The code being decoded, with 10 symbols, from 1 to 6 bits long: | |
| 134 | - | |
| 135 | -A: 0 | |
| 136 | -B: 10 | |
| 137 | -C: 1100 | |
| 138 | -D: 11010 | |
| 139 | -E: 11011 | |
| 140 | -F: 11100 | |
| 141 | -G: 11101 | |
| 142 | -H: 11110 | |
| 143 | -I: 111110 | |
| 144 | -J: 111111 | |
| 145 | - | |
| 146 | -Let's make the first table three bits long (eight entries): | |
| 147 | - | |
| 148 | -000: A,1 | |
| 149 | -001: A,1 | |
| 150 | -010: A,1 | |
| 151 | -011: A,1 | |
| 152 | -100: B,2 | |
| 153 | -101: B,2 | |
| 154 | -110: -> table X (gobble 3 bits) | |
| 155 | -111: -> table Y (gobble 3 bits) | |
| 156 | - | |
| 157 | -Each entry is what the bits decode as and how many bits that is, i.e. how | |
| 158 | -many bits to gobble. Or the entry points to another table, with the number of | |
| 159 | -bits to gobble implicit in the size of the table. | |
| 160 | - | |
| 161 | -Table X is two bits long since the longest code starting with 110 is five bits | |
| 162 | -long: | |
| 163 | - | |
| 164 | -00: C,1 | |
| 165 | -01: C,1 | |
| 166 | -10: D,2 | |
| 167 | -11: E,2 | |
| 168 | - | |
| 169 | -Table Y is three bits long since the longest code starting with 111 is six | |
| 170 | -bits long: | |
| 171 | - | |
| 172 | -000: F,2 | |
| 173 | -001: F,2 | |
| 174 | -010: G,2 | |
| 175 | -011: G,2 | |
| 176 | -100: H,2 | |
| 177 | -101: H,2 | |
| 178 | -110: I,3 | |
| 179 | -111: J,3 | |
| 180 | - | |
| 181 | -So what we have here are three tables with a total of 20 entries that had to | |
| 182 | -be constructed. That's compared to 64 entries for a single table. Or | |
| 183 | -compared to 16 entries for a Huffman tree (six two entry tables and one four | |
| 184 | -entry table). Assuming that the code ideally represents the probability of | |
| 185 | -the symbols, it takes on the average 1.25 lookups per symbol. That's compared | |
| 186 | -to one lookup for the single table, or 1.66 lookups per symbol for the | |
| 187 | -Huffman tree. | |
| 188 | - | |
| 189 | -There, I think that gives you a picture of what's going on. For inflate, the | |
| 190 | -meaning of a particular symbol is often more than just a letter. It can be a | |
| 191 | -byte (a "literal"), or it can be either a length or a distance which | |
| 192 | -indicates a base value and a number of bits to fetch after the code that is | |
| 193 | -added to the base value. Or it might be the special end-of-block code. The | |
| 194 | -data structures created in inftrees.c try to encode all that information | |
| 195 | -compactly in the tables. | |
| 196 | - | |
| 197 | - | |
| 198 | -Jean-loup Gailly Mark Adler | |
| 199 | -[email protected] [email protected] | |
| 200 | - | |
| 201 | - | |
| 202 | -References: | |
| 203 | - | |
| 204 | -[LZ77] Ziv J., Lempel A., ``A Universal Algorithm for Sequential Data | |
| 205 | -Compression,'' IEEE Transactions on Information Theory, Vol. 23, No. 3, | |
| 206 | -pp. 337-343. | |
| 207 | - | |
| 208 | -``DEFLATE Compressed Data Format Specification'' available in | |
| 209 | -http://tools.ietf.org/html/rfc1951 |
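The scaled-down example in section 2.2 of the deleted algorithm.txt can be checked with a short program. What follows is a minimal Python sketch, not zlib's implementation: the tables are copied from the text, while the names `ROOT`, `TABLE_X`, `TABLE_Y`, `peek`, `encode`, `decode`, and `expected_lookups` are hypothetical, and bits are modeled as a string of "0" and "1" characters purely for readability.

```python
# Code from the example: 10 symbols, 1 to 6 bits long.
CODES = {
    "A": "0", "B": "10", "C": "1100", "D": "11010", "E": "11011",
    "F": "11100", "G": "11101", "H": "11110", "I": "111110", "J": "111111",
}

# Second-level tables: remaining bits -> (symbol, bits to gobble).
TABLE_X = {"00": ("C", 1), "01": ("C", 1), "10": ("D", 2), "11": ("E", 2)}
TABLE_Y = {
    "000": ("F", 2), "001": ("F", 2), "010": ("G", 2), "011": ("G", 2),
    "100": ("H", 2), "101": ("H", 2), "110": ("I", 3), "111": ("J", 3),
}

# First-level table, three bits wide. An entry is either
# ("sym", symbol, bits to gobble) or ("sub", table, width of that table).
ROOT_BITS = 3
ROOT = {
    "000": ("sym", "A", 1), "001": ("sym", "A", 1),
    "010": ("sym", "A", 1), "011": ("sym", "A", 1),
    "100": ("sym", "B", 2), "101": ("sym", "B", 2),
    "110": ("sub", TABLE_X, 2),
    "111": ("sub", TABLE_Y, 3),
}


def peek(bits, pos, n):
    """Return n bits starting at pos, zero-padded past the end of input."""
    return bits[pos:pos + n].ljust(n, "0")


def encode(symbols):
    """Concatenate the code of each symbol."""
    return "".join(CODES[s] for s in symbols)


def decode(bits):
    """Decode a bit string with at most two table lookups per symbol."""
    out = []
    pos = 0
    while pos < len(bits):
        kind, a, b = ROOT[peek(bits, pos, ROOT_BITS)]
        if kind == "sub":
            pos += ROOT_BITS                      # gobble the root bits
            symbol, used = a[peek(bits, pos, b)]  # second lookup
        else:
            symbol, used = a, b
        pos += used
        if pos > len(bits):
            raise ValueError("input ends in the middle of a code")
        out.append(symbol)
    return "".join(out)


def expected_lookups():
    """Average lookups per symbol if a code of length n has probability 2**-n."""
    return sum(
        2.0 ** -len(code) * (1 if len(code) <= ROOT_BITS else 2)
        for code in CODES.values()
    )
```

Under the text's assumption that the code ideally represents the symbol probabilities, `expected_lookups()` reproduces the 1.25 lookups per symbol quoted for the three-table layout: only A and B, with combined probability 0.75, resolve in the first table.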
D compat/zlib/doc/rfc1950.txt  -630
| --- a/compat/zlib/doc/rfc1950.txt | ||
| +++ b/compat/zlib/doc/rfc1950.txt | ||
| @@ -1,630 +0,0 @@ | ||
| 1 | - | |
| 2 | - | |
| 3 | - | |
| 4 | - | |
| 5 | - | |
| 6 | - | |
| 7 | -Network Working Group P. Deutsch | |
| 8 | -Request for Comments: 1950 Aladdin Enterprises | |
| 9 | -Category: Informational J-L. Gailly | |
| 10 | - Info-ZIP | |
| 11 | - May 1996 | |
| 12 | - | |
| 13 | - | |
| 14 | - ZLIB Compressed Data Format Specification version 3.3 | |
| 15 | - | |
| 16 | -Status of This Memo | |
| 17 | - | |
| 18 | - This memo provides information for the Internet community. This memo | |
| 19 | - does not specify an Internet standard of any kind. Distribution of | |
| 20 | - this memo is unlimited. | |
| 21 | - | |
| 22 | -IESG Note: | |
| 23 | - | |
| 24 | - The IESG takes no position on the validity of any Intellectual | |
| 25 | - Property Rights statements contained in this document. | |
| 26 | - | |
| 27 | -Notices | |
| 28 | - | |
| 29 | - Copyright (c) 1996 L. Peter Deutsch and Jean-Loup Gailly | |
| 30 | - | |
| 31 | - Permission is granted to copy and distribute this document for any | |
| 32 | - purpose and without charge, including translations into other | |
| 33 | - languages and incorporation into compilations, provided that the | |
| 34 | - copyright notice and this notice are preserved, and that any | |
| 35 | - substantive changes or deletions from the original are clearly | |
| 36 | - marked. | |
| 37 | - | |
| 38 | - A pointer to the latest version of this and related documentation in | |
| 39 | - HTML format can be found at the URL | |
| 40 | - <ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html>. | |
| 41 | - | |
| 42 | -Abstract | |
| 43 | - | |
| 44 | - This specification defines a lossless compressed data format. The | |
| 45 | - data can be produced or consumed, even for an arbitrarily long | |
| 46 | - sequentially presented input data stream, using only an a priori | |
| 47 | - bounded amount of intermediate storage. The format presently uses | |
| 48 | - the DEFLATE compression method but can be easily extended to use | |
| 49 | - other compression methods. It can be implemented readily in a manner | |
| 50 | - not covered by patents. This specification also defines the ADLER-32 | |
| 51 | - checksum (an extension and improvement of the Fletcher checksum), | |
| 52 | - used for detection of data corruption, and provides an algorithm for | |
| 53 | - computing it. | |
| 54 | - | |
| 55 | - | |
| 56 | - | |
| 57 | - | |
| 58 | -Deutsch & Gailly Informational [Page 1] | |
| 59 | - | |
| 60 | - | |
| 61 | -RFC 1950 ZLIB Compressed Data Format Specification May 1996 | |
| 62 | - | |
| 63 | - | |
| 64 | -Table of Contents | |
| 65 | - | |
| 66 | - 1. Introduction ................................................... 2 | |
| 67 | - 1.1. Purpose ................................................... 2 | |
| 68 | - 1.2. Intended audience ......................................... 3 | |
| 69 | - 1.3. Scope ..................................................... 3 | |
| 70 | - 1.4. Compliance ................................................ 3 | |
| 71 | - 1.5. Definitions of terms and conventions used ................ 3 | |
| 72 | - 1.6. Changes from previous versions ............................ 3 | |
| 73 | - 2. Detailed specification ......................................... 3 | |
| 74 | - 2.1. Overall conventions ....................................... 3 | |
| 75 | - 2.2. Data format ............................................... 4 | |
| 76 | - 2.3. Compliance ................................................ 7 | |
| 77 | - 3. References ..................................................... 7 | |
| 78 | - 4. Source code .................................................... 8 | |
| 79 | - 5. Security Considerations ........................................ 8 | |
| 80 | - 6. Acknowledgements ............................................... 8 | |
| 81 | - 7. Authors' Addresses ............................................. 8 | |
| 82 | - 8. Appendix: Rationale ............................................ 9 | |
| 83 | - 9. Appendix: Sample code ..........................................10 | |
| 84 | - | |
| 85 | -1. Introduction | |
| 86 | - | |
| 87 | - 1.1. Purpose | |
| 88 | - | |
| 89 | - The purpose of this specification is to define a lossless | |
| 90 | - compressed data format that: | |
| 91 | - | |
| 92 | - * Is independent of CPU type, operating system, file system, | |
| 93 | - and character set, and hence can be used for interchange; | |
| 94 | - | |
| 95 | - * Can be produced or consumed, even for an arbitrarily long | |
| 96 | - sequentially presented input data stream, using only an a | |
| 97 | - priori bounded amount of intermediate storage, and hence can | |
| 98 | - be used in data communications or similar structures such as | |
| 99 | - Unix filters; | |
| 100 | - | |
| 101 | - * Can use a number of different compression methods; | |
| 102 | - | |
| 103 | - * Can be implemented readily in a manner not covered by | |
| 104 | - patents, and hence can be practiced freely. | |
| 105 | - | |
| 106 | - The data format defined by this specification does not attempt to | |
| 107 | - allow random access to compressed data. | |
| 108 | - | |
| 109 | - | |
| 110 | - | |
| 111 | - | |
| 112 | - | |
| 113 | - | |
| 114 | - | |
| 115 | -Deutsch & Gailly Informational [Page 2] | |
| 116 | - | |
| 117 | - | |
| 118 | -RFC 1950 ZLIB Compressed Data Format Specification May 1996 | |
| 119 | - | |
| 120 | - | |
| 121 | - 1.2. Intended audience | |
| 122 | - | |
| 123 | - This specification is intended for use by implementors of software | |
| 124 | - to compress data into zlib format and/or decompress data from zlib | |
| 125 | - format. | |
| 126 | - | |
| 127 | - The text of the specification assumes a basic background in | |
| 128 | - programming at the level of bits and other primitive data | |
| 129 | - representations. | |
| 130 | - | |
| 131 | - 1.3. Scope | |
| 132 | - | |
| 133 | - The specification specifies a compressed data format that can be | |
| 134 | - used for in-memory compression of a sequence of arbitrary bytes. | |
| 135 | - | |
| 136 | - 1.4. Compliance | |
| 137 | - | |
| 138 | - Unless otherwise indicated below, a compliant decompressor must be | |
| 139 | - able to accept and decompress any data set that conforms to all | |
| 140 | - the specifications presented here; a compliant compressor must | |
| 141 | - produce data sets that conform to all the specifications presented | |
| 142 | - here. | |
| 143 | - | |
| 144 | - 1.5. Definitions of terms and conventions used | |
| 145 | - | |
| 146 | - byte: 8 bits stored or transmitted as a unit (same as an octet). | |
| 147 | - (For this specification, a byte is exactly 8 bits, even on | |
| 148 | - machines which store a character on a number of bits different | |
| 149 | - from 8.) See below for the numbering of bits within a byte. | |
| 150 | - | |
| 151 | - 1.6. Changes from previous versions | |
| 152 | - | |
| 153 | - Version 3.1 was the first public release of this specification. | |
| 154 | - In version 3.2, some terminology was changed and the Adler-32 | |
| 155 | - sample code was rewritten for clarity. In version 3.3, the | |
| 156 | - support for a preset dictionary was introduced, and the | |
| 157 | - specification was converted to RFC style. | |
| 158 | - | |
| 159 | -2. Detailed specification | |
| 160 | - | |
| 161 | - 2.1. Overall conventions | |
| 162 | - | |
| 163 | - In the diagrams below, a box like this: | |
| 164 | - | |
| 165 | - +---+ | |
| 166 | - | | <-- the vertical bars might be missing | |
| 167 | - +---+ | |
| 168 | - | |
| 178 | - represents one byte; a box like this: | |
| 179 | - | |
| 180 | - +==============+ | |
| 181 | - | | | |
| 182 | - +==============+ | |
| 183 | - | |
| 184 | - represents a variable number of bytes. | |
| 185 | - | |
| 186 | - Bytes stored within a computer do not have a "bit order", since | |
| 187 | - they are always treated as a unit. However, a byte considered as | |
| 188 | - an integer between 0 and 255 does have a most- and least- | |
| 189 | - significant bit, and since we write numbers with the most- | |
| 190 | - significant digit on the left, we also write bytes with the most- | |
| 191 | - significant bit on the left. In the diagrams below, we number the | |
| 192 | - bits of a byte so that bit 0 is the least-significant bit, i.e., | |
| 193 | - the bits are numbered: | |
| 194 | - | |
| 195 | - +--------+ | |
| 196 | - |76543210| | |
| 197 | - +--------+ | |
| 198 | - | |
| 199 | - Within a computer, a number may occupy multiple bytes. All | |
| 200 | - multi-byte numbers in the format described here are stored with | |
| 201 | - the MOST-significant byte first (at the lower memory address). | |
| 202 | - For example, the decimal number 520 is stored as: | |
| 203 | - | |
| 204 | - 0 1 | |
| 205 | - +--------+--------+ | |
| 206 | - |00000010|00001000| | |
| 207 | - +--------+--------+ | |
| 208 | - ^ ^ | |
| 209 | - | | | |
| 210 | - | + less significant byte = 8 | |
| 211 | - + more significant byte = 2 x 256 | |
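The byte order described above can be sketched in C as follows. This is only an illustration under stated assumptions, not part of the specification: the helper names `read_be16` and `read_be32` are hypothetical.

```c
#include <stdint.h>

/* Read a 16-bit number stored most-significant byte first. */
uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Read a 32-bit number stored most-significant byte first
   (the layout used by the DICTID and ADLER32 fields). */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

For the two bytes 00000010 00001000 shown in the diagram, `read_be16` yields 2 x 256 + 8 = 520.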
| 212 | - | |
| 213 | - 2.2. Data format | |
| 214 | - | |
| 215 | - A zlib stream has the following structure: | |
| 216 | - | |
| 217 | - 0 1 | |
| 218 | - +---+---+ | |
| 219 | - |CMF|FLG| (more-->) | |
| 220 | - +---+---+ | |
| 221 | - | |
| 235 | - (if FLG.FDICT set) | |
| 236 | - | |
| 237 | - 0 1 2 3 | |
| 238 | - +---+---+---+---+ | |
| 239 | - | DICTID | (more-->) | |
| 240 | - +---+---+---+---+ | |
| 241 | - | |
| 242 | - +=====================+---+---+---+---+ | |
| 243 | - |...compressed data...| ADLER32 | | |
| 244 | - +=====================+---+---+---+---+ | |
| 245 | - | |
| 246 | - Any data which may appear after ADLER32 are not part of the zlib | |
| 247 | - stream. | |
| 248 | - | |
| 249 | - CMF (Compression Method and flags) | |
| 250 | - This byte is divided into a 4-bit compression method and a 4- | |
| 251 | - bit information field depending on the compression method. | |
| 252 | - | |
| 253 | - bits 0 to 3 CM Compression method | |
| 254 | - bits 4 to 7 CINFO Compression info | |
| 255 | - | |
| 256 | - CM (Compression method) | |
| 257 | - This identifies the compression method used in the file. CM = 8 | |
| 258 | - denotes the "deflate" compression method with a window size up | |
| 259 | - to 32K. This is the method used by gzip and PNG (see | |
| 260 | - references [1] and [2] in Chapter 3, below, for the reference | |
| 261 | - documents). CM = 15 is reserved. It might be used in a future | |
| 262 | - version of this specification to indicate the presence of an | |
| 263 | - extra field before the compressed data. | |
| 264 | - | |
| 265 | - CINFO (Compression info) | |
| 266 | - For CM = 8, CINFO is the base-2 logarithm of the LZ77 window | |
| 267 | - size, minus eight (CINFO=7 indicates a 32K window size). Values | |
| 268 | - of CINFO above 7 are not allowed in this version of the | |
| 269 | - specification. CINFO is not defined in this specification for | |
| 270 | - CM not equal to 8. | |
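As a rough sketch of how the CMF byte might be unpacked, assuming the bit layout given above: the struct `cmf_fields` and the function `decode_cmf` are hypothetical names introduced for illustration only.

```c
#include <stdint.h>

/* Fields packed into the CMF byte. */
struct cmf_fields {
    unsigned cm;          /* bits 0 to 3: compression method */
    unsigned cinfo;       /* bits 4 to 7: compression info */
    uint32_t window_size; /* LZ77 window in bytes, or 0 when not defined */
};

struct cmf_fields decode_cmf(unsigned char cmf)
{
    struct cmf_fields f;

    f.cm = cmf & 0x0Fu;
    f.cinfo = (cmf >> 4) & 0x0Fu;
    /* CINFO = log2(window size) - 8, defined only for CM = 8 and CINFO <= 7. */
    f.window_size = (f.cm == 8 && f.cinfo <= 7)
                        ? (uint32_t)1 << (f.cinfo + 8)
                        : 0;
    return f;
}
```

A CMF byte of 0x78 decodes to CM = 8 and CINFO = 7, that is, a window of 2^15 = 32768 bytes.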
| 271 | - | |
| 272 | - FLG (FLaGs) | |
| 273 | - This flag byte is divided as follows: | |
| 274 | - | |
| 275 | - bits 0 to 4 FCHECK (check bits for CMF and FLG) | |
| 276 | - bit 5 FDICT (preset dictionary) | |
| 277 | - bits 6 to 7 FLEVEL (compression level) | |
| 278 | - | |
| 279 | - The FCHECK value must be such that CMF and FLG, when viewed as | |
| 280 | - a 16-bit unsigned integer stored in MSB order (CMF*256 + FLG), | |
| 281 | - is a multiple of 31. | |
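The FCHECK rule can be illustrated with two small helpers. This is a sketch under the bit layout listed above; `fcheck_ok` and `make_flg` are hypothetical names, not functions of any particular library.

```c
/* Return nonzero if the CMF/FLG pair passes the FCHECK test. */
int fcheck_ok(unsigned char cmf, unsigned char flg)
{
    return ((unsigned)cmf * 256u + flg) % 31u == 0;
}

/* Build a FLG byte from FLEVEL (0..3) and FDICT (0 or 1), choosing the
   FCHECK bits so that CMF*256 + FLG is a multiple of 31. */
unsigned char make_flg(unsigned char cmf, unsigned flevel, unsigned fdict)
{
    unsigned flg = ((flevel & 3u) << 6) | ((fdict & 1u) << 5);
    unsigned rem = ((unsigned)cmf * 256u + flg) % 31u;

    if (rem != 0)
        flg += 31u - rem;  /* at most 30, so it fits in bits 0 to 4 */
    return (unsigned char)flg;
}
```

With CMF = 0x78 and no preset dictionary, this produces the FLG values 0x01, 0x5E, 0x9C and 0xDA for FLEVEL 0 through 3.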
| 282 | - | |
| 292 | - FDICT (Preset dictionary) | |
| 293 | - If FDICT is set, a DICTID dictionary identifier is present | |
| 294 | - immediately after the FLG byte. The dictionary is a sequence of | |
| 295 | - bytes which are initially fed to the compressor without | |
| 296 | - producing any compressed output. DICTID is the Adler-32 checksum | |
| 297 | - of this sequence of bytes (see the definition of ADLER32 | |
| 298 | - below). The decompressor can use this identifier to determine | |
| 299 | - which dictionary has been used by the compressor. | |
| 300 | - | |
| 301 | - FLEVEL (Compression level) | |
| 302 | - These flags are available for use by specific compression | |
| 303 | - methods. The "deflate" method (CM = 8) sets these flags as | |
| 304 | - follows: | |
| 305 | - | |
| 306 | - 0 - compressor used fastest algorithm | |
| 307 | - 1 - compressor used fast algorithm | |
| 308 | - 2 - compressor used default algorithm | |
| 309 | - 3 - compressor used maximum compression, slowest algorithm | |
| 310 | - | |
| 311 | - The information in FLEVEL is not needed for decompression; it | |
| 312 | - is there to indicate if recompression might be worthwhile. | |
| 313 | - | |
| 314 | - compressed data | |
| 315 | - For compression method 8, the compressed data is stored in the | |
| 316 | - deflate compressed data format as described in the document | |
| 317 | - "DEFLATE Compressed Data Format Specification" by L. Peter | |
| 318 | - Deutsch. (See reference [3] in Chapter 3, below.) | |
| 319 | - | |
| 320 | - Other compressed data formats are not specified in this version | |
| 321 | - of the zlib specification. | |
| 322 | - | |
| 323 | - ADLER32 (Adler-32 checksum) | |
| 324 | - This contains a checksum value of the uncompressed data | |
| 325 | - (excluding any dictionary data) computed according to Adler-32 | |
| 326 | - algorithm. This algorithm is a 32-bit extension and improvement | |
| 327 | - of the Fletcher algorithm, used in the ITU-T X.224 / ISO 8073 | |
| 328 | - standard. (See references [4] and [5] in Chapter 3, below.) | |
| 329 | - | |
| 330 | - Adler-32 is composed of two sums accumulated per byte: s1 is | |
| 331 | - the sum of all bytes, s2 is the sum of all s1 values. Both sums | |
| 332 | - are done modulo 65521. s1 is initialized to 1, s2 to zero. The | |
| 333 | - Adler-32 checksum is stored as s2*65536 + s1 in most- | |
| 334 | - significant-byte first (network) order. | |
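The storage rule for the checksum can be sketched as below. The function names `adler32_value` and `store_adler32` are hypothetical, and the sketch assumes s1 and s2 have already been reduced modulo 65521.

```c
#include <stdint.h>

/* Combine the two running sums into the 32-bit value s2*65536 + s1. */
uint32_t adler32_value(uint32_t s1, uint32_t s2)
{
    return (s2 << 16) | s1;
}

/* Write the checksum in most-significant-byte first (network) order,
   as it appears in the ADLER32 field at the end of a zlib stream. */
void store_adler32(uint32_t adler, unsigned char out[4])
{
    out[0] = (unsigned char)(adler >> 24);
    out[1] = (unsigned char)(adler >> 16);
    out[2] = (unsigned char)(adler >> 8);
    out[3] = (unsigned char)adler;
}
```

For example, s1 = 920 and s2 = 4582 combine to 0x11E60398, which is stored as the four bytes 11 E6 03 98.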
| 335 | - | |
| 349 | - 2.3. Compliance | |
| 350 | - | |
| 351 | - A compliant compressor must produce streams with correct CMF, FLG | |
| 352 | - and ADLER32, but need not support preset dictionaries. When the | |
| 353 | - zlib data format is used as part of another standard data format, | |
| 354 | - the compressor may use only preset dictionaries that are specified | |
| 355 | - by this other data format. If this other format does not use the | |
| 356 | - preset dictionary feature, the compressor must not set the FDICT | |
| 357 | - flag. | |
| 358 | - | |
| 359 | - A compliant decompressor must check CMF, FLG, and ADLER32, and | |
| 360 | - provide an error indication if any of these have incorrect values. | |
| 361 | - A compliant decompressor must give an error indication if CM is | |
| 362 | - not one of the values defined in this specification (only the | |
| 363 | - value 8 is permitted in this version), since another value could | |
| 364 | - indicate the presence of new features that would cause subsequent | |
| 365 | - data to be interpreted incorrectly. A compliant decompressor must | |
| 366 | - give an error indication if FDICT is set and DICTID is not the | |
| 367 | - identifier of a known preset dictionary. A decompressor may | |
| 368 | - ignore FLEVEL and still be compliant. When the zlib data format | |
| 369 | - is being used as a part of another standard format, a compliant | |
| 370 | - decompressor must support all the preset dictionaries specified by | |
| 371 | - the other format. When the other format does not use the preset | |
| 372 | - dictionary feature, a compliant decompressor must reject any | |
| 373 | - stream in which the FDICT flag is set. | |
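The header checks required of a decompressor might be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the enum `zhdr_status`, the function `check_zlib_header`, and the caller-supplied list of known dictionary identifiers are all hypothetical. The ADLER32 check is not shown, since it can only be made after the data has been decompressed.

```c
#include <stddef.h>
#include <stdint.h>

enum zhdr_status {
    ZHDR_OK = 0,
    ZHDR_TRUNCATED,    /* fewer bytes than the header needs */
    ZHDR_BAD_FCHECK,   /* CMF*256 + FLG is not a multiple of 31 */
    ZHDR_BAD_METHOD,   /* CM is not 8 */
    ZHDR_BAD_CINFO,    /* CINFO is above 7 */
    ZHDR_UNKNOWN_DICT  /* FDICT set, DICTID not a known dictionary */
};

/* Validate the first bytes of a zlib stream. dicts/ndicts list the
   Adler-32 identifiers of the preset dictionaries the caller knows. */
enum zhdr_status check_zlib_header(const unsigned char *p, size_t n,
                                   const uint32_t *dicts, size_t ndicts)
{
    size_t i;
    uint32_t dictid;

    if (n < 2) return ZHDR_TRUNCATED;
    if (((unsigned)p[0] * 256u + p[1]) % 31u != 0) return ZHDR_BAD_FCHECK;
    if ((p[0] & 0x0Fu) != 8) return ZHDR_BAD_METHOD;
    if ((p[0] >> 4) > 7) return ZHDR_BAD_CINFO;
    if ((p[1] & 0x20u) == 0) return ZHDR_OK;  /* FDICT clear */

    if (n < 6) return ZHDR_TRUNCATED;         /* DICTID follows FLG */
    dictid = ((uint32_t)p[2] << 24) | ((uint32_t)p[3] << 16) |
             ((uint32_t)p[4] << 8)  |  (uint32_t)p[5];
    for (i = 0; i < ndicts; i++)
        if (dicts[i] == dictid) return ZHDR_OK;
    return ZHDR_UNKNOWN_DICT;
}
```

Passing `ndicts` as 0 models an enclosing format that does not use preset dictionaries: every stream with FDICT set is then rejected.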
| 374 | - | |
| 375 | -3. References | |
| 376 | - | |
| 377 | - [1] Deutsch, L.P., "GZIP Compressed Data Format Specification", | |
| 378 | - available in ftp://ftp.uu.net/pub/archiving/zip/doc/ | |
| 379 | - | |
| 380 | - [2] Thomas Boutell, "PNG (Portable Network Graphics) specification", | |
| 381 | - available in ftp://ftp.uu.net/graphics/png/documents/ | |
| 382 | - | |
| 383 | - [3] Deutsch, L.P., "DEFLATE Compressed Data Format Specification", | |
| 384 | - available in ftp://ftp.uu.net/pub/archiving/zip/doc/ | |
| 385 | - | |
| 386 | - [4] Fletcher, J. G., "An Arithmetic Checksum for Serial | |
| 387 | - Transmissions," IEEE Transactions on Communications, Vol. COM-30, | |
| 388 | - No. 1, January 1982, pp. 247-252. | |
| 389 | - | |
| 390 | - [5] ITU-T Recommendation X.224, Annex D, "Checksum Algorithms," | |
| 391 | - November, 1993, pp. 144, 145. (Available from | |
| 392 | - gopher://info.itu.ch). ITU-T X.224 is also the same as ISO 8073. | |
| 393 | - | |
| 406 | -4. Source code | |
| 407 | - | |
| 408 | - Source code for a C language implementation of a "zlib" compliant | |
| 409 | - library is available at ftp://ftp.uu.net/pub/archiving/zip/zlib/. | |
| 410 | - | |
| 411 | -5. Security Considerations | |
| 412 | - | |
| 413 | - A decoder that fails to check the ADLER32 checksum value may be | |
| 414 | - subject to undetected data corruption. | |
| 415 | - | |
| 416 | -6. Acknowledgements | |
| 417 | - | |
| 418 | - Trademarks cited in this document are the property of their | |
| 419 | - respective owners. | |
| 420 | - | |
| 421 | - Jean-Loup Gailly and Mark Adler designed the zlib format and wrote | |
| 422 | - the related software described in this specification. Glenn | |
| 423 | - Randers-Pehrson converted this document to RFC and HTML format. | |
| 424 | - | |
| 425 | -7. Authors' Addresses | |
| 426 | - | |
| 427 | - L. Peter Deutsch | |
| 428 | - Aladdin Enterprises | |
| 429 | - 203 Santa Margarita Ave. | |
| 430 | - Menlo Park, CA 94025 | |
| 431 | - | |
| 432 | - Phone: (415) 322-0103 (AM only) | |
| 433 | - FAX: (415) 322-1734 | |
| 434 | - EMail: <[email protected]> | |
| 435 | - | |
| 436 | - | |
| 437 | - Jean-Loup Gailly | |
| 438 | - | |
| 439 | - EMail: <[email protected]> | |
| 440 | - | |
| 441 | - Questions about the technical content of this specification can be | |
| 442 | - sent by email to | |
| 443 | - | |
| 444 | - Jean-Loup Gailly <[email protected]> and | |
| 445 | - Mark Adler <[email protected]> | |
| 446 | - | |
| 447 | - Editorial comments on this specification can be sent by email to | |
| 448 | - | |
| 449 | - L. Peter Deutsch <[email protected]> and | |
| 450 | - Glenn Randers-Pehrson <[email protected]> | |
| 451 | - | |
| 452 | - | |
| 463 | -8. Appendix: Rationale | |
| 464 | - | |
| 465 | - 8.1. Preset dictionaries | |
| 466 | - | |
| 467 | - A preset dictionary is especially useful to compress short input | |
| 468 | - sequences. The compressor can take advantage of the dictionary | |
| 469 | - context to encode the input in a more compact manner. The | |
| 470 | - decompressor can be initialized with the appropriate context by | |
| 471 | - virtually decompressing a compressed version of the dictionary | |
| 472 | - without producing any output. However, for certain compression | |
| 473 | - algorithms, such as the deflate algorithm, this operation can be | |
| 474 | - achieved without actually performing any decompression. | |
| 475 | - | |
| 476 | - The compressor and the decompressor must use exactly the same | |
| 477 | - dictionary. The dictionary may be fixed or may be chosen among a | |
| 478 | - certain number of predefined dictionaries, according to the kind | |
| 479 | - of input data. The decompressor can determine which dictionary has | |
| 480 | - been chosen by the compressor by checking the dictionary | |
| 481 | - identifier. This document does not specify the contents of | |
| 482 | - predefined dictionaries, since the optimal dictionaries are | |
| 483 | - application specific. Standard data formats using this feature of | |
| 484 | - the zlib specification must precisely define the allowed | |
| 485 | - dictionaries. | |
| 486 | - | |
| 487 | - 8.2. The Adler-32 algorithm | |
| 488 | - | |
| 489 | - The Adler-32 algorithm is much faster than the CRC32 algorithm yet | |
| 490 | - still provides an extremely low probability of undetected errors. | |
| 491 | - | |
| 492 | - The modulo on unsigned long accumulators can be delayed for 5552 | |
| 493 | - bytes, so the modulo operation time is negligible. If the bytes | |
| 494 | - are a, b, c, the second sum is 3a + 2b + c + 3, and so is position | |
| 495 | - and order sensitive, unlike the first sum, which is just a | |
| 496 | - checksum. That 65521 is prime is important to avoid a possible | |
| 497 | - large class of two-byte errors that leave the check unchanged. | |
| 498 | - (The Fletcher checksum uses 255, which is not prime and which also | |
| 499 | - makes the Fletcher check insensitive to single byte changes 0 <-> | |
| 500 | - 255.) | |
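The delayed modulo described above might look like the following sketch. It assumes 32-bit unsigned accumulators; the name `adler32_deferred` and the constant names are hypothetical. The loop adds up to 5552 bytes before reducing, which is the largest count for which s2 cannot exceed 2^32 - 1.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u /* largest prime smaller than 65536 */
#define ADLER_NMAX 5552u  /* bytes that can be summed before reducing */

/* Update a running Adler-32 checksum, reducing modulo ADLER_BASE only
   once per block of at most ADLER_NMAX bytes. Start with adler = 1. */
uint32_t adler32_deferred(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = (adler >> 16) & 0xffffu;

    while (len > 0) {
        size_t chunk = len < ADLER_NMAX ? len : ADLER_NMAX;

        len -= chunk;
        while (chunk-- > 0) {
            s1 += *buf++;  /* no modulo inside the inner loop */
            s2 += s1;
        }
        s1 %= ADLER_BASE;
        s2 %= ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

For the three ASCII bytes "abc" (97, 98, 99), s1 = 97 + 98 + 99 + 1 = 295 and s2 = 3 x 97 + 2 x 98 + 99 + 3 = 589, so the checksum is 0x024D0127.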
| 501 | - | |
| 502 | - The sum s1 is initialized to 1 instead of zero to make the length | |
| 503 | - of the sequence part of s2, so that the length does not have to be | |
| 504 | - checked separately. (Any sequence of zeroes has a Fletcher | |
| 505 | - checksum of zero.) | |
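The effect of the initial value of s1 can be seen with a small experiment. The function `two_sum_check` below is a hypothetical helper that takes the starting value of s1 as a parameter (assumed to be less than 65521); it is a sketch for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-byte two-sum checksum with a caller-chosen initial s1
   (s1_init = 1 gives Adler-32; s2 always starts at zero). */
uint32_t two_sum_check(uint32_t s1_init, const unsigned char *buf, size_t len)
{
    uint32_t s1 = s1_init, s2 = 0;
    size_t n;

    for (n = 0; n < len; n++) {
        s1 = (s1 + buf[n]) % 65521u;
        s2 = (s2 + s1) % 65521u;
    }
    return (s2 << 16) | s1;
}
```

With s1 starting at zero, runs of 3 and 8 zero bytes both give 0. With s1 starting at 1, they give 0x00030001 and 0x00080001, so the length shows up in s2.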
| 506 | - | |
| 520 | -9. Appendix: Sample code | |
| 521 | - | |
| 522 | - The following C code computes the Adler-32 checksum of a data buffer. | |
| 523 | - It is written for clarity, not for speed. The sample code is in the | |
| 524 | - ANSI C programming language. Non-C users may find it easier to read | |
| 525 | - with these hints: | |
| 526 | - | |
| 527 | - & Bitwise AND operator. | |
| 528 | - >> Bitwise right shift operator. When applied to an | |
| 529 | - unsigned quantity, as here, right shift inserts zero bit(s) | |
| 530 | - at the left. | |
| 531 | - << Bitwise left shift operator. Left shift inserts zero | |
| 532 | - bit(s) at the right. | |
| 533 | - ++ "n++" increments the variable n. | |
| 534 | - % modulo operator: a % b is the remainder of a divided by b. | |
| 535 | - | |
| 536 | - #define BASE 65521 /* largest prime smaller than 65536 */ | |
| 537 | - | |
| 538 | - /* | |
| 539 | - Update a running Adler-32 checksum with the bytes buf[0..len-1] | |
| 540 | - and return the updated checksum. The Adler-32 checksum should be | |
| 541 | - initialized to 1. | |
| 542 | - | |
| 543 | - Usage example: | |
| 544 | - | |
| 545 | - unsigned long adler = 1L; | |
| 546 | - | |
| 547 | - while (read_buffer(buffer, length) != EOF) { | |
| 548 | - adler = update_adler32(adler, buffer, length); | |
| 549 | - } | |
| 550 | - if (adler != original_adler) error(); | |
| 551 | - */ | |
| 552 | - unsigned long update_adler32(unsigned long adler, | |
| 553 | - unsigned char *buf, int len) | |
| 554 | - { | |
| 555 | - unsigned long s1 = adler & 0xffff; | |
| 556 | - unsigned long s2 = (adler >> 16) & 0xffff; | |
| 557 | - int n; | |
| 558 | - | |
| 559 | - for (n = 0; n < len; n++) { | |
| 560 | - s1 = (s1 + buf[n]) % BASE; | |
| 561 | - s2 = (s2 + s1) % BASE; | |
| 562 | - } | |
| 563 | - return (s2 << 16) + s1; | |
| 564 | - } | |
| 565 | - | |
| 566 | - /* Return the adler32 of the bytes buf[0..len-1] */ | |
| 577 | - unsigned long adler32(unsigned char *buf, int len) | |
| 578 | - { | |
| 579 | - return update_adler32(1L, buf, len); | |
| 580 | - } | |
| --- a/compat/zlib/doc/rfc1950.txt | |
| +++ b/compat/zlib/doc/rfc1950.txt | |
| @@ -1,630 +0,0 @@ | |
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | Network Working Group P. Deutsch |
| 8 | Request for Comments: 1950 Aladdin Enterprises |
| 9 | Category: Informational J-L. Gailly |
| 10 | Info-ZIP |
| 11 | May 1996 |
| 12 | |
| 13 | |
| 14 | ZLIB Compressed Data Format Specification version 3.3 |
| 15 | |
| 16 | Status of This Memo |
| 17 | |
| 18 | This memo provides information for the Internet community. This memo |
| 19 | does not specify an Internet standard of any kind. Distribution of |
| 20 | this memo is unlimited. |
| 21 | |
| 22 | IESG Note: |
| 23 | |
| 24 | The IESG takes no position on the validity of any Intellectual |
| 25 | Property Rights statements contained in this document. |
| 26 | |
| 27 | Notices |
| 28 | |
| 29 | Copyright (c) 1996 L. Peter Deutsch and Jean-Loup Gailly |
| 30 | |
| 31 | Permission is granted to copy and distribute this document for any |
| 32 | purpose and without charge, including translations into other |
| 33 | languages and incorporation into compilations, provided that the |
| 34 | copyright notice and this notice are preserved, and that any |
| 35 | substantive changes or deletions from the original are clearly |
| 36 | marked. |
| 37 | |
| 38 | A pointer to the latest version of this and related documentation in |
| 39 | HTML format can be found at the URL |
| 40 | <ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html>. |
| 41 | |
| 42 | Abstract |
| 43 | |
| 44 | This specification defines a lossless compressed data format. The |
| 45 | data can be produced or consumed, even for an arbitrarily long |
| 46 | sequentially presented input data stream, using only an a priori |
| 47 | bounded amount of intermediate storage. The format presently uses |
| 48 | the DEFLATE compression method but can be easily extended to use |
| 49 | other compression methods. It can be implemented readily in a manner |
| 50 | not covered by patents. This specification also defines the ADLER-32 |
| 51 | checksum (an extension and improvement of the Fletcher checksum), |
| 52 | used for detection of data corruption, and provides an algorithm for |
| 53 | computing it. |
| 54 | |
| 55 | |
| 56 | |
| 57 | |
| 58 | Deutsch & Gailly Informational [Page 1] |
| 59 | |
| 60 | |
| 61 | RFC 1950 ZLIB Compressed Data Format Specification May 1996 |
| 62 | |
| 63 | |
| 64 | Table of Contents |
| 65 | |
| 66 | 1. Introduction ................................................... 2 |
| 67 | 1.1. Purpose ................................................... 2 |
| 68 | 1.2. Intended audience ......................................... 3 |
| 69 | 1.3. Scope ..................................................... 3 |
| 70 | 1.4. Compliance ................................................ 3 |
| 71 | 1.5. Definitions of terms and conventions used ................ 3 |
| 72 | 1.6. Changes from previous versions ............................ 3 |
| 73 | 2. Detailed specification ......................................... 3 |
| 74 | 2.1. Overall conventions ....................................... 3 |
| 75 | 2.2. Data format ............................................... 4 |
| 76 | 2.3. Compliance ................................................ 7 |
| 77 | 3. References ..................................................... 7 |
| 78 | 4. Source code .................................................... 8 |
| 79 | 5. Security Considerations ........................................ 8 |
| 80 | 6. Acknowledgements ............................................... 8 |
| 81 | 7. Authors' Addresses ............................................. 8 |
| 82 | 8. Appendix: Rationale ............................................ 9 |
| 83 | 9. Appendix: Sample code ..........................................10 |
| 84 | |
| 85 | 1. Introduction |
| 86 | |
| 87 | 1.1. Purpose |
| 88 | |
| 89 | The purpose of this specification is to define a lossless |
| 90 | compressed data format that: |
| 91 | |
| 92 | * Is independent of CPU type, operating system, file system, |
| 93 | and character set, and hence can be used for interchange; |
| 94 | |
| 95 | * Can be produced or consumed, even for an arbitrarily long |
| 96 | sequentially presented input data stream, using only an a |
| 97 | priori bounded amount of intermediate storage, and hence can |
| 98 | be used in data communications or similar structures such as |
| 99 | Unix filters; |
| 100 | |
| 101 | * Can use a number of different compression methods; |
| 102 | |
| 103 | * Can be implemented readily in a manner not covered by |
| 104 | patents, and hence can be practiced freely. |
| 105 | |
| 106 | The data format defined by this specification does not attempt to |
| 107 | allow random access to compressed data. |
| 108 | |
| 109 | |
| 110 | |
| 111 | |
| 112 | |
| 113 | |
| 114 | |
| 115 | Deutsch & Gailly Informational [Page 2] |
| 116 | |
| 117 | |
| 118 | RFC 1950 ZLIB Compressed Data Format Specification May 1996 |
| 119 | |
| 120 | |
| 121 | 1.2. Intended audience |
| 122 | |
| 123 | This specification is intended for use by implementors of software |
| 124 | to compress data into zlib format and/or decompress data from zlib |
| 125 | format. |
| 126 | |
| 127 | The text of the specification assumes a basic background in |
| 128 | programming at the level of bits and other primitive data |
| 129 | representations. |
| 130 | |
| 131 | 1.3. Scope |
| 132 | |
| 133 | The specification specifies a compressed data format that can be |
| 134 | used for in-memory compression of a sequence of arbitrary bytes. |
| 135 | |
| 136 | 1.4. Compliance |
| 137 | |
| 138 | Unless otherwise indicated below, a compliant decompressor must be |
| 139 | able to accept and decompress any data set that conforms to all |
| 140 | the specifications presented here; a compliant compressor must |
| 141 | produce data sets that conform to all the specifications presented |
| 142 | here. |
| 143 | |
| 144 | 1.5. Definitions of terms and conventions used |
| 145 | |
| 146 | byte: 8 bits stored or transmitted as a unit (same as an octet). |
| 147 | (For this specification, a byte is exactly 8 bits, even on |
| 148 | machines which store a character on a number of bits different |
| 149 | from 8.) See below, for the numbering of bits within a byte. |
| 150 | |
| 151 | 1.6. Changes from previous versions |
| 152 | |
| 153 | Version 3.1 was the first public release of this specification. |
| 154 | In version 3.2, some terminology was changed and the Adler-32 |
| 155 | sample code was rewritten for clarity. In version 3.3, the |
| 156 | support for a preset dictionary was introduced, and the |
| 157 | specification was converted to RFC style. |
| 158 | |
| 159 | 2. Detailed specification |
| 160 | |
| 161 | 2.1. Overall conventions |
| 162 | |
| 163 | In the diagrams below, a box like this: |
| 164 | |
| 165 | +---+ |
| 166 | | | <-- the vertical bars might be missing |
| 167 | +---+ |
| 168 | |
| 169 | |
| 170 | |
| 171 | |
| 172 | Deutsch & Gailly Informational [Page 3] |
| 173 | |
| 174 | |
| 175 | RFC 1950 ZLIB Compressed Data Format Specification May 1996 |
| 176 | |
| 177 | |
| 178 | represents one byte; a box like this: |
| 179 | |
| 180 | +==============+ |
| 181 | | | |
| 182 | +==============+ |
| 183 | |
| 184 | represents a variable number of bytes. |
| 185 | |
| 186 | Bytes stored within a computer do not have a "bit order", since |
| 187 | they are always treated as a unit. However, a byte considered as |
| 188 | an integer between 0 and 255 does have a most- and least- |
| 189 | significant bit, and since we write numbers with the most- |
| 190 | significant digit on the left, we also write bytes with the most- |
| 191 | significant bit on the left. In the diagrams below, we number the |
| 192 | bits of a byte so that bit 0 is the least-significant bit, i.e., |
| 193 | the bits are numbered: |
| 194 | |
| 195 | +--------+ |
| 196 | |76543210| |
| 197 | +--------+ |
| 198 | |
| 199 | Within a computer, a number may occupy multiple bytes. All |
| 200 | multi-byte numbers in the format described here are stored with |
| 201 | the MOST-significant byte first (at the lower memory address). |
| 202 | For example, the decimal number 520 is stored as: |
| 203 | |
| 204 | 0 1 |
| 205 | +--------+--------+ |
| 206 | |00000010|00001000| |
| 207 | +--------+--------+ |
| 208 | ^ ^ |
| 209 | | | |
| 210 | | + less significant byte = 8 |
| 211 | + more significant byte = 2 x 256 |
| 212 | |
| 213 | 2.2. Data format |
| 214 | |
| 215 | A zlib stream has the following structure: |
| 216 | |
| 217 | 0 1 |
| 218 | +---+---+ |
| 219 | |CMF|FLG| (more-->) |
| 220 | +---+---+ |
| 221 | |
| 222 | |
| 223 | |
| 224 | |
| 225 | |
| 226 | |
| 227 | |
| 228 | |
| 229 | Deutsch & Gailly Informational [Page 4] |
| 230 | |
| 231 | |
| 232 | RFC 1950 ZLIB Compressed Data Format Specification May 1996 |
| 233 | |
| 234 | |
| 235 | (if FLG.FDICT set) |
| 236 | |
| 237 | 0 1 2 3 |
| 238 | +---+---+---+---+ |
| 239 | | DICTID | (more-->) |
| 240 | +---+---+---+---+ |
| 241 | |
| 242 | +=====================+---+---+---+---+ |
| 243 | |...compressed data...| ADLER32 | |
| 244 | +=====================+---+---+---+---+ |
| 245 | |
| 246 | Any data which may appear after ADLER32 are not part of the zlib |
| 247 | stream. |
| 248 | |
| 249 | CMF (Compression Method and flags) |
| 250 | This byte is divided into a 4-bit compression method and a 4- |
| 251 | bit information field depending on the compression method. |
| 252 | |
| 253 | bits 0 to 3 CM Compression method |
| 254 | bits 4 to 7 CINFO Compression info |
| 255 | |
| 256 | CM (Compression method) |
| 257 | This identifies the compression method used in the file. CM = 8 |
| 258 | denotes the "deflate" compression method with a window size up |
| 259 | to 32K. This is the method used by gzip and PNG (see |
| 260 | references [1] and [2] in Chapter 3, below, for the reference |
| 261 | documents). CM = 15 is reserved. It might be used in a future |
| 262 | version of this specification to indicate the presence of an |
| 263 | extra field before the compressed data. |
| 264 | |
| 265 | CINFO (Compression info) |
| 266 | For CM = 8, CINFO is the base-2 logarithm of the LZ77 window |
| 267 | size, minus eight (CINFO=7 indicates a 32K window size). Values |
| 268 | of CINFO above 7 are not allowed in this version of the |
| 269 | specification. CINFO is not defined in this specification for |
| 270 | CM not equal to 8. |
| 271 | |
| 272 | FLG (FLaGs) |
| 273 | This flag byte is divided as follows: |
| 274 | |
| 275 | bits 0 to 4 FCHECK (check bits for CMF and FLG) |
| 276 | bit 5 FDICT (preset dictionary) |
| 277 | bits 6 to 7 FLEVEL (compression level) |
| 278 | |
| 279 | The FCHECK value must be such that CMF and FLG, when viewed as |
| 280 | a 16-bit unsigned integer stored in MSB order (CMF*256 + FLG), |
| 281 | is a multiple of 31. |
| 282 | |
| 283 | |
| 284 | |
| 285 | |
| 286 | Deutsch & Gailly Informational [Page 5] |
| 287 | |
| 288 | |
| 289 | RFC 1950 ZLIB Compressed Data Format Specification May 1996 |
| 290 | |
| 291 | |
| 292 | FDICT (Preset dictionary) |
| 293 | If FDICT is set, a DICT dictionary identifier is present |
| 294 | immediately after the FLG byte. The dictionary is a sequence of |
| 295 | bytes which are initially fed to the compressor without |
| 296 | producing any compressed output. DICT is the Adler-32 checksum |
| 297 | of this sequence of bytes (see the definition of ADLER32 |
| 298 | below). The decompressor can use this identifier to determine |
| 299 | which dictionary has been used by the compressor. |
| 300 | |
| 301 | FLEVEL (Compression level) |
| 302 | These flags are available for use by specific compression |
| 303 | methods. The "deflate" method (CM = 8) sets these flags as |
| 304 | follows: |
| 305 | |
| 306 | 0 - compressor used fastest algorithm |
| 307 | 1 - compressor used fast algorithm |
| 308 | 2 - compressor used default algorithm |
| 309 | 3 - compressor used maximum compression, slowest algorithm |
| 310 | |
| 311 | The information in FLEVEL is not needed for decompression; it |
| 312 | is there to indicate if recompression might be worthwhile. |
| 313 | |
| 314 | compressed data |
| 315 | For compression method 8, the compressed data is stored in the |
| 316 | deflate compressed data format as described in the document |
| 317 | "DEFLATE Compressed Data Format Specification" by L. Peter |
| 318 | Deutsch. (See reference [3] in Chapter 3, below) |
| 319 | |
| 320 | Other compressed data formats are not specified in this version |
| 321 | of the zlib specification. |
| 322 | |
| 323 | ADLER32 (Adler-32 checksum) |
| 324 | This contains a checksum value of the uncompressed data |
| 325 | (excluding any dictionary data) computed according to Adler-32 |
| 326 | algorithm. This algorithm is a 32-bit extension and improvement |
| 327 | of the Fletcher algorithm, used in the ITU-T X.224 / ISO 8073 |
| 328 | standard. (See references [4] and [5] in Chapter 3, below.) |
| 329 | |
| 330 | Adler-32 is composed of two sums accumulated per byte: s1 is |
| 331 | the sum of all bytes, s2 is the sum of all s1 values. Both sums |
| 332 | are done modulo 65521. s1 is initialized to 1, s2 to zero. The |
| 333 | Adler-32 checksum is stored as s2*65536 + s1 in most- |
| 334 | significant-byte first (network) order. |
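The storage order described above can be sketched in C as follows. This is only an illustration; `adler32_store_be` and `adler32_load_be` are hypothetical helper names, not functions from zlib.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write a 32-bit Adler-32 value (s2*65536 + s1)
   as four bytes, most-significant byte first. */
void adler32_store_be(unsigned long adler, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)((adler >> 24) & 0xff);
    out[1] = (unsigned char)((adler >> 16) & 0xff);
    out[2] = (unsigned char)((adler >> 8) & 0xff);
    out[3] = (unsigned char)(adler & 0xff);
}

/* Hypothetical helper: read the four stored bytes back into one value. */
unsigned long adler32_load_be(const unsigned char in[4])
{
    assert(in != NULL);
    return ((unsigned long)in[0] << 24) | ((unsigned long)in[1] << 16) |
           ((unsigned long)in[2] << 8)  |  (unsigned long)in[3];
}
```

For example, with s2 = 0x024D and s1 = 0x0127 the stored bytes are 02 4D 01 27.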
| 348 | |
| 349 | 2.3. Compliance |
| 350 | |
| 351 | A compliant compressor must produce streams with correct CMF, FLG |
| 352 | and ADLER32, but need not support preset dictionaries. When the |
| 353 | zlib data format is used as part of another standard data format, |
| 354 | the compressor may use only preset dictionaries that are specified |
| 355 | by this other data format. If this other format does not use the |
| 356 | preset dictionary feature, the compressor must not set the FDICT |
| 357 | flag. |
| 358 | |
| 359 | A compliant decompressor must check CMF, FLG, and ADLER32, and |
| 360 | provide an error indication if any of these have incorrect values. |
| 361 | A compliant decompressor must give an error indication if CM is |
| 362 | not one of the values defined in this specification (only the |
| 363 | value 8 is permitted in this version), since another value could |
| 364 | indicate the presence of new features that would cause subsequent |
| 365 | data to be interpreted incorrectly. A compliant decompressor must |
| 366 | give an error indication if FDICT is set and DICTID is not the |
| 367 | identifier of a known preset dictionary. A decompressor may |
| 368 | ignore FLEVEL and still be compliant. When the zlib data format |
| 369 | is being used as a part of another standard format, a compliant |
| 370 | decompressor must support all the preset dictionaries specified by |
| 371 | the other format. When the other format does not use the preset |
| 372 | dictionary feature, a compliant decompressor must reject any |
| 373 | stream in which the FDICT flag is set. |
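The header checks required of a decompressor might be ordered as in the sketch below. The enum and the function `zhdr_check` are hypothetical names introduced for illustration; the sketch assumes CM occupies the low four bits of CMF and FDICT is bit 5 of FLG, as defined in section 2.2. Checking DICTID against the known dictionaries and verifying ADLER32 are left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for the two-byte zlib header check. */
enum zhdr_status { ZHDR_OK, ZHDR_BAD_METHOD, ZHDR_BAD_FCHECK, ZHDR_NEEDS_DICT };

/* hdr points at the CMF byte, followed by the FLG byte. */
enum zhdr_status zhdr_check(const unsigned char *hdr)
{
    unsigned cmf, flg;

    assert(hdr != NULL);
    cmf = hdr[0];
    flg = hdr[1];
    if ((cmf & 0x0fu) != 8u)               /* only CM = 8 is defined */
        return ZHDR_BAD_METHOD;
    if ((cmf * 256u + flg) % 31u != 0)     /* FCHECK test */
        return ZHDR_BAD_FCHECK;
    if (flg & 0x20u)                       /* FDICT set: a DICTID follows */
        return ZHDR_NEEDS_DICT;
    return ZHDR_OK;
}
```

A caller that receives `ZHDR_NEEDS_DICT` would read the four DICTID bytes and report an error unless they identify a known preset dictionary.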
| 374 | |
| 375 | 3. References |
| 376 | |
| 377 | [1] Deutsch, L.P., "GZIP Compressed Data Format Specification", |
| 378 | available in ftp://ftp.uu.net/pub/archiving/zip/doc/ |
| 379 | |
| 380 | [2] Thomas Boutell, "PNG (Portable Network Graphics) specification", |
| 381 | available in ftp://ftp.uu.net/graphics/png/documents/ |
| 382 | |
| 383 | [3] Deutsch, L.P., "DEFLATE Compressed Data Format Specification", |
| 384 | available in ftp://ftp.uu.net/pub/archiving/zip/doc/ |
| 385 | |
| 386 | [4] Fletcher, J. G., "An Arithmetic Checksum for Serial |
| 387 | Transmissions," IEEE Transactions on Communications, Vol. COM-30, |
| 388 | No. 1, January 1982, pp. 247-252. |
| 389 | |
| 390 | [5] ITU-T Recommendation X.224, Annex D, "Checksum Algorithms," |
| 391 | November, 1993, pp. 144, 145. (Available from |
| 392 | gopher://info.itu.ch). ITU-T X.224 is also the same as ISO 8073. |
| 405 | |
| 406 | 4. Source code |
| 407 | |
| 408 | Source code for a C language implementation of a "zlib" compliant |
| 409 | library is available at ftp://ftp.uu.net/pub/archiving/zip/zlib/. |
| 410 | |
| 411 | 5. Security Considerations |
| 412 | |
| 413 | A decoder that fails to check the ADLER32 checksum value may be |
| 414 | subject to undetected data corruption. |
| 415 | |
| 416 | 6. Acknowledgements |
| 417 | |
| 418 | Trademarks cited in this document are the property of their |
| 419 | respective owners. |
| 420 | |
| 421 | Jean-Loup Gailly and Mark Adler designed the zlib format and wrote |
| 422 | the related software described in this specification. Glenn |
| 423 | Randers-Pehrson converted this document to RFC and HTML format. |
| 424 | |
| 425 | 7. Authors' Addresses |
| 426 | |
| 427 | L. Peter Deutsch |
| 428 | Aladdin Enterprises |
| 429 | 203 Santa Margarita Ave. |
| 430 | Menlo Park, CA 94025 |
| 431 | |
| 432 | Phone: (415) 322-0103 (AM only) |
| 433 | FAX: (415) 322-1734 |
| 434 | EMail: <[email protected]> |
| 435 | |
| 436 | |
| 437 | Jean-Loup Gailly |
| 438 | |
| 439 | EMail: <[email protected]> |
| 440 | |
| 441 | Questions about the technical content of this specification can be |
| 442 | sent by email to |
| 443 | |
| 444 | Jean-Loup Gailly <[email protected]> and |
| 445 | Mark Adler <[email protected]> |
| 446 | |
| 447 | Editorial comments on this specification can be sent by email to |
| 448 | |
| 449 | L. Peter Deutsch <[email protected]> and |
| 450 | Glenn Randers-Pehrson <[email protected]> |
| 462 | |
| 463 | 8. Appendix: Rationale |
| 464 | |
| 465 | 8.1. Preset dictionaries |
| 466 | |
| 467 | A preset dictionary is especially useful to compress short input |
| 468 | sequences. The compressor can take advantage of the dictionary |
| 469 | context to encode the input in a more compact manner. The |
| 470 | decompressor can be initialized with the appropriate context by |
| 471 | virtually decompressing a compressed version of the dictionary |
| 472 | without producing any output. However, for certain compression |
| 473 | algorithms such as the deflate algorithm, this operation can be |
| 474 | achieved without actually performing any decompression. |
| 475 | |
| 476 | The compressor and the decompressor must use exactly the same |
| 477 | dictionary. The dictionary may be fixed or may be chosen among a |
| 478 | certain number of predefined dictionaries, according to the kind |
| 479 | of input data. The decompressor can determine which dictionary has |
| 480 | been chosen by the compressor by checking the dictionary |
| 481 | identifier. This document does not specify the contents of |
| 482 | predefined dictionaries, since the optimal dictionaries are |
| 483 | application specific. Standard data formats using this feature of |
| 484 | the zlib specification must precisely define the allowed |
| 485 | dictionaries. |
| 486 | |
| 487 | 8.2. The Adler-32 algorithm |
| 488 | |
| 489 | The Adler-32 algorithm is much faster than the CRC32 algorithm yet |
| 490 | still provides an extremely low probability of undetected errors. |
| 491 | |
| 492 | The modulo on unsigned long accumulators can be delayed for 5552 |
| 493 | bytes, so the modulo operation time is negligible. If the bytes |
| 494 | are a, b, c, the second sum is 3a + 2b + c + 3, and so is position |
| 495 | and order sensitive, unlike the first sum, which is just a |
| 496 | checksum. That 65521 is prime is important to avoid a possible |
| 497 | large class of two-byte errors that leave the check unchanged. |
| 498 | (The Fletcher checksum uses 255, which is not prime and which also |
| 499 | makes the Fletcher check insensitive to single byte changes 0 <-> |
| 500 | 255.) |
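A minimal sketch of the delayed modulo, assuming `unsigned long` is at least 32 bits wide (which ISO C guarantees): the sums are accumulated without reduction for up to 5552 bytes and reduced once per chunk. The name `adler32_deferred` and the macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ADLER_BASE 65521UL   /* largest prime smaller than 65536 */
#define ADLER_NMAX 5552u     /* bytes that can be summed before reducing */

/* Update a running Adler-32 value, reducing modulo ADLER_BASE only once
   per chunk of at most ADLER_NMAX bytes. */
unsigned long adler32_deferred(unsigned long adler,
                               const unsigned char *buf, size_t len)
{
    unsigned long s1 = adler & 0xffff;
    unsigned long s2 = (adler >> 16) & 0xffff;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        size_t chunk = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= chunk;
        while (chunk-- > 0) {
            s1 += *buf++;        /* no modulo inside the inner loop */
            s2 += s1;
        }
        s1 %= ADLER_BASE;
        s2 %= ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

For the three bytes "abc" (97, 98, 99) the sums are s1 = 295 and s2 = 3*97 + 2*98 + 99 + 3 = 589, giving the checksum 0x024D0127.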
| 501 | |
| 502 | The sum s1 is initialized to 1 instead of zero to make the length |
| 503 | of the sequence part of s2, so that the length does not have to be |
| 504 | checked separately. (Any sequence of zeroes has a Fletcher |
| 505 | checksum of zero.) |
| 519 | |
| 520 | 9. Appendix: Sample code |
| 521 | |
| 522 | The following C code computes the Adler-32 checksum of a data buffer. |
| 523 | It is written for clarity, not for speed. The sample code is in the |
| 524 | ANSI C programming language. Non-C users may find it easier to read |
| 525 | with these hints: |
| 526 | |
| 527 | & Bitwise AND operator. |
| 528 | >> Bitwise right shift operator. When applied to an |
| 529 | unsigned quantity, as here, right shift inserts zero bit(s) |
| 530 | at the left. |
| 531 | << Bitwise left shift operator. Left shift inserts zero |
| 532 | bit(s) at the right. |
| 533 | ++ "n++" increments the variable n. |
| 534 | % modulo operator: a % b is the remainder of a divided by b. |
| 535 | |
| 536 | #define BASE 65521 /* largest prime smaller than 65536 */ |
| 537 | |
| 538 | /* |
| 539 | Update a running Adler-32 checksum with the bytes buf[0..len-1] |
| 540 | and return the updated checksum. The Adler-32 checksum should be |
| 541 | initialized to 1. |
| 542 | |
| 543 | Usage example: |
| 544 | |
| 545 | unsigned long adler = 1L; |
| 546 | |
| 547 | while (read_buffer(buffer, length) != EOF) { |
| 548 | adler = update_adler32(adler, buffer, length); |
| 549 | } |
| 550 | if (adler != original_adler) error(); |
| 551 | */ |
| 552 | unsigned long update_adler32(unsigned long adler, |
| 553 | unsigned char *buf, int len) |
| 554 | { |
| 555 | unsigned long s1 = adler & 0xffff; |
| 556 | unsigned long s2 = (adler >> 16) & 0xffff; |
| 557 | int n; |
| 558 | |
| 559 | for (n = 0; n < len; n++) { |
| 560 | s1 = (s1 + buf[n]) % BASE; |
| 561 | s2 = (s2 + s1) % BASE; |
| 562 | } |
| 563 | return (s2 << 16) + s1; |
| 564 | } |
| 565 | |
| 566 | /* Return the adler32 of the bytes buf[0..len-1] */ |
| 577 | unsigned long adler32(unsigned char *buf, int len) |
| 578 | { |
| 579 | return update_adler32(1L, buf, len); |
| 580 | } |
D
compat/zlib/doc/rfc1951.txt
-972
| --- a/compat/zlib/doc/rfc1951.txt | ||
| +++ b/compat/zlib/doc/rfc1951.txt | ||
| @@ -1,972 +0,0 @@ | ||
| 1 | - | |
| 2 | - | |
| 3 | - | |
| 4 | - | |
| 5 | - | |
| 6 | - | |
| 7 | -Network Working Group P. Deutsch | |
| 8 | -Request for Comments: 1951 Aladdin Enterprises | |
| 9 | -Category: Informational May 1996 | |
| 10 | - | |
| 11 | - | |
| 12 | - DEFLATE Compressed Data Format Specification version 1.3 | |
| 13 | - | |
| 14 | -Status of This Memo | |
| 15 | - | |
| 16 | - This memo provides information for the Internet community. This memo | |
| 17 | - does not specify an Internet standard of any kind. Distribution of | |
| 18 | - this memo is unlimited. | |
| 19 | - | |
| 20 | -IESG Note: | |
| 21 | - | |
| 22 | - The IESG takes no position on the validity of any Intellectual | |
| 23 | - Property Rights statements contained in this document. | |
| 24 | - | |
| 25 | -Notices | |
| 26 | - | |
| 27 | - Copyright (c) 1996 L. Peter Deutsch | |
| 28 | - | |
| 29 | - Permission is granted to copy and distribute this document for any | |
| 30 | - purpose and without charge, including translations into other | |
| 31 | - languages and incorporation into compilations, provided that the | |
| 32 | - copyright notice and this notice are preserved, and that any | |
| 33 | - substantive changes or deletions from the original are clearly | |
| 34 | - marked. | |
| 35 | - | |
| 36 | - A pointer to the latest version of this and related documentation in | |
| 37 | - HTML format can be found at the URL | |
| 38 | - <ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html>. | |
| 39 | - | |
| 40 | -Abstract | |
| 41 | - | |
| 42 | - This specification defines a lossless compressed data format that | |
| 43 | - compresses data using a combination of the LZ77 algorithm and Huffman | |
| 44 | - coding, with efficiency comparable to the best currently available | |
| 45 | - general-purpose compression methods. The data can be produced or | |
| 46 | - consumed, even for an arbitrarily long sequentially presented input | |
| 47 | - data stream, using only an a priori bounded amount of intermediate | |
| 48 | - storage. The format can be implemented readily in a manner not | |
| 49 | - covered by patents. | |
| 50 | - | |
| 51 | - | |
| 52 | - | |
| 53 | - | |
| 54 | - | |
| 55 | - | |
| 56 | - | |
| 57 | - | |
| 58 | -Deutsch Informational [Page 1] | |
| 59 | - | |
| 60 | - | |
| 61 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 62 | - | |
| 63 | - | |
| 64 | -Table of Contents | |
| 65 | - | |
| 66 | - 1. Introduction ................................................... 2 | |
| 67 | - 1.1. Purpose ................................................... 2 | |
| 68 | - 1.2. Intended audience ......................................... 3 | |
| 69 | - 1.3. Scope ..................................................... 3 | |
| 70 | - 1.4. Compliance ................................................ 3 | |
| 71 | - 1.5. Definitions of terms and conventions used ................ 3 | |
| 72 | - 1.6. Changes from previous versions ............................ 4 | |
| 73 | - 2. Compressed representation overview ............................. 4 | |
| 74 | - 3. Detailed specification ......................................... 5 | |
| 75 | - 3.1. Overall conventions ....................................... 5 | |
| 76 | - 3.1.1. Packing into bytes .................................. 5 | |
| 77 | - 3.2. Compressed block format ................................... 6 | |
| 78 | - 3.2.1. Synopsis of prefix and Huffman coding ............... 6 | |
| 79 | - 3.2.2. Use of Huffman coding in the "deflate" format ....... 7 | |
| 80 | - 3.2.3. Details of block format ............................. 9 | |
| 81 | - 3.2.4. Non-compressed blocks (BTYPE=00) ................... 11 | |
| 82 | - 3.2.5. Compressed blocks (length and distance codes) ...... 11 | |
| 83 | - 3.2.6. Compression with fixed Huffman codes (BTYPE=01) .... 12 | |
| 84 | - 3.2.7. Compression with dynamic Huffman codes (BTYPE=10) .. 13 | |
| 85 | - 3.3. Compliance ............................................... 14 | |
| 86 | - 4. Compression algorithm details ................................. 14 | |
| 87 | - 5. References .................................................... 16 | |
| 88 | - 6. Security Considerations ....................................... 16 | |
| 89 | - 7. Source code ................................................... 16 | |
| 90 | - 8. Acknowledgements .............................................. 16 | |
| 91 | - 9. Author's Address .............................................. 17 | |
| 92 | - | |
| 93 | -1. Introduction | |
| 94 | - | |
| 95 | - 1.1. Purpose | |
| 96 | - | |
| 97 | - The purpose of this specification is to define a lossless | |
| 98 | - compressed data format that: | |
| 99 | - * Is independent of CPU type, operating system, file system, | |
| 100 | - and character set, and hence can be used for interchange; | |
| 101 | - * Can be produced or consumed, even for an arbitrarily long | |
| 102 | - sequentially presented input data stream, using only an a | |
| 103 | - priori bounded amount of intermediate storage, and hence | |
| 104 | - can be used in data communications or similar structures | |
| 105 | - such as Unix filters; | |
| 106 | - * Compresses data with efficiency comparable to the best | |
| 107 | - currently available general-purpose compression methods, | |
| 108 | - and in particular considerably better than the "compress" | |
| 109 | - program; | |
| 110 | - * Can be implemented readily in a manner not covered by | |
| 111 | - patents, and hence can be practiced freely; | |
| 121 | - * Is compatible with the file format produced by the current | |
| 122 | - widely used gzip utility, in that conforming decompressors | |
| 123 | - will be able to read data produced by the existing gzip | |
| 124 | - compressor. | |
| 125 | - | |
| 126 | - The data format defined by this specification does not attempt to: | |
| 127 | - | |
| 128 | - * Allow random access to compressed data; | |
| 129 | - * Compress specialized data (e.g., raster graphics) as well | |
| 130 | - as the best currently available specialized algorithms. | |
| 131 | - | |
| 132 | - A simple counting argument shows that no lossless compression | |
| 133 | - algorithm can compress every possible input data set. For the | |
| 134 | - format defined here, the worst case expansion is 5 bytes per 32K- | |
| 135 | - byte block, i.e., a size increase of 0.015% for large data sets. | |
| 136 | - English text usually compresses by a factor of 2.5 to 3; | |
| 137 | - executable files usually compress somewhat less; graphical data | |
| 138 | - such as raster images may compress much more. | |
| 139 | - | |
| 140 | - 1.2. Intended audience | |
| 141 | - | |
| 142 | - This specification is intended for use by implementors of software | |
| 143 | - to compress data into "deflate" format and/or decompress data from | |
| 144 | - "deflate" format. | |
| 145 | - | |
| 146 | - The text of the specification assumes a basic background in | |
| 147 | - programming at the level of bits and other primitive data | |
| 148 | - representations. Familiarity with the technique of Huffman coding | |
| 149 | - is helpful but not required. | |
| 150 | - | |
| 151 | - 1.3. Scope | |
| 152 | - | |
| 153 | - The specification specifies a method for representing a sequence | |
| 154 | - of bytes as a (usually shorter) sequence of bits, and a method for | |
| 155 | - packing the latter bit sequence into bytes. | |
| 156 | - | |
| 157 | - 1.4. Compliance | |
| 158 | - | |
| 159 | - Unless otherwise indicated below, a compliant decompressor must be | |
| 160 | - able to accept and decompress any data set that conforms to all | |
| 161 | - the specifications presented here; a compliant compressor must | |
| 162 | - produce data sets that conform to all the specifications presented | |
| 163 | - here. | |
| 164 | - | |
| 165 | - 1.5. Definitions of terms and conventions used | |
| 166 | - | |
| 167 | - Byte: 8 bits stored or transmitted as a unit (same as an octet). | |
| 168 | - For this specification, a byte is exactly 8 bits, even on machines | |
| 178 | - which store a character on a number of bits different from eight. | |
| 179 | - See below, for the numbering of bits within a byte. | |
| 180 | - | |
| 181 | - String: a sequence of arbitrary bytes. | |
| 182 | - | |
| 183 | - 1.6. Changes from previous versions | |
| 184 | - | |
| 185 | - There have been no technical changes to the deflate format since | |
| 186 | - version 1.1 of this specification. In version 1.2, some | |
| 187 | - terminology was changed. Version 1.3 is a conversion of the | |
| 188 | - specification to RFC style. | |
| 189 | - | |
| 190 | -2. Compressed representation overview | |
| 191 | - | |
| 192 | - A compressed data set consists of a series of blocks, corresponding | |
| 193 | - to successive blocks of input data. The block sizes are arbitrary, | |
| 194 | - except that non-compressible blocks are limited to 65,535 bytes. | |
| 195 | - | |
| 196 | - Each block is compressed using a combination of the LZ77 algorithm | |
| 197 | - and Huffman coding. The Huffman trees for each block are independent | |
| 198 | - of those for previous or subsequent blocks; the LZ77 algorithm may | |
| 199 | - use a reference to a duplicated string occurring in a previous block, | |
| 200 | - up to 32K input bytes before. | |
| 201 | - | |
| 202 | - Each block consists of two parts: a pair of Huffman code trees that | |
| 203 | - describe the representation of the compressed data part, and a | |
| 204 | - compressed data part. (The Huffman trees themselves are compressed | |
| 205 | - using Huffman encoding.) The compressed data consists of a series of | |
| 206 | - elements of two types: literal bytes (of strings that have not been | |
| 207 | - detected as duplicated within the previous 32K input bytes), and | |
| 208 | - pointers to duplicated strings, where a pointer is represented as a | |
| 209 | - pair <length, backward distance>. The representation used in the | |
| 210 | - "deflate" format limits distances to 32K bytes and lengths to 258 | |
| 211 | - bytes, but does not limit the size of a block, except for | |
| 212 | - uncompressible blocks, which are limited as noted above. | |
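To make the <length, backward distance> pointer concrete, here is a small C sketch of how a decoder might expand one pointer into its output buffer. The function name `lz77_copy` and its parameters are hypothetical; the limits of 258 bytes and 32K bytes are the ones stated above. Copying byte by byte means a pointer whose length exceeds its distance simply repeats the most recent bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: append `length` bytes to out[], copied from
   `distance` bytes before the current end of output (*pos).
   Returns 0 on success, -1 if the pointer is invalid or out[] is full. */
int lz77_copy(unsigned char *out, size_t cap, size_t *pos,
              size_t length, size_t distance)
{
    size_t i;

    assert(out != NULL && pos != NULL && *pos <= cap);
    if (length > 258 || distance == 0 || distance > 32768)
        return -1;                       /* outside the format's limits */
    if (distance > *pos || length > cap - *pos)
        return -1;                       /* points before start, or no room */
    for (i = 0; i < length; i++) {
        out[*pos] = out[*pos - distance];  /* source may overlap destination */
        (*pos)++;
    }
    return 0;
}
```

Starting from the two output bytes "ab", a pointer with length 4 and distance 2 produces "ababab".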
| 213 | - | |
| 214 | - Each type of value (literals, distances, and lengths) in the | |
| 215 | - compressed data is represented using a Huffman code, using one code | |
| 216 | - tree for literals and lengths and a separate code tree for distances. | |
| 217 | - The code trees for each block appear in a compact form just before | |
| 218 | - the compressed data for that block. | |
| 234 | - | |
| 235 | -3. Detailed specification | |
| 236 | - | |
| 237 | - 3.1. Overall conventions | |
| 238 | - In the diagrams below, a box like this: | |
| 239 | - +---+ | |
| 240 | - | | <-- the vertical bars might be missing | |
| 241 | - +---+ | |
| 242 | - | |
| 243 | - represents one byte; a box like this: | |
| 244 | - | |
| 245 | - +==============+ | |
| 246 | - | | | |
| 247 | - +==============+ | |
| 248 | - | |
| 249 | - represents a variable number of bytes. | |
| 250 | - | |
| 251 | - Bytes stored within a computer do not have a "bit order", since | |
| 252 | - they are always treated as a unit. However, a byte considered as | |
| 253 | - an integer between 0 and 255 does have a most- and least- | |
| 254 | - significant bit, and since we write numbers with the most- | |
| 255 | - significant digit on the left, we also write bytes with the most- | |
| 256 | - significant bit on the left. In the diagrams below, we number the | |
| 257 | - bits of a byte so that bit 0 is the least-significant bit, i.e., | |
| 258 | - the bits are numbered: | |
| 259 | - | |
| 260 | - +--------+ | |
| 261 | - |76543210| | |
| 262 | - +--------+ | |
| 263 | - | |
| 264 | - Within a computer, a number may occupy multiple bytes. All | |
| 265 | - multi-byte numbers in the format described here are stored with | |
| 266 | - the least-significant byte first (at the lower memory address). | |
| 267 | - For example, the decimal number 520 is stored as: | |
| 268 | - | |
| 269 | - 0 1 | |
| 270 | - +--------+--------+ | |
| 271 | - |00001000|00000010| | |
| 272 | - +--------+--------+ | |
| 273 | - ^ ^ | |
| 274 | - | | | |
| 275 | - | + more significant byte = 2 x 256 | |
| 276 | - + less significant byte = 8 | |
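The byte order described above can be illustrated with a pair of small C helpers. The names `read_le16` and `write_le16` are hypothetical and are used here only as a sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read a 16-bit number stored least-significant
   byte first. */
unsigned read_le16(const unsigned char *p)
{
    assert(p != NULL);
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Hypothetical helper: store a 16-bit number least-significant byte first. */
void write_le16(unsigned v, unsigned char *p)
{
    assert(p != NULL && v <= 0xffffu);
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)(v >> 8);
}
```

Applied to the example, the bytes 0x08, 0x02 read back as 8 + 2 * 256 = 520.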
| 277 | - | |
| 278 | - 3.1.1. Packing into bytes | |
| 279 | - | |
| 280 | - This document does not address the issue of the order in which | |
| 281 | - bits of a byte are transmitted on a bit-sequential medium, | |
| 282 | - since the final data format described here is byte- rather than | |
| 292 | - bit-oriented. However, we describe the compressed block format | |
| 293 | - below as a sequence of data elements of various bit | |
| 294 | - lengths, not a sequence of bytes. We must therefore specify | |
| 295 | - how to pack these data elements into bytes to form the final | |
| 296 | - compressed byte sequence: | |
| 297 | - | |
| 298 | - * Data elements are packed into bytes in order of | |
| 299 | - increasing bit number within the byte, i.e., starting | |
| 300 | - with the least-significant bit of the byte. | |
| 301 | - * Data elements other than Huffman codes are packed | |
| 302 | - starting with the least-significant bit of the data | |
| 303 | - element. | |
| 304 | - * Huffman codes are packed starting with the most- | |
| 305 | - significant bit of the code. | |
| 306 | - | |
| 307 | - In other words, if one were to print out the compressed data as | |
| 308 | - a sequence of bytes, starting with the first byte at the | |
| 309 | - *right* margin and proceeding to the *left*, with the most- | |
| 310 | - significant bit of each byte on the left as usual, one would be | |
| 311 | - able to parse the result from right to left, with fixed-width | |
| 312 | - elements in the correct MSB-to-LSB order and Huffman codes in | |
| 313 | - bit-reversed order (i.e., with the first bit of the code in the | |
| 314 | - relative LSB position). | |
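The packing rules above might be implemented by a bit reader along the following lines. This is a sketch under stated assumptions: the struct and the functions `br_bit`, `br_field`, and `br_code_bits` are hypothetical names, and fields are limited to 16 bits for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit reader over a byte buffer. */
struct bit_reader {
    const unsigned char *data;
    size_t len;   /* buffer length in bytes */
    size_t pos;   /* index of the next bit to read */
};

/* Next bit, taken from the least-significant end of each byte; -1 at end. */
int br_bit(struct bit_reader *br)
{
    size_t byte = br->pos >> 3;
    int bit;

    if (byte >= br->len)
        return -1;
    bit = (br->data[byte] >> (br->pos & 7)) & 1;
    br->pos++;
    return bit;
}

/* Fixed-width element: the first bit read is its least-significant bit. */
long br_field(struct bit_reader *br, int nbits)
{
    long v = 0;
    int i, b;

    assert(nbits >= 0 && nbits <= 16);
    for (i = 0; i < nbits; i++) {
        if ((b = br_bit(br)) < 0)
            return -1;
        v |= (long)b << i;
    }
    return v;
}

/* Huffman code bits: the first bit read is the code's most-significant bit. */
long br_code_bits(struct bit_reader *br, int nbits)
{
    long v = 0;
    int i, b;

    assert(nbits >= 0 && nbits <= 16);
    for (i = 0; i < nbits; i++) {
        if ((b = br_bit(br)) < 0)
            return -1;
        v = (v << 1) | b;
    }
    return v;
}
```

The two read functions differ only in where each new bit lands, which is exactly the distinction the rules draw between Huffman codes and all other data elements.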
| 315 | - | |
| 316 | - 3.2. Compressed block format | |
| 317 | - | |
| 318 | - 3.2.1. Synopsis of prefix and Huffman coding | |
| 319 | - | |
| 320 | - Prefix coding represents symbols from an a priori known | |
| 321 | - alphabet by bit sequences (codes), one code for each symbol, in | |
| 322 | - a manner such that different symbols may be represented by bit | |
| 323 | - sequences of different lengths, but a parser can always parse | |
| 324 | - an encoded string unambiguously symbol-by-symbol. | |
| 325 | - | |
| 326 | - We define a prefix code in terms of a binary tree in which the | |
| 327 | - two edges descending from each non-leaf node are labeled 0 and | |
| 328 | - 1 and in which the leaf nodes correspond one-for-one with (are | |
| 329 | - labeled with) the symbols of the alphabet; then the code for a | |
| 330 | - symbol is the sequence of 0's and 1's on the edges leading from | |
| 331 | - the root to the leaf labeled with that symbol. For example: | |
| 332 | - | |
| 349 | - /\ Symbol Code | |
| 350 | - 0 1 ------ ---- | |
| 351 | - / \ A 00 | |
| 352 | - /\ B B 1 | |
| 353 | - 0 1 C 011 | |
| 354 | - / \ D 010 | |
| 355 | - A /\ | |
| 356 | - 0 1 | |
| 357 | - / \ | |
| 358 | - D C | |
| 359 | - | |
| 360 | - A parser can decode the next symbol from an encoded input | |
| 361 | - stream by walking down the tree from the root, at each step | |
| 362 | - choosing the edge corresponding to the next input bit. | |
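The tree walk just described can be sketched in C using the example code above (A = 00, B = 1, C = 011, D = 010). The node layout, the table `example_tree`, and the function `prefix_decode` are hypothetical; for readability the input bits are given as a string of '0' and '1' characters rather than packed bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node: child[0]/child[1] are node indices,
   or -1 in a leaf, which carries a symbol instead. */
struct pnode { int child[2]; char symbol; };

static const struct pnode example_tree[] = {
    /* 0: root  */ { {  1,  2 }, 0   },
    /* 1: "0"   */ { {  3,  4 }, 0   },
    /* 2: "1"   */ { { -1, -1 }, 'B' },
    /* 3: "00"  */ { { -1, -1 }, 'A' },
    /* 4: "01"  */ { {  5,  6 }, 0   },
    /* 5: "010" */ { { -1, -1 }, 'D' },
    /* 6: "011" */ { { -1, -1 }, 'C' }
};

/* Decode a string of '0'/'1' characters into out[] (NUL-terminated).
   Returns the number of symbols, or -1 on bad input, a truncated code,
   or insufficient output space. Node 0 must be a non-leaf root. */
int prefix_decode(const struct pnode *tree, const char *bits,
                  char *out, size_t outsize)
{
    size_t n = 0;
    int node = 0;

    assert(tree != NULL && bits != NULL && out != NULL && outsize > 0);
    for (; *bits != '\0'; bits++) {
        if (*bits != '0' && *bits != '1')
            return -1;
        node = tree[node].child[*bits - '0'];   /* follow the labeled edge */
        if (tree[node].child[0] < 0) {          /* reached a leaf */
            if (n + 1 >= outsize)
                return -1;
            out[n++] = tree[node].symbol;
            node = 0;                           /* restart at the root */
        }
    }
    out[n] = '\0';
    return node == 0 ? (int)n : -1;
}
```

Decoding the bit string 001011010 with this table walks 00, 1, 011, 010 and yields the symbols A, B, C, D.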
| 363 | - | |
| 364 | - Given an alphabet with known symbol frequencies, the Huffman | |
| 365 | - algorithm allows the construction of an optimal prefix code | |
| 366 | - (one which represents strings with those symbol frequencies | |
| 367 | - using the fewest bits of any possible prefix codes for that | |
| 368 | - alphabet). Such a code is called a Huffman code. (See | |
| 369 | - reference [1] in Chapter 5, references for additional | |
| 370 | - information on Huffman codes.) | |
| 371 | - | |
| 372 | - Note that in the "deflate" format, the Huffman codes for the | |
| 373 | - various alphabets must not exceed certain maximum code lengths. | |
| 374 | - This constraint complicates the algorithm for computing code | |
| 375 | - lengths from symbol frequencies. Again, see Chapter 5, | |
| 376 | - References, for details. | |
| 377 | - | |
| 378 | - 3.2.2. Use of Huffman coding in the "deflate" format | |
| 379 | - | |
| 380 | - The Huffman codes used for each alphabet in the "deflate" | |
| 381 | - format have two additional rules: | |
| 382 | - | |
| 383 | - * All codes of a given bit length have lexicographically | |
| 384 | - consecutive values, in the same order as the symbols | |
| 385 | - they represent; | |
| 386 | - | |
| 387 | - * Shorter codes lexicographically precede longer codes. | |
| 405 | - | |
| 406 | - We could recode the example above to follow these rules as | |
| 407 | - follows, assuming that the order of the alphabet is ABCD: | |
| 408 | - | |
| 409 | - Symbol Code | |
| 410 | - ------ ---- | |
| 411 | - A 10 | |
| 412 | - B 0 | |
| 413 | - C 110 | |
| 414 | - D 111 | |
| 415 | - | |
| 416 | - I.e., 0 precedes 10 which precedes 11x, and 110 and 111 are | |
| 417 | - lexicographically consecutive. | |
| 418 | - | |
| 419 | - Given these rules, we can define the Huffman code for an alphabet | |
| 420 | - just by giving the bit lengths of the codes for each symbol of | |
| 421 | - the alphabet in order; this is sufficient to determine the | |
| 422 | - actual codes. In our example, the code is completely defined | |
| 423 | - by the sequence of bit lengths (2, 1, 3, 3). The following | |
| 424 | - algorithm generates the codes as integers, intended to be read | |
| 425 | - from most- to least-significant bit. The code lengths are | |
| 426 | - initially in tree[I].Len; the codes are produced in | |
| 427 | - tree[I].Code. | |
| 428 | - | |
| 429 | - 1) Count the number of codes for each code length. Let | |
| 430 | - bl_count[N] be the number of codes of length N, N >= 1. | |
| 431 | - | |
| 432 | - 2) Find the numerical value of the smallest code for each | |
| 433 | - code length: | |
| 434 | - | |
| 435 | - code = 0; | |
| 436 | - bl_count[0] = 0; | |
| 437 | - for (bits = 1; bits <= MAX_BITS; bits++) { | |
| 438 | - code = (code + bl_count[bits-1]) << 1; | |
| 439 | - next_code[bits] = code; | |
| 440 | - } | |
| 441 | - | |
| 442 | - 3) Assign numerical values to all codes, using consecutive | |
| 443 | - values for all codes of the same length with the base | |
| 444 | - values determined at step 2. Codes that are never used | |
| 445 | - (which have a bit length of zero) must not be assigned a | |
| 446 | - value. | |
| 447 | - | |
| 448 | - for (n = 0; n <= max_code; n++) { | |
| 449 | - len = tree[n].Len; | |
| 450 | - if (len != 0) { | |
| 451 | - tree[n].Code = next_code[len]; | |
| 452 | - next_code[len]++; | |
| 453 | - } | |
| 463 | - } | |
| 464 | - | |
| 465 | - Example: | |
| 466 | - | |
| 467 | - Consider the alphabet ABCDEFGH, with bit lengths (3, 3, 3, 3, | |
| 468 | - 3, 2, 4, 4). After step 1, we have: | |
| 469 | - | |
| 470 | - N bl_count[N] | |
| 471 | - - ----------- | |
| 472 | - 2 1 | |
| 473 | - 3 5 | |
| 474 | - 4 2 | |
| 475 | - | |
| 476 | - Step 2 computes the following next_code values: | |
| 477 | - | |
| 478 | - N next_code[N] | |
| 479 | - - ------------ | |
| 480 | - 1 0 | |
| 481 | - 2 0 | |
| 482 | - 3 2 | |
| 483 | - 4 14 | |
| 484 | - | |
| 485 | - Step 3 produces the following code values: | |
| 486 | - | |
| 487 | - Symbol Length Code | |
| 488 | - ------ ------ ---- | |
| 489 | - A 3 010 | |
| 490 | - B 3 011 | |
| 491 | - C 3 100 | |
| 492 | - D 3 101 | |
| 493 | - E 3 110 | |
| 494 | - F 2 00 | |
| 495 | - G 4 1110 | |
| 496 | - H 4 1111 | |
| 497 | - | |
| 498 | - 3.2.3. Details of block format | |
| 499 | - | |
| 500 | - Each block of compressed data begins with 3 header bits | |
| 501 | - containing the following data: | |
| 502 | - | |
| 503 | - first bit BFINAL | |
| 504 | - next 2 bits BTYPE | |
| 505 | - | |
| 506 | - Note that the header bits do not necessarily begin on a byte | |
| 507 | - boundary, since a block does not necessarily occupy an integral | |
| 508 | - number of bytes. | |
| 509 | - | |
| 510 | - | |
| 511 | - | |
| 512 | - | |
| 513 | - | |
| 514 | -Deutsch Informational [Page 9] | |
| 515 | - | |
| 516 | - | |
| 517 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 518 | - | |
| 519 | - | |
| 520 | - BFINAL is set if and only if this is the last block of the data | |
| 521 | - set. | |
| 522 | - | |
| 523 | - BTYPE specifies how the data are compressed, as follows: | |
| 524 | - | |
| 525 | - 00 - no compression | |
| 526 | - 01 - compressed with fixed Huffman codes | |
| 527 | - 10 - compressed with dynamic Huffman codes | |
| 528 | - 11 - reserved (error) | |
| 529 | - | |
| 530 | - The only difference between the two compressed cases is how the | |
| 531 | - Huffman codes for the literal/length and distance alphabets are | |
| 532 | - defined. | |
| 533 | - | |
| 534 | - In all cases, the decoding algorithm for the actual data is as | |
| 535 | - follows: | |
| 536 | - | |
| 537 | - do | |
| 538 | - read block header from input stream. | |
| 539 | - if stored with no compression | |
| 540 | - skip any remaining bits in current partially | |
| 541 | - processed byte | |
| 542 | - read LEN and NLEN (see next section) | |
| 543 | - copy LEN bytes of data to output | |
| 544 | - otherwise | |
| 545 | - if compressed with dynamic Huffman codes | |
| 546 | - read representation of code trees (see | |
| 547 | - subsection below) | |
| 548 | - loop (until end of block code recognized) | |
| 549 | - decode literal/length value from input stream | |
| 550 | - if value < 256 | |
| 551 | - copy value (literal byte) to output stream | |
| 552 | - otherwise | |
| 553 | - if value = end of block (256) | |
| 554 | - break from loop | |
| 555 | - otherwise (value = 257..285) | |
| 556 | - decode distance from input stream | |
| 557 | - | |
| 558 | - move backwards distance bytes in the output | |
| 559 | - stream, and copy length bytes from this | |
| 560 | - position to the output stream. | |
| 561 | - end loop | |
| 562 | - while not last block | |
| 563 | - | |
| 564 | - Note that a duplicated string reference may refer to a string | |
| 565 | - in a previous block; i.e., the backward distance may cross one | |
| 566 | - or more block boundaries. However a distance cannot refer past | |
| 567 | - the beginning of the output stream. (An application using a | |
| 568 | - | |
| 569 | - | |
| 570 | - | |
| 571 | -Deutsch Informational [Page 10] | |
| 572 | - | |
| 573 | - | |
| 574 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 575 | - | |
| 576 | - | |
| 577 | - preset dictionary might discard part of the output stream; a | |
| 578 | - distance can refer to that part of the output stream anyway) | |
| 579 | - Note also that the referenced string may overlap the current | |
| 580 | - position; for example, if the last 2 bytes decoded have values | |
| 581 | - X and Y, a string reference with <length = 5, distance = 2> | |
| 582 | - adds X,Y,X,Y,X to the output stream. | |
| 583 | - | |
| 584 | - We now specify each compression method in turn. | |
| 585 | - | |
| 586 | - 3.2.4. Non-compressed blocks (BTYPE=00) | |
| 587 | - | |
| 588 | - Any bits of input up to the next byte boundary are ignored. | |
| 589 | - The rest of the block consists of the following information: | |
| 590 | - | |
| 591 | - 0 1 2 3 4... | |
| 592 | - +---+---+---+---+================================+ | |
| 593 | - | LEN | NLEN |... LEN bytes of literal data...| | |
| 594 | - +---+---+---+---+================================+ | |
| 595 | - | |
| 596 | - LEN is the number of data bytes in the block. NLEN is the | |
| 597 | - one's complement of LEN. | |
| 598 | - | |
| 599 | - 3.2.5. Compressed blocks (length and distance codes) | |
| 600 | - | |
| 601 | - As noted above, encoded data blocks in the "deflate" format | |
| 602 | - consist of sequences of symbols drawn from three conceptually | |
| 603 | - distinct alphabets: either literal bytes, from the alphabet of | |
| 604 | - byte values (0..255), or <length, backward distance> pairs, | |
| 605 | - where the length is drawn from (3..258) and the distance is | |
| 606 | - drawn from (1..32,768). In fact, the literal and length | |
| 607 | - alphabets are merged into a single alphabet (0..285), where | |
| 608 | - values 0..255 represent literal bytes, the value 256 indicates | |
| 609 | - end-of-block, and values 257..285 represent length codes | |
| 610 | - (possibly in conjunction with extra bits following the symbol | |
| 611 | - code) as follows: | |
| 612 | - | |
| 613 | - | |
| 614 | - | |
| 615 | - | |
| 616 | - | |
| 617 | - | |
| 618 | - | |
| 619 | - | |
| 620 | - | |
| 621 | - | |
| 622 | - | |
| 623 | - | |
| 624 | - | |
| 625 | - | |
| 626 | - | |
| 627 | - | |
| 628 | -Deutsch Informational [Page 11] | |
| 629 | - | |
| 630 | - | |
| 631 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 632 | - | |
| 633 | - | |
| 634 | - Extra Extra Extra | |
| 635 | - Code Bits Length(s) Code Bits Lengths Code Bits Length(s) | |
| 636 | - ---- ---- ------ ---- ---- ------- ---- ---- ------- | |
| 637 | - 257 0 3 267 1 15,16 277 4 67-82 | |
| 638 | - 258 0 4 268 1 17,18 278 4 83-98 | |
| 639 | - 259 0 5 269 2 19-22 279 4 99-114 | |
| 640 | - 260 0 6 270 2 23-26 280 4 115-130 | |
| 641 | - 261 0 7 271 2 27-30 281 5 131-162 | |
| 642 | - 262 0 8 272 2 31-34 282 5 163-194 | |
| 643 | - 263 0 9 273 3 35-42 283 5 195-226 | |
| 644 | - 264 0 10 274 3 43-50 284 5 227-257 | |
| 645 | - 265 1 11,12 275 3 51-58 285 0 258 | |
| 646 | - 266 1 13,14 276 3 59-66 | |
| 647 | - | |
| 648 | - The extra bits should be interpreted as a machine integer | |
| 649 | - stored with the most-significant bit first, e.g., bits 1110 | |
| 650 | - represent the value 14. | |
| 651 | - | |
| 652 | - Extra Extra Extra | |
| 653 | - Code Bits Dist Code Bits Dist Code Bits Distance | |
| 654 | - ---- ---- ---- ---- ---- ------ ---- ---- -------- | |
| 655 | - 0 0 1 10 4 33-48 20 9 1025-1536 | |
| 656 | - 1 0 2 11 4 49-64 21 9 1537-2048 | |
| 657 | - 2 0 3 12 5 65-96 22 10 2049-3072 | |
| 658 | - 3 0 4 13 5 97-128 23 10 3073-4096 | |
| 659 | - 4 1 5,6 14 6 129-192 24 11 4097-6144 | |
| 660 | - 5 1 7,8 15 6 193-256 25 11 6145-8192 | |
| 661 | - 6 2 9-12 16 7 257-384 26 12 8193-12288 | |
| 662 | - 7 2 13-16 17 7 385-512 27 12 12289-16384 | |
| 663 | - 8 3 17-24 18 8 513-768 28 13 16385-24576 | |
| 664 | - 9 3 25-32 19 8 769-1024 29 13 24577-32768 | |
| 665 | - | |
| 666 | - 3.2.6. Compression with fixed Huffman codes (BTYPE=01) | |
| 667 | - | |
| 668 | - The Huffman codes for the two alphabets are fixed, and are not | |
| 669 | - represented explicitly in the data. The Huffman code lengths | |
| 670 | - for the literal/length alphabet are: | |
| 671 | - | |
| 672 | - Lit Value Bits Codes | |
| 673 | - --------- ---- ----- | |
| 674 | - 0 - 143 8 00110000 through | |
| 675 | - 10111111 | |
| 676 | - 144 - 255 9 110010000 through | |
| 677 | - 111111111 | |
| 678 | - 256 - 279 7 0000000 through | |
| 679 | - 0010111 | |
| 680 | - 280 - 287 8 11000000 through | |
| 681 | - 11000111 | |
| 682 | - | |
| 683 | - | |
| 684 | - | |
| 685 | -Deutsch Informational [Page 12] | |
| 686 | - | |
| 687 | - | |
| 688 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 689 | - | |
| 690 | - | |
| 691 | - The code lengths are sufficient to generate the actual codes, | |
| 692 | - as described above; we show the codes in the table for added | |
| 693 | - clarity. Literal/length values 286-287 will never actually | |
| 694 | - occur in the compressed data, but participate in the code | |
| 695 | - construction. | |
| 696 | - | |
| 697 | - Distance codes 0-31 are represented by (fixed-length) 5-bit | |
| 698 | - codes, with possible additional bits as shown in the table | |
| 699 | - shown in Paragraph 3.2.5, above. Note that distance codes 30- | |
| 700 | - 31 will never actually occur in the compressed data. | |
| 701 | - | |
| 702 | - 3.2.7. Compression with dynamic Huffman codes (BTYPE=10) | |
| 703 | - | |
| 704 | - The Huffman codes for the two alphabets appear in the block | |
| 705 | - immediately after the header bits and before the actual | |
| 706 | - compressed data, first the literal/length code and then the | |
| 707 | - distance code. Each code is defined by a sequence of code | |
| 708 | - lengths, as discussed in Paragraph 3.2.2, above. For even | |
| 709 | - greater compactness, the code length sequences themselves are | |
| 710 | - compressed using a Huffman code. The alphabet for code lengths | |
| 711 | - is as follows: | |
| 712 | - | |
| 713 | - 0 - 15: Represent code lengths of 0 - 15 | |
| 714 | - 16: Copy the previous code length 3 - 6 times. | |
| 715 | - The next 2 bits indicate repeat length | |
| 716 | - (0 = 3, ... , 3 = 6) | |
| 717 | - Example: Codes 8, 16 (+2 bits 11), | |
| 718 | - 16 (+2 bits 10) will expand to | |
| 719 | - 12 code lengths of 8 (1 + 6 + 5) | |
| 720 | - 17: Repeat a code length of 0 for 3 - 10 times. | |
| 721 | - (3 bits of length) | |
| 722 | - 18: Repeat a code length of 0 for 11 - 138 times | |
| 723 | - (7 bits of length) | |
| 724 | - | |
| 725 | - A code length of 0 indicates that the corresponding symbol in | |
| 726 | - the literal/length or distance alphabet will not occur in the | |
| 727 | - block, and should not participate in the Huffman code | |
| 728 | - construction algorithm given earlier. If only one distance | |
| 729 | - code is used, it is encoded using one bit, not zero bits; in | |
| 730 | - this case there is a single code length of one, with one unused | |
| 731 | - code. One distance code of zero bits means that there are no | |
| 732 | - distance codes used at all (the data is all literals). | |
| 733 | - | |
| 734 | - We can now define the format of the block: | |
| 735 | - | |
| 736 | - 5 Bits: HLIT, # of Literal/Length codes - 257 (257 - 286) | |
| 737 | - 5 Bits: HDIST, # of Distance codes - 1 (1 - 32) | |
| 738 | - 4 Bits: HCLEN, # of Code Length codes - 4 (4 - 19) | |
| 739 | - | |
| 740 | - | |
| 741 | - | |
| 742 | -Deutsch Informational [Page 13] | |
| 743 | - | |
| 744 | - | |
| 745 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 746 | - | |
| 747 | - | |
| 748 | - (HCLEN + 4) x 3 bits: code lengths for the code length | |
| 749 | - alphabet given just above, in the order: 16, 17, 18, | |
| 750 | - 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15 | |
| 751 | - | |
| 752 | - These code lengths are interpreted as 3-bit integers | |
| 753 | - (0-7); as above, a code length of 0 means the | |
| 754 | - corresponding symbol (literal/length or distance code | |
| 755 | - length) is not used. | |
| 756 | - | |
| 757 | - HLIT + 257 code lengths for the literal/length alphabet, | |
| 758 | - encoded using the code length Huffman code | |
| 759 | - | |
| 760 | - HDIST + 1 code lengths for the distance alphabet, | |
| 761 | - encoded using the code length Huffman code | |
| 762 | - | |
| 763 | - The actual compressed data of the block, | |
| 764 | - encoded using the literal/length and distance Huffman | |
| 765 | - codes | |
| 766 | - | |
| 767 | - The literal/length symbol 256 (end of data), | |
| 768 | - encoded using the literal/length Huffman code | |
| 769 | - | |
| 770 | - The code length repeat codes can cross from HLIT + 257 to the | |
| 771 | - HDIST + 1 code lengths. In other words, all code lengths form | |
| 772 | - a single sequence of HLIT + HDIST + 258 values. | |
| 773 | - | |
| 774 | - 3.3. Compliance | |
| 775 | - | |
| 776 | - A compressor may limit further the ranges of values specified in | |
| 777 | - the previous section and still be compliant; for example, it may | |
| 778 | - limit the range of backward pointers to some value smaller than | |
| 779 | - 32K. Similarly, a compressor may limit the size of blocks so that | |
| 780 | - a compressible block fits in memory. | |
| 781 | - | |
| 782 | - A compliant decompressor must accept the full range of possible | |
| 783 | - values defined in the previous section, and must accept blocks of | |
| 784 | - arbitrary size. | |
| 785 | - | |
| 786 | -4. Compression algorithm details | |
| 787 | - | |
| 788 | - While it is the intent of this document to define the "deflate" | |
| 789 | - compressed data format without reference to any particular | |
| 790 | - compression algorithm, the format is related to the compressed | |
| 791 | - formats produced by LZ77 (Lempel-Ziv 1977, see reference [2] below); | |
| 792 | - since many variations of LZ77 are patented, it is strongly | |
| 793 | - recommended that the implementor of a compressor follow the general | |
| 794 | - algorithm presented here, which is known not to be patented per se. | |
| 795 | - The material in this section is not part of the definition of the | |
| 796 | - | |
| 797 | - | |
| 798 | - | |
| 799 | -Deutsch Informational [Page 14] | |
| 800 | - | |
| 801 | - | |
| 802 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 803 | - | |
| 804 | - | |
| 805 | - specification per se, and a compressor need not follow it in order to | |
| 806 | - be compliant. | |
| 807 | - | |
| 808 | - The compressor terminates a block when it determines that starting a | |
| 809 | - new block with fresh trees would be useful, or when the block size | |
| 810 | - fills up the compressor's block buffer. | |
| 811 | - | |
| 812 | - The compressor uses a chained hash table to find duplicated strings, | |
| 813 | - using a hash function that operates on 3-byte sequences. At any | |
| 814 | - given point during compression, let XYZ be the next 3 input bytes to | |
| 815 | - be examined (not necessarily all different, of course). First, the | |
| 816 | - compressor examines the hash chain for XYZ. If the chain is empty, | |
| 817 | - the compressor simply writes out X as a literal byte and advances one | |
| 818 | - byte in the input. If the hash chain is not empty, indicating that | |
| 819 | - the sequence XYZ (or, if we are unlucky, some other 3 bytes with the | |
| 820 | - same hash function value) has occurred recently, the compressor | |
| 821 | - compares all strings on the XYZ hash chain with the actual input data | |
| 822 | - sequence starting at the current point, and selects the longest | |
| 823 | - match. | |
| 824 | - | |
| 825 | - The compressor searches the hash chains starting with the most recent | |
| 826 | - strings, to favor small distances and thus take advantage of the | |
| 827 | - Huffman encoding. The hash chains are singly linked. There are no | |
| 828 | - deletions from the hash chains; the algorithm simply discards matches | |
| 829 | - that are too old. To avoid a worst-case situation, very long hash | |
| 830 | - chains are arbitrarily truncated at a certain length, determined by a | |
| 831 | - run-time parameter. | |
| 832 | - | |
| 833 | - To improve overall compression, the compressor optionally defers the | |
| 834 | - selection of matches ("lazy matching"): after a match of length N has | |
| 835 | - been found, the compressor searches for a longer match starting at | |
| 836 | - the next input byte. If it finds a longer match, it truncates the | |
| 837 | - previous match to a length of one (thus producing a single literal | |
| 838 | - byte) and then emits the longer match. Otherwise, it emits the | |
| 839 | - original match, and, as described above, advances N bytes before | |
| 840 | - continuing. | |
| 841 | - | |
| 842 | - Run-time parameters also control this "lazy match" procedure. If | |
| 843 | - compression ratio is most important, the compressor attempts a | |
| 844 | - complete second search regardless of the length of the first match. | |
| 845 | - In the normal case, if the current match is "long enough", the | |
| 846 | - compressor reduces the search for a longer match, thus speeding up | |
| 847 | - the process. If speed is most important, the compressor inserts new | |
| 848 | - strings in the hash table only when no match was found, or when the | |
| 849 | - match is not "too long". This degrades the compression ratio but | |
| 850 | - saves time since there are both fewer insertions and fewer searches. | |
| 851 | - | |
| 852 | - | |
| 853 | - | |
| 854 | - | |
| 855 | - | |
| 856 | -Deutsch Informational [Page 15] | |
| 857 | - | |
| 858 | - | |
| 859 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 860 | - | |
| 861 | - | |
| 862 | -5. References | |
| 863 | - | |
| 864 | - [1] Huffman, D. A., "A Method for the Construction of Minimum | |
| 865 | - Redundancy Codes", Proceedings of the Institute of Radio | |
| 866 | - Engineers, September 1952, Volume 40, Number 9, pp. 1098-1101. | |
| 867 | - | |
| 868 | - [2] Ziv J., Lempel A., "A Universal Algorithm for Sequential Data | |
| 869 | - Compression", IEEE Transactions on Information Theory, Vol. 23, | |
| 870 | - No. 3, pp. 337-343. | |
| 871 | - | |
| 872 | - [3] Gailly, J.-L., and Adler, M., ZLIB documentation and sources, | |
| 873 | - available in ftp://ftp.uu.net/pub/archiving/zip/doc/ | |
| 874 | - | |
| 875 | - [4] Gailly, J.-L., and Adler, M., GZIP documentation and sources, | |
| 876 | - available as gzip-*.tar in ftp://prep.ai.mit.edu/pub/gnu/ | |
| 877 | - | |
| 878 | - [5] Schwartz, E. S., and Kallick, B. "Generating a canonical prefix | |
| 879 | - encoding." Comm. ACM, 7,3 (Mar. 1964), pp. 166-169. | |
| 880 | - | |
| 881 | - [6] Hirschberg and Lelewer, "Efficient decoding of prefix codes," | |
| 882 | - Comm. ACM, 33,4, April 1990, pp. 449-459. | |
| 883 | - | |
| 884 | -6. Security Considerations | |
| 885 | - | |
| 886 | - Any data compression method involves the reduction of redundancy in | |
| 887 | - the data. Consequently, any corruption of the data is likely to have | |
| 888 | - severe effects and be difficult to correct. Uncompressed text, on | |
| 889 | - the other hand, will probably still be readable despite the presence | |
| 890 | - of some corrupted bytes. | |
| 891 | - | |
| 892 | - It is recommended that systems using this data format provide some | |
| 893 | - means of validating the integrity of the compressed data. See | |
| 894 | - reference [3], for example. | |
| 895 | - | |
| 896 | -7. Source code | |
| 897 | - | |
| 898 | - Source code for a C language implementation of a "deflate" compliant | |
| 899 | - compressor and decompressor is available within the zlib package at | |
| 900 | - ftp://ftp.uu.net/pub/archiving/zip/zlib/. | |
| 901 | - | |
| 902 | -8. Acknowledgements | |
| 903 | - | |
| 904 | - Trademarks cited in this document are the property of their | |
| 905 | - respective owners. | |
| 906 | - | |
| 907 | - Phil Katz designed the deflate format. Jean-Loup Gailly and Mark | |
| 908 | - Adler wrote the related software described in this specification. | |
| 909 | - Glenn Randers-Pehrson converted this document to RFC and HTML format. | |
| 910 | - | |
| 911 | - | |
| 912 | - | |
| 913 | -Deutsch Informational [Page 16] | |
| 914 | - | |
| 915 | - | |
| 916 | -RFC 1951 DEFLATE Compressed Data Format Specification May 1996 | |
| 917 | - | |
| 918 | - | |
| 919 | -9. Author's Address | |
| 920 | - | |
| 921 | - L. Peter Deutsch | |
| 922 | - Aladdin Enterprises | |
| 923 | - 203 Santa Margarita Ave. | |
| 924 | - Menlo Park, CA 94025 | |
| 925 | - | |
| 926 | - Phone: (415) 322-0103 (AM only) | |
| 927 | - FAX: (415) 322-1734 | |
| 928 | - EMail: <[email protected]> | |
| 929 | - | |
| 930 | - Questions about the technical content of this specification can be | |
| 931 | - sent by email to: | |
| 932 | - | |
| 933 | - Jean-Loup Gailly <[email protected]> and | |
| 934 | - Mark Adler <[email protected]> | |
| 935 | - | |
| 936 | - Editorial comments on this specification can be sent by email to: | |
| 937 | - | |
| 938 | - L. Peter Deutsch <[email protected]> and | |
| 939 | - Glenn Randers-Pehrson <[email protected]> | |
| 940 | - | |
| 941 | - | |
| 942 | - | |
| 943 | - | |
| 944 | - | |
| 945 | - | |
| 946 | - | |
| 947 | - | |
| 948 | - | |
| 949 | - | |
| 950 | - | |
| 951 | - | |
| 952 | - | |
| 953 | - | |
| 954 | - | |
| 955 | - | |
| 956 | - | |
| 957 | - | |
| 958 | - | |
| 959 | - | |
| 960 | - | |
| 961 | - | |
| 962 | - | |
| 963 | - | |
| 964 | - | |
| 965 | - | |
| 966 | - | |
| 967 | - | |
| 968 | - | |
| 969 | - | |
| 970 | -Deutsch Informational [Page 17] | |
| 971 | - | |
| 972 | - |
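
Note on the deleted text above: steps 1 to 3 of section 3.2.2 can be sketched in C as follows. This is a minimal illustration, not code from zlib or Fossil; the function name `assign_codes` and its flat-array interface are assumptions made for this example (the RFC itself uses `tree[I].Len` and `tree[I].Code`).

```c
#include <assert.h>

#define MAX_BITS 15  /* DEFLATE never uses a code longer than 15 bits */

/*
 * Hypothetical helper (not part of zlib or Fossil): given n code lengths
 * in len[], write the canonical Huffman code of each symbol to code[].
 * Symbols with length 0 are unused; they get 0 as a placeholder only.
 */
void assign_codes(const unsigned char *len, unsigned int *code, int n)
{
    unsigned int bl_count[MAX_BITS + 1] = {0};
    unsigned int next_code[MAX_BITS + 1] = {0};
    unsigned int c = 0;
    int bits, i;

    /* Step 1: count how many codes exist for each code length. */
    for (i = 0; i < n; i++) {
        assert(len[i] <= MAX_BITS);
        bl_count[len[i]]++;
    }
    bl_count[0] = 0;  /* unused symbols do not take part in the code */

    /* Step 2: find the smallest code value for each code length. */
    for (bits = 1; bits <= MAX_BITS; bits++) {
        c = (c + bl_count[bits - 1]) << 1;
        next_code[bits] = c;
    }

    /* Step 3: hand out consecutive values within each code length. */
    for (i = 0; i < n; i++) {
        code[i] = (len[i] != 0) ? next_code[len[i]]++ : 0;
    }
}
```

Called with the bit lengths (3, 3, 3, 3, 3, 2, 4, 4) of the ABCDEFGH example, this sketch yields the values 2, 3, 4, 5, 6, 0, 14, 15, which correspond to the codes 010, 011, 100, 101, 110, 00, 1110, 1111 in the RFC's table when read at their respective lengths.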
| --- a/compat/zlib/doc/rfc1951.txt | |
| +++ b/compat/zlib/doc/rfc1951.txt | |
| @@ -1,972 +0,0 @@ | |
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | Network Working Group P. Deutsch |
| 8 | Request for Comments: 1951 Aladdin Enterprises |
| 9 | Category: Informational May 1996 |
| 10 | |
| 11 | |
| 12 | DEFLATE Compressed Data Format Specification version 1.3 |
| 13 | |
| 14 | Status of This Memo |
| 15 | |
| 16 | This memo provides information for the Internet community. This memo |
| 17 | does not specify an Internet standard of any kind. Distribution of |
| 18 | this memo is unlimited. |
| 19 | |
| 20 | IESG Note: |
| 21 | |
| 22 | The IESG takes no position on the validity of any Intellectual |
| 23 | Property Rights statements contained in this document. |
| 24 | |
| 25 | Notices |
| 26 | |
| 27 | Copyright (c) 1996 L. Peter Deutsch |
| 28 | |
| 29 | Permission is granted to copy and distribute this document for any |
| 30 | purpose and without charge, including translations into other |
| 31 | languages and incorporation into compilations, provided that the |
| 32 | copyright notice and this notice are preserved, and that any |
| 33 | substantive changes or deletions from the original are clearly |
| 34 | marked. |
| 35 | |
| 36 | A pointer to the latest version of this and related documentation in |
| 37 | HTML format can be found at the URL |
| 38 | <ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html>. |
| 39 | |
| 40 | Abstract |
| 41 | |
| 42 | This specification defines a lossless compressed data format that |
| 43 | compresses data using a combination of the LZ77 algorithm and Huffman |
| 44 | coding, with efficiency comparable to the best currently available |
| 45 | general-purpose compression methods. The data can be produced or |
| 46 | consumed, even for an arbitrarily long sequentially presented input |
| 47 | data stream, using only an a priori bounded amount of intermediate |
| 48 | storage. The format can be implemented readily in a manner not |
| 49 | covered by patents. |
| 50 | |
| 51 | |
| 52 | |
| 53 | |
| 54 | |
| 55 | |
| 56 | |
| 57 | |
| 58 | Deutsch Informational [Page 1] |
| 59 | |
| 60 | |
| 61 | RFC 1951 DEFLATE Compressed Data Format Specification May 1996 |
| 62 | |
| 63 | |
| 64 | Table of Contents |
| 65 | |
| 66 | 1. Introduction ................................................... 2 |
| 67 | 1.1. Purpose ................................................... 2 |
| 68 | 1.2. Intended audience ......................................... 3 |
| 69 | 1.3. Scope ..................................................... 3 |
| 70 | 1.4. Compliance ................................................ 3 |
| 71 | 1.5. Definitions of terms and conventions used ................ 3 |
| 72 | 1.6. Changes from previous versions ............................ 4 |
| 73 | 2. Compressed representation overview ............................. 4 |
| 74 | 3. Detailed specification ......................................... 5 |
| 75 | 3.1. Overall conventions ....................................... 5 |
| 76 | 3.1.1. Packing into bytes .................................. 5 |
| 77 | 3.2. Compressed block format ................................... 6 |
| 78 | 3.2.1. Synopsis of prefix and Huffman coding ............... 6 |
| 79 | 3.2.2. Use of Huffman coding in the "deflate" format ....... 7 |
| 80 | 3.2.3. Details of block format ............................. 9 |
| 81 | 3.2.4. Non-compressed blocks (BTYPE=00) ................... 11 |
| 82 | 3.2.5. Compressed blocks (length and distance codes) ...... 11 |
| 83 | 3.2.6. Compression with fixed Huffman codes (BTYPE=01) .... 12 |
| 84 | 3.2.7. Compression with dynamic Huffman codes (BTYPE=10) .. 13 |
| 85 | 3.3. Compliance ............................................... 14 |
| 86 | 4. Compression algorithm details ................................. 14 |
| 87 | 5. References .................................................... 16 |
| 88 | 6. Security Considerations ....................................... 16 |
| 89 | 7. Source code ................................................... 16 |
| 90 | 8. Acknowledgements .............................................. 16 |
| 91 | 9. Author's Address .............................................. 17 |
| 92 | |
| 93 | 1. Introduction |
| 94 | |
| 95 | 1.1. Purpose |
| 96 | |
| 97 | The purpose of this specification is to define a lossless |
| 98 | compressed data format that: |
| 99 | * Is independent of CPU type, operating system, file system, |
| 100 | and character set, and hence can be used for interchange; |
| 101 | * Can be produced or consumed, even for an arbitrarily long |
| 102 | sequentially presented input data stream, using only an a |
| 103 | priori bounded amount of intermediate storage, and hence |
| 104 | can be used in data communications or similar structures |
| 105 | such as Unix filters; |
| 106 | * Compresses data with efficiency comparable to the best |
| 107 | currently available general-purpose compression methods, |
| 108 | and in particular considerably better than the "compress" |
| 109 | program; |
| 110 | * Can be implemented readily in a manner not covered by |
| 111 | patents, and hence can be practiced freely; |
| 112 | |
| 113 | |
| 114 | |
| 115 | Deutsch Informational [Page 2] |
| 116 | |
| 117 | |
| 118 | RFC 1951 DEFLATE Compressed Data Format Specification May 1996 |
| 119 | |
| 120 | |
| 121 | * Is compatible with the file format produced by the current |
| 122 | widely used gzip utility, in that conforming decompressors |
| 123 | will be able to read data produced by the existing gzip |
| 124 | compressor. |
| 125 | |
| 126 | The data format defined by this specification does not attempt to: |
| 127 | |
| 128 | * Allow random access to compressed data; |
| 129 | * Compress specialized data (e.g., raster graphics) as well |
| 130 | as the best currently available specialized algorithms. |
| 131 | |
| 132 | A simple counting argument shows that no lossless compression |
| 133 | algorithm can compress every possible input data set. For the |
| 134 | format defined here, the worst case expansion is 5 bytes per 32K- |
| 135 | byte block, i.e., a size increase of 0.015% for large data sets. |
| 136 | English text usually compresses by a factor of 2.5 to 3; |
| 137 | executable files usually compress somewhat less; graphical data |
| 138 | such as raster images may compress much more. |
| 139 | |
| 140 | 1.2. Intended audience |
| 141 | |
| 142 | This specification is intended for use by implementors of software |
| 143 | to compress data into "deflate" format and/or decompress data from |
| 144 | "deflate" format. |
| 145 | |
| 146 | The text of the specification assumes a basic background in |
| 147 | programming at the level of bits and other primitive data |
| 148 | representations. Familiarity with the technique of Huffman coding |
| 149 | is helpful but not required. |
| 150 | |
| 151 | 1.3. Scope |
| 152 | |
| 153 | The specification specifies a method for representing a sequence |
| 154 | of bytes as a (usually shorter) sequence of bits, and a method for |
| 155 | packing the latter bit sequence into bytes. |
| 156 | |
| 157 | 1.4. Compliance |
| 158 | |
| 159 | Unless otherwise indicated below, a compliant decompressor must be |
| 160 | able to accept and decompress any data set that conforms to all |
| 161 | the specifications presented here; a compliant compressor must |
| 162 | produce data sets that conform to all the specifications presented |
| 163 | here. |
| 164 | |
| 165 | 1.5. Definitions of terms and conventions used |
| 166 | |
| 167 | Byte: 8 bits stored or transmitted as a unit (same as an octet). |
| 168 | For this specification, a byte is exactly 8 bits, even on machines |
| 169 | |
| 170 | |
| 171 | |
| 172 | Deutsch Informational [Page 3] |
| 173 | |
| 174 | |
| 175 | RFC 1951 DEFLATE Compressed Data Format Specification May 1996 |
| 176 | |
| 177 | |
| 178 | which store a character on a number of bits different from eight. |
| 179 | See below, for the numbering of bits within a byte. |
| 180 | |
| 181 | String: a sequence of arbitrary bytes. |
| 182 | |
| 183 | 1.6. Changes from previous versions |
| 184 | |
| 185 | There have been no technical changes to the deflate format since |
| 186 | version 1.1 of this specification. In version 1.2, some |
| 187 | terminology was changed. Version 1.3 is a conversion of the |
| 188 | specification to RFC style. |
| 189 | |
| 190 | 2. Compressed representation overview |
| 191 | |
| 192 | A compressed data set consists of a series of blocks, corresponding |
| 193 | to successive blocks of input data. The block sizes are arbitrary, |
| 194 | except that non-compressible blocks are limited to 65,535 bytes. |
| 195 | |
| 196 | Each block is compressed using a combination of the LZ77 algorithm |
| 197 | and Huffman coding. The Huffman trees for each block are independent |
| 198 | of those for previous or subsequent blocks; the LZ77 algorithm may |
| 199 | use a reference to a duplicated string occurring in a previous block, |
| 200 | up to 32K input bytes before. |
| 201 | |
| 202 | Each block consists of two parts: a pair of Huffman code trees that |
| 203 | describe the representation of the compressed data part, and a |
| 204 | compressed data part. (The Huffman trees themselves are compressed |
| 205 | using Huffman encoding.) The compressed data consists of a series of |
| 206 | elements of two types: literal bytes (of strings that have not been |
| 207 | detected as duplicated within the previous 32K input bytes), and |
| 208 | pointers to duplicated strings, where a pointer is represented as a |
| 209 | pair <length, backward distance>. The representation used in the |
| 210 | "deflate" format limits distances to 32K bytes and lengths to 258 |
| 211 | bytes, but does not limit the size of a block, except for |
| 212 | uncompressible blocks, which are limited as noted above. |
| 213 | |
| 214 | Each type of value (literals, distances, and lengths) in the |
| 215 | compressed data is represented using a Huffman code, using one code |
| 216 | tree for literals and lengths and a separate code tree for distances. |
| 217 | The code trees for each block appear in a compact form just before |
| 218 | the compressed data for that block. |
| 219 | |
| 220 | |
| 221 | |
| 222 | |
| 223 | |
| 224 | |
| 225 | |
| 226 | |
| 227 | |
| 228 | |
| 229 | Deutsch Informational [Page 4] |
| 230 | |
| 231 | |
| 232 | RFC 1951 DEFLATE Compressed Data Format Specification May 1996 |
| 233 | |
| 234 | |
| 235 | 3. Detailed specification |
| 236 | |
| 237 | 3.1. Overall conventions In the diagrams below, a box like this: |
| 238 | |
| 239 | +---+ |
| 240 | | | <-- the vertical bars might be missing |
| 241 | +---+ |
| 242 | |
| 243 | represents one byte; a box like this: |
| 244 | |
| 245 | +==============+ |
| 246 | | | |
| 247 | +==============+ |
| 248 | |
| 249 | represents a variable number of bytes. |
| 250 | |
| 251 | Bytes stored within a computer do not have a "bit order", since |
| 252 | they are always treated as a unit. However, a byte considered as |
| 253 | an integer between 0 and 255 does have a most- and least- |
| 254 | significant bit, and since we write numbers with the most- |
| 255 | significant digit on the left, we also write bytes with the most- |
| 256 | significant bit on the left. In the diagrams below, we number the |
| 257 | bits of a byte so that bit 0 is the least-significant bit, i.e., |
| 258 | the bits are numbered: |
| 259 | |
| 260 | +--------+ |
| 261 | |76543210| |
| 262 | +--------+ |
| 263 | |
| 264 | Within a computer, a number may occupy multiple bytes. All |
| 265 | multi-byte numbers in the format described here are stored with |
| 266 | the least-significant byte first (at the lower memory address). |
| 267 | For example, the decimal number 520 is stored as: |
| 268 | |
| 269 |      0        1 |
| 270 |  +--------+--------+ |
| 271 |  |00001000|00000010| |
| 272 |  +--------+--------+ |
| 273 |   ^        ^ |
| 274 |   |        | |
| 275 |   |        + more significant byte = 2 x 256 |
| 276 |   + less significant byte = 8 |
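
The byte order described above can be sketched in C. This is a minimal illustration, not code from zlib; the helper names `read_u16_le` and `write_u16_le` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 16-bit value stored least-significant
 * byte first, as the format requires for multi-byte numbers. */
static uint16_t read_u16_le(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: store a 16-bit value least-significant byte first. */
static void write_u16_le(uint8_t *p, uint16_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v & 0xFF);  /* less significant byte at the lower address */
    p[1] = (uint8_t)(v >> 8);    /* more significant byte follows */
}
```

With the bytes 0x08, 0x02 from the diagram, `read_u16_le` yields 8 + 2 x 256 = 520.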
| 277 | |
| 278 | 3.1.1. Packing into bytes |
| 279 | |
| 280 | This document does not address the issue of the order in which |
| 281 | bits of a byte are transmitted on a bit-sequential medium, |
| 282 | since the final data format described here is byte- rather than |
| 292 | bit-oriented. However, we describe the compressed block format |
| 293 | below as a sequence of data elements of various bit |
| 294 | lengths, not a sequence of bytes. We must therefore specify |
| 295 | how to pack these data elements into bytes to form the final |
| 296 | compressed byte sequence: |
| 297 | |
| 298 | * Data elements are packed into bytes in order of |
| 299 | increasing bit number within the byte, i.e., starting |
| 300 | with the least-significant bit of the byte. |
| 301 | * Data elements other than Huffman codes are packed |
| 302 | starting with the least-significant bit of the data |
| 303 | element. |
| 304 | * Huffman codes are packed starting with the most- |
| 305 | significant bit of the code. |
| 306 | |
| 307 | In other words, if one were to print out the compressed data as |
| 308 | a sequence of bytes, starting with the first byte at the |
| 309 | *right* margin and proceeding to the *left*, with the most- |
| 310 | significant bit of each byte on the left as usual, one would be |
| 311 | able to parse the result from right to left, with fixed-width |
| 312 | elements in the correct MSB-to-LSB order and Huffman codes in |
| 313 | bit-reversed order (i.e., with the first bit of the code in the |
| 314 | relative LSB position). |
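
The packing rules above could be implemented along the following lines. This is a sketch under assumptions: the `bit_reader` type and the function names are invented for this example and are not zlib's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;
    size_t len;      /* buffer length in bytes */
    size_t bitpos;   /* next unread bit, counted from bit 0 of byte 0 */
} bit_reader;

/* Return the next bit, taking bit 0 (LSB) of each byte first; -1 at end. */
static int next_bit(bit_reader *br)
{
    if (br->bitpos >= br->len * 8) return -1;
    int bit = (br->data[br->bitpos >> 3] >> (br->bitpos & 7)) & 1;
    br->bitpos++;
    return bit;
}

/* Fixed-width element: the first bit read is the LSB of the value. */
static int read_bits(bit_reader *br, int n, uint32_t *out)
{
    assert(n >= 0 && n <= 16);
    uint32_t v = 0;
    for (int i = 0; i < n; i++) {
        int b = next_bit(br);
        if (b < 0) return -1;
        v |= (uint32_t)b << i;
    }
    *out = v;
    return 0;
}

/* Huffman code: the first bit read is the MSB of the code. */
static int read_code_bits(bit_reader *br, int n, uint32_t *out)
{
    assert(n >= 0 && n <= 15);
    uint32_t v = 0;
    for (int i = 0; i < n; i++) {
        int b = next_bit(br);
        if (b < 0) return -1;
        v = (v << 1) | (uint32_t)b;
    }
    *out = v;
    return 0;
}
```

For the single byte 0x65 (binary 01100101), `read_bits` with n = 3 returns 5, a following `read_code_bits` with n = 3 returns the code 001, and a final `read_bits` with n = 2 returns 1.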
| 315 | |
| 316 | 3.2. Compressed block format |
| 317 | |
| 318 | 3.2.1. Synopsis of prefix and Huffman coding |
| 319 | |
| 320 | Prefix coding represents symbols from an a priori known |
| 321 | alphabet by bit sequences (codes), one code for each symbol, in |
| 322 | a manner such that different symbols may be represented by bit |
| 323 | sequences of different lengths, but a parser can always parse |
| 324 | an encoded string unambiguously symbol-by-symbol. |
| 325 | |
| 326 | We define a prefix code in terms of a binary tree in which the |
| 327 | two edges descending from each non-leaf node are labeled 0 and |
| 328 | 1 and in which the leaf nodes correspond one-for-one with (are |
| 329 | labeled with) the symbols of the alphabet; then the code for a |
| 330 | symbol is the sequence of 0's and 1's on the edges leading from |
| 331 | the root to the leaf labeled with that symbol. For example: |
| 332 | |
| 349 |            /\              Symbol    Code |
| 350 |           0  1             ------    ---- |
| 351 |          /    \                A      00 |
| 352 |         /\     B               B       1 |
| 353 |        0  1                    C     011 |
| 354 |       /    \                   D     010 |
| 355 |      A     /\ |
| 356 |           0  1 |
| 357 |          /    \ |
| 358 |         D      C |
| 359 | |
| 360 | A parser can decode the next symbol from an encoded input |
| 361 | stream by walking down the tree from the root, at each step |
| 362 | choosing the edge corresponding to the next input bit. |
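
The tree walk can be illustrated with the four-symbol example code (A = 00, B = 1, C = 011, D = 010). This is a sketch only: the node layout, the `example_tree` table, and the use of a '0'/'1' character string as input are assumptions chosen for readability, not how a production decoder stores its tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout: child[0]/child[1] are node indexes, -1 for a leaf. */
typedef struct {
    int child[2];
    char symbol;   /* meaningful only for leaves */
} node;

/* The example tree: A = 00, B = 1, C = 011, D = 010. */
static const node example_tree[] = {
    /* 0: root   */ {{ 1,  2}, 0},
    /* 1: "0"    */ {{ 3,  4}, 0},
    /* 2: "1"    */ {{-1, -1}, 'B'},
    /* 3: "00"   */ {{-1, -1}, 'A'},
    /* 4: "01"   */ {{ 5,  6}, 0},
    /* 5: "010"  */ {{-1, -1}, 'D'},
    /* 6: "011"  */ {{-1, -1}, 'C'},
};

/* Decode a string of '0'/'1' characters into symbols.
 * Returns the number of symbols, or -1 on overflow or a truncated code. */
static int decode_bits(const char *bits, char *out, size_t cap)
{
    assert(bits != NULL && out != NULL && cap > 0);
    size_t n = 0;
    int cur = 0;                        /* start at the root */
    for (; *bits != '\0'; bits++) {
        assert(*bits == '0' || *bits == '1');
        cur = example_tree[cur].child[*bits - '0'];
        if (example_tree[cur].child[0] < 0) {   /* reached a leaf */
            if (n + 1 >= cap) return -1;
            out[n++] = example_tree[cur].symbol;
            cur = 0;                    /* restart at the root */
        }
    }
    if (cur != 0) return -1;            /* input ended in the middle of a code */
    out[n] = '\0';
    return (int)n;
}
```

Decoding the bit string 100011010 with this table yields the symbols B, A, C, D.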
| 363 | |
| 364 | Given an alphabet with known symbol frequencies, the Huffman |
| 365 | algorithm allows the construction of an optimal prefix code |
| 366 | (one which represents strings with those symbol frequencies |
| 367 | using the fewest bits of any possible prefix codes for that |
| 368 | alphabet). Such a code is called a Huffman code. (See |
| 369 | reference [1] in Chapter 5, references for additional |
| 370 | information on Huffman codes.) |
| 371 | |
| 372 | Note that in the "deflate" format, the Huffman codes for the |
| 373 | various alphabets must not exceed certain maximum code lengths. |
| 374 | This constraint complicates the algorithm for computing code |
| 375 | lengths from symbol frequencies. Again, see Chapter 5, |
| 376 | references for details. |
| 377 | |
| 378 | 3.2.2. Use of Huffman coding in the "deflate" format |
| 379 | |
| 380 | The Huffman codes used for each alphabet in the "deflate" |
| 381 | format have two additional rules: |
| 382 | |
| 383 | * All codes of a given bit length have lexicographically |
| 384 | consecutive values, in the same order as the symbols |
| 385 | they represent; |
| 386 | |
| 387 | * Shorter codes lexicographically precede longer codes. |
| 388 | |
| 406 | We could recode the example above to follow this rule as |
| 407 | follows, assuming that the order of the alphabet is ABCD: |
| 408 | |
| 409 | Symbol Code |
| 410 | ------ ---- |
| 411 | A 10 |
| 412 | B 0 |
| 413 | C 110 |
| 414 | D 111 |
| 415 | |
| 416 | I.e., 0 precedes 10 which precedes 11x, and 110 and 111 are |
| 417 | lexicographically consecutive. |
| 418 | |
| 419 | Given this rule, we can define the Huffman code for an alphabet |
| 420 | just by giving the bit lengths of the codes for each symbol of |
| 421 | the alphabet in order; this is sufficient to determine the |
| 422 | actual codes. In our example, the code is completely defined |
| 423 | by the sequence of bit lengths (2, 1, 3, 3). The following |
| 424 | algorithm generates the codes as integers, intended to be read |
| 425 | from most- to least-significant bit. The code lengths are |
| 426 | initially in tree[I].Len; the codes are produced in |
| 427 | tree[I].Code. |
| 428 | |
| 429 | 1) Count the number of codes for each code length. Let |
| 430 | bl_count[N] be the number of codes of length N, N >= 1. |
| 431 | |
| 432 | 2) Find the numerical value of the smallest code for each |
| 433 | code length: |
| 434 | |
| 435 | code = 0; |
| 436 | bl_count[0] = 0; |
| 437 | for (bits = 1; bits <= MAX_BITS; bits++) { |
| 438 |     code = (code + bl_count[bits-1]) << 1; |
| 439 |     next_code[bits] = code; |
| 440 | } |
| 441 | |
| 442 | 3) Assign numerical values to all codes, using consecutive |
| 443 | values for all codes of the same length with the base |
| 444 | values determined at step 2. Codes that are never used |
| 445 | (which have a bit length of zero) must not be assigned a |
| 446 | value. |
| 447 | |
| 448 | for (n = 0; n <= max_code; n++) { |
| 449 |     len = tree[n].Len; |
| 450 |     if (len != 0) { |
| 451 |         tree[n].Code = next_code[len]; |
| 452 |         next_code[len]++; |
| 453 |     } |
| 463 | } |
| 464 | |
| 465 | Example: |
| 466 | |
| 467 | Consider the alphabet ABCDEFGH, with bit lengths (3, 3, 3, 3, |
| 468 | 3, 2, 4, 4). After step 1, we have: |
| 469 | |
| 470 | N bl_count[N] |
| 471 | - ----------- |
| 472 | 2 1 |
| 473 | 3 5 |
| 474 | 4 2 |
| 475 | |
| 476 | Step 2 computes the following next_code values: |
| 477 | |
| 478 | N next_code[N] |
| 479 | - ------------ |
| 480 | 1 0 |
| 481 | 2 0 |
| 482 | 3 2 |
| 483 | 4 14 |
| 484 | |
| 485 | Step 3 produces the following code values: |
| 486 | |
| 487 | Symbol Length Code |
| 488 | ------ ------ ---- |
| 489 | A 3 010 |
| 490 | B 3 011 |
| 491 | C 3 100 |
| 492 | D 3 101 |
| 493 | E 3 110 |
| 494 | F 2 00 |
| 495 | G 4 1110 |
| 496 | H 4 1111 |
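
The three steps above can be combined into one self-contained routine. The following is a sketch rather than the zlib implementation; the function name `assign_codes` and the flat `lens`/`codes` arrays (in place of `tree[I].Len` and `tree[I].Code`) are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BITS 15

/* lens[i] is the code length of symbol i (0 = symbol unused);
 * codes[i] receives the code, to be read from most- to
 * least-significant bit. Entries for unused symbols are left untouched. */
static void assign_codes(const uint8_t *lens, uint16_t *codes, int n)
{
    unsigned bl_count[MAX_BITS + 1] = {0};
    uint16_t next_code[MAX_BITS + 1] = {0};

    /* Step 1: count the number of codes for each code length. */
    for (int i = 0; i < n; i++) {
        assert(lens[i] <= MAX_BITS);
        bl_count[lens[i]]++;
    }

    /* Step 2: find the smallest code value for each code length. */
    unsigned code = 0;
    bl_count[0] = 0;
    for (int bits = 1; bits <= MAX_BITS; bits++) {
        code = (code + bl_count[bits - 1]) << 1;
        next_code[bits] = (uint16_t)code;
    }

    /* Step 3: hand out consecutive values within each length. */
    for (int i = 0; i < n; i++) {
        if (lens[i] != 0) {
            codes[i] = next_code[lens[i]]++;
        }
    }
}
```

Called with the lengths (3, 3, 3, 3, 3, 2, 4, 4) it produces the values 2, 3, 4, 5, 6, 0, 14, 15, which match the codes 010 through 1111 in the table above. This sketch does not check that the lengths describe a valid (neither over- nor under-subscribed) code.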
| 497 | |
| 498 | 3.2.3. Details of block format |
| 499 | |
| 500 | Each block of compressed data begins with 3 header bits |
| 501 | containing the following data: |
| 502 | |
| 503 | first bit BFINAL |
| 504 | next 2 bits BTYPE |
| 505 | |
| 506 | Note that the header bits do not necessarily begin on a byte |
| 507 | boundary, since a block does not necessarily occupy an integral |
| 508 | number of bytes. |
| 509 | |
| 520 | BFINAL is set if and only if this is the last block of the data |
| 521 | set. |
| 522 | |
| 523 | BTYPE specifies how the data are compressed, as follows: |
| 524 | |
| 525 | 00 - no compression |
| 526 | 01 - compressed with fixed Huffman codes |
| 527 | 10 - compressed with dynamic Huffman codes |
| 528 | 11 - reserved (error) |
| 529 | |
| 530 | The only difference between the two compressed cases is how the |
| 531 | Huffman codes for the literal/length and distance alphabets are |
| 532 | defined. |
| 533 | |
| 534 | In all cases, the decoding algorithm for the actual data is as |
| 535 | follows: |
| 536 | |
| 537 | do |
| 538 |    read block header from input stream. |
| 539 |    if stored with no compression |
| 540 |       skip any remaining bits in current partially |
| 541 |          processed byte |
| 542 |       read LEN and NLEN (see next section) |
| 543 |       copy LEN bytes of data to output |
| 544 |    otherwise |
| 545 |       if compressed with dynamic Huffman codes |
| 546 |          read representation of code trees (see |
| 547 |             subsection below) |
| 548 |       loop (until end of block code recognized) |
| 549 |          decode literal/length value from input stream |
| 550 |          if value < 256 |
| 551 |             copy value (literal byte) to output stream |
| 552 |          otherwise |
| 553 |             if value = end of block (256) |
| 554 |                break from loop |
| 555 |             otherwise (value = 257..285) |
| 556 |                decode distance from input stream |
| 557 | |
| 558 |                move backwards distance bytes in the output |
| 559 |                stream, and copy length bytes from this |
| 560 |                position to the output stream. |
| 561 |       end loop |
| 562 | while not last block |
| 563 | |
| 564 | Note that a duplicated string reference may refer to a string |
| 565 | in a previous block; i.e., the backward distance may cross one |
| 566 | or more block boundaries. However, a distance cannot refer past |
| 567 | the beginning of the output stream. (An application using a |
| 577 | preset dictionary might discard part of the output stream; a |
| 578 | distance can refer to that part of the output stream anyway.) |
| 579 | Note also that the referenced string may overlap the current |
| 580 | position; for example, if the last 2 bytes decoded have values |
| 581 | X and Y, a string reference with <length = 5, distance = 2> |
| 582 | adds X,Y,X,Y,X to the output stream. |
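
The overlapping case is why a decoder typically copies a match one byte at a time in the forward direction instead of calling `memcpy`. A minimal sketch follows; the function name `copy_match` and its buffer-plus-position interface are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append `length` bytes to out[0..pos), copying from `distance` bytes back.
 * Returns the new output size, or 0 if the reference is invalid or the
 * buffer of capacity `cap` is too small. */
static size_t copy_match(uint8_t *out, size_t pos, size_t cap,
                         size_t length, size_t distance)
{
    assert(out != NULL && pos <= cap);
    if (distance == 0 || distance > pos) return 0;  /* refers past the start */
    if (length > cap - pos) return 0;               /* would overflow output */
    for (size_t i = 0; i < length; i++) {
        /* The source may be a byte written earlier in this same loop. */
        out[pos] = out[pos - distance];
        pos++;
    }
    return pos;
}
```

Starting from the two bytes X, Y, a call with length 5 and distance 2 appends X, Y, X, Y, X and returns 7.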
| 583 | |
| 584 | We now specify each compression method in turn. |
| 585 | |
| 586 | 3.2.4. Non-compressed blocks (BTYPE=00) |
| 587 | |
| 588 | Any bits of input up to the next byte boundary are ignored. |
| 589 | The rest of the block consists of the following information: |
| 590 | |
| 591 | 0 1 2 3 4... |
| 592 | +---+---+---+---+================================+ |
| 593 | | LEN | NLEN |... LEN bytes of literal data...| |
| 594 | +---+---+---+---+================================+ |
| 595 | |
| 596 | LEN is the number of data bytes in the block. NLEN is the |
| 597 | one's complement of LEN. |
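
A decoder can use NLEN as a consistency check on the stored-block header. The sketch below assumes the four header bytes have already been located at a byte boundary; the function name `parse_stored_header` is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parse LEN and NLEN (both stored least-significant byte first).
 * Returns 0 and stores LEN on success, -1 if NLEN is not the
 * one's complement of LEN. */
static int parse_stored_header(const uint8_t hdr[4], uint16_t *len_out)
{
    assert(hdr != NULL && len_out != NULL);
    uint16_t len  = (uint16_t)(hdr[0] | (hdr[1] << 8));
    uint16_t nlen = (uint16_t)(hdr[2] | (hdr[3] << 8));
    if ((uint16_t)~len != nlen) return -1;
    *len_out = len;
    return 0;
}
```

For example, the bytes 05 00 FA FF describe a stored block of 5 data bytes, since the complement of 0x0005 is 0xFFFA.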
| 598 | |
| 599 | 3.2.5. Compressed blocks (length and distance codes) |
| 600 | |
| 601 | As noted above, encoded data blocks in the "deflate" format |
| 602 | consist of sequences of symbols drawn from three conceptually |
| 603 | distinct alphabets: either literal bytes, from the alphabet of |
| 604 | byte values (0..255), or <length, backward distance> pairs, |
| 605 | where the length is drawn from (3..258) and the distance is |
| 606 | drawn from (1..32,768). In fact, the literal and length |
| 607 | alphabets are merged into a single alphabet (0..285), where |
| 608 | values 0..255 represent literal bytes, the value 256 indicates |
| 609 | end-of-block, and values 257..285 represent length codes |
| 610 | (possibly in conjunction with extra bits following the symbol |
| 611 | code) as follows: |
| 612 | |
| 634 |      Extra                Extra                Extra |
| 635 | Code Bits Length(s)  Code Bits Length(s)  Code Bits Length(s) |
| 636 | ---- ---- ---------  ---- ---- ---------  ---- ---- --------- |
| 637 |  257   0  3           267   1  15,16       277   4  67-82 |
| 638 |  258   0  4           268   1  17,18       278   4  83-98 |
| 639 |  259   0  5           269   2  19-22       279   4  99-114 |
| 640 |  260   0  6           270   2  23-26       280   4  115-130 |
| 641 |  261   0  7           271   2  27-30       281   5  131-162 |
| 642 |  262   0  8           272   2  31-34       282   5  163-194 |
| 643 |  263   0  9           273   3  35-42       283   5  195-226 |
| 644 |  264   0  10          274   3  43-50       284   5  227-257 |
| 645 |  265   1  11,12       275   3  51-58       285   0  258 |
| 646 |  266   1  13,14       276   3  59-66 |
| 647 | |
| 648 | The extra bits should be interpreted as a machine integer |
| 649 | stored with the most-significant bit first, e.g., bits 1110 |
| 650 | represent the value 14. |
| 651 | |
| 652 |      Extra                  Extra                  Extra |
| 653 | Code Bits Distance     Code Bits Distance     Code Bits Distance |
| 654 | ---- ---- -----------  ---- ---- -----------  ---- ---- ----------- |
| 655 |    0   0  1              10   4  33-48          20   9  1025-1536 |
| 656 |    1   0  2              11   4  49-64          21   9  1537-2048 |
| 657 |    2   0  3              12   5  65-96          22  10  2049-3072 |
| 658 |    3   0  4              13   5  97-128         23  10  3073-4096 |
| 659 |    4   1  5,6            14   6  129-192        24  11  4097-6144 |
| 660 |    5   1  7,8            15   6  193-256        25  11  6145-8192 |
| 661 |    6   2  9-12           16   7  257-384        26  12  8193-12288 |
| 662 |    7   2  13-16          17   7  385-512        27  12  12289-16384 |
| 663 |    8   3  17-24          18   8  513-768        28  13  16385-24576 |
| 664 |    9   3  25-32          19   8  769-1024       29  13  24577-32768 |
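
Both tables reduce to a base value plus the extra bits, so a decoder can hold them as small lookup arrays. The sketch below transcribes the tables above; the array and function names are assumptions made for this example, and the `extra` argument is taken to be the already-decoded integer value of the extra bits.

```c
#include <assert.h>
#include <stdint.h>

/* Base length and extra-bit count for length codes 257..285. */
static const uint16_t len_base[29] = {
    3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
    35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258
};
static const uint8_t len_extra[29] = {
    0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
    3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0
};

/* Base distance and extra-bit count for distance codes 0..29. */
static const uint16_t dist_base[30] = {
    1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193, 257, 385,
    513, 769, 1025, 1537, 2049, 3073, 4097, 6145, 8193, 12289, 16385, 24577
};
static const uint8_t dist_extra[30] = {
    0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7,
    8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13
};

/* Map a literal/length symbol 257..285 plus extra value to a match length.
 * Returns -1 for an invalid symbol or an extra value wider than allowed.
 * Note: the table lists 227-257 for code 284; this sketch does not reject
 * the extra value 31 for that code. */
static int length_from_symbol(int sym, unsigned extra)
{
    assert(sizeof len_base / sizeof len_base[0] == 29);
    if (sym < 257 || sym > 285) return -1;
    int i = sym - 257;
    if ((extra >> len_extra[i]) != 0) return -1;
    return (int)len_base[i] + (int)extra;
}

/* Map a distance code 0..29 plus extra value to a backward distance. */
static long distance_from_code(int code, unsigned extra)
{
    if (code < 0 || code > 29) return -1;
    if ((extra >> dist_extra[code]) != 0) return -1;
    return (long)dist_base[code] + (long)extra;
}
```

For instance, symbol 265 with extra value 1 gives length 12, and distance code 29 with extra value 8191 gives the maximum distance 32768.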
| 665 | |
| 666 | 3.2.6. Compression with fixed Huffman codes (BTYPE=01) |
| 667 | |
| 668 | The Huffman codes for the two alphabets are fixed, and are not |
| 669 | represented explicitly in the data. The Huffman code lengths |
| 670 | for the literal/length alphabet are: |
| 671 | |
| 672 | Lit Value    Bits    Codes |
| 673 | ---------    ----    ----- |
| 674 |   0 - 143     8      00110000 through |
| 675 |                      10111111 |
| 676 | 144 - 255     9      110010000 through |
| 677 |                      111111111 |
| 678 | 256 - 279     7      0000000 through |
| 679 |                      0010111 |
| 680 | 280 - 287     8      11000000 through |
| 681 |                      11000111 |
| 682 | |
| 691 | The code lengths are sufficient to generate the actual codes, |
| 692 | as described above; we show the codes in the table for added |
| 693 | clarity. Literal/length values 286-287 will never actually |
| 694 | occur in the compressed data, but participate in the code |
| 695 | construction. |
| 696 | |
| 697 | Distance codes 0-31 are represented by (fixed-length) 5-bit |
| 698 | codes, with possible additional bits as shown in the table |
| 699 | shown in Paragraph 3.2.5, above. Note that distance codes 30- |
| 700 | 31 will never actually occur in the compressed data. |
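
Because the fixed code is defined entirely by its lengths, a decoder only needs to fill a 288-entry length array and run the code-assignment algorithm of Paragraph 3.2.2 over it. The sketch below fills the lengths and checks that they form a complete prefix code; both function names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill the code lengths of the fixed literal/length code. */
static void fixed_litlen_lengths(uint8_t lens[288])
{
    assert(lens != NULL);
    int i = 0;
    for (; i <= 143; i++) lens[i] = 8;
    for (; i <= 255; i++) lens[i] = 9;
    for (; i <= 279; i++) lens[i] = 7;
    for (; i <= 287; i++) lens[i] = 8;
}

/* Sum of 2^(9 - len) over all symbols. A complete prefix code with
 * maximum length 9 yields exactly 2^9 = 512 (Kraft equality). */
static unsigned kraft_sum_x512(const uint8_t lens[288])
{
    unsigned sum = 0;
    for (int i = 0; i < 288; i++) {
        assert(lens[i] >= 1 && lens[i] <= 9);
        sum += 1u << (9 - lens[i]);
    }
    return sum;
}
```

The sum works out to 144 x 2 + 112 x 1 + 24 x 4 + 8 x 2 = 512, which shows why symbols 286 and 287 must take part in the construction even though they never occur in the data.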
| 701 | |
| 702 | 3.2.7. Compression with dynamic Huffman codes (BTYPE=10) |
| 703 | |
| 704 | The Huffman codes for the two alphabets appear in the block |
| 705 | immediately after the header bits and before the actual |
| 706 | compressed data, first the literal/length code and then the |
| 707 | distance code. Each code is defined by a sequence of code |
| 708 | lengths, as discussed in Paragraph 3.2.2, above. For even |
| 709 | greater compactness, the code length sequences themselves are |
| 710 | compressed using a Huffman code. The alphabet for code lengths |
| 711 | is as follows: |
| 712 | |
| 713 | 0 - 15: Represent code lengths of 0 - 15 |
| 714 | 16: Copy the previous code length 3 - 6 times. |
| 715 | The next 2 bits indicate repeat length |
| 716 | (0 = 3, ... , 3 = 6) |
| 717 | Example: Codes 8, 16 (+2 bits 11), |
| 718 | 16 (+2 bits 10) will expand to |
| 719 | 12 code lengths of 8 (1 + 6 + 5) |
| 720 | 17: Repeat a code length of 0 for 3 - 10 times. |
| 721 | (3 bits of length) |
| 722 | 18: Repeat a code length of 0 for 11 - 138 times |
| 723 | (7 bits of length) |
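
The run-length scheme above can be expanded as follows. This is a sketch under assumptions: the `cl_item` type pairs each code length symbol with the integer value of its extra bits (0 when the symbol has none), and the function name is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical input item: a code length symbol (0..18) and the value
 * of the extra bits that followed it, if any. */
typedef struct {
    uint8_t sym;
    uint8_t extra;
} cl_item;

/* Expand code length symbols into a sequence of code lengths.
 * Returns the number of lengths written, or -1 on invalid input. */
static int expand_code_lengths(const cl_item *in, size_t n_in,
                               uint8_t *out, size_t cap)
{
    assert(in != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < n_in; i++) {
        unsigned sym = in[i].sym, extra = in[i].extra, repeat;
        uint8_t value;
        if (sym <= 15) {                 /* literal code length */
            value = (uint8_t)sym;
            repeat = 1;
        } else if (sym == 16) {          /* copy previous length 3..6 times */
            if (n == 0 || extra > 3) return -1;
            value = out[n - 1];
            repeat = 3 + extra;
        } else if (sym == 17) {          /* 3..10 zero lengths */
            if (extra > 7) return -1;
            value = 0;
            repeat = 3 + extra;
        } else if (sym == 18) {          /* 11..138 zero lengths */
            if (extra > 127) return -1;
            value = 0;
            repeat = 11 + extra;
        } else {
            return -1;
        }
        if (repeat > cap - n) return -1; /* output buffer too small */
        while (repeat-- > 0) out[n++] = value;
    }
    return (int)n;
}
```

The example from the list, symbols 8, 16 (extra 3), 16 (extra 2), expands to 1 + 6 + 5 = 12 code lengths of 8.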
| 724 | |
| 725 | A code length of 0 indicates that the corresponding symbol in |
| 726 | the literal/length or distance alphabet will not occur in the |
| 727 | block, and should not participate in the Huffman code |
| 728 | construction algorithm given earlier. If only one distance |
| 729 | code is used, it is encoded using one bit, not zero bits; in |
| 730 | this case there is a single code length of one, with one unused |
| 731 | code. One distance code of zero bits means that there are no |
| 732 | distance codes used at all (the data is all literals). |
| 733 | |
| 734 | We can now define the format of the block: |
| 735 | |
| 736 | 5 Bits: HLIT, # of Literal/Length codes - 257 (257 - 286) |
| 737 | 5 Bits: HDIST, # of Distance codes - 1 (1 - 32) |
| 738 | 4 Bits: HCLEN, # of Code Length codes - 4 (4 - 19) |
| 739 | |
| 748 | (HCLEN + 4) x 3 bits: code lengths for the code length |
| 749 | alphabet given just above, in the order: 16, 17, 18, |
| 750 | 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15 |
| 751 | |
| 752 | These code lengths are interpreted as 3-bit integers |
| 753 | (0-7); as above, a code length of 0 means the |
| 754 | corresponding symbol (literal/length or distance code |
| 755 | length) is not used. |
| 756 | |
| 757 | HLIT + 257 code lengths for the literal/length alphabet, |
| 758 | encoded using the code length Huffman code |
| 759 | |
| 760 | HDIST + 1 code lengths for the distance alphabet, |
| 761 | encoded using the code length Huffman code |
| 762 | |
| 763 | The actual compressed data of the block, |
| 764 | encoded using the literal/length and distance Huffman |
| 765 | codes |
| 766 | |
| 767 | The literal/length symbol 256 (end of data), |
| 768 | encoded using the literal/length Huffman code |
| 769 | |
| 770 | The code length repeat codes can cross from HLIT + 257 to the |
| 771 | HDIST + 1 code lengths. In other words, all code lengths form |
| 772 | a single sequence of HLIT + HDIST + 258 values. |
| 773 | |
| 774 | 3.3. Compliance |
| 775 | |
| 776 | A compressor may limit further the ranges of values specified in |
| 777 | the previous section and still be compliant; for example, it may |
| 778 | limit the range of backward pointers to some value smaller than |
| 779 | 32K. Similarly, a compressor may limit the size of blocks so that |
| 780 | a compressible block fits in memory. |
| 781 | |
| 782 | A compliant decompressor must accept the full range of possible |
| 783 | values defined in the previous section, and must accept blocks of |
| 784 | arbitrary size. |
| 785 | |
| 786 | 4. Compression algorithm details |
| 787 | |
| 788 | While it is the intent of this document to define the "deflate" |
| 789 | compressed data format without reference to any particular |
| 790 | compression algorithm, the format is related to the compressed |
| 791 | formats produced by LZ77 (Lempel-Ziv 1977, see reference [2] below); |
| 792 | since many variations of LZ77 are patented, it is strongly |
| 793 | recommended that the implementor of a compressor follow the general |
| 794 | algorithm presented here, which is known not to be patented per se. |
| 795 | The material in this section is not part of the definition of the |
| 805 | specification per se, and a compressor need not follow it in order to |
| 806 | be compliant. |
| 807 | |
| 808 | The compressor terminates a block when it determines that starting a |
| 809 | new block with fresh trees would be useful, or when the block size |
| 810 | fills up the compressor's block buffer. |
| 811 | |
| 812 | The compressor uses a chained hash table to find duplicated strings, |
| 813 | using a hash function that operates on 3-byte sequences. At any |
| 814 | given point during compression, let XYZ be the next 3 input bytes to |
| 815 | be examined (not necessarily all different, of course). First, the |
| 816 | compressor examines the hash chain for XYZ. If the chain is empty, |
| 817 | the compressor simply writes out X as a literal byte and advances one |
| 818 | byte in the input. If the hash chain is not empty, indicating that |
| 819 | the sequence XYZ (or, if we are unlucky, some other 3 bytes with the |
| 820 | same hash function value) has occurred recently, the compressor |
| 821 | compares all strings on the XYZ hash chain with the actual input data |
| 822 | sequence starting at the current point, and selects the longest |
| 823 | match. |
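
The chained hash table described above might look like the sketch below. Everything here is an assumption made for illustration: the hash function, the table sizes, and the names `hash_chains`, `chains_insert`, and `longest_match` are not taken from zlib, and the caller is expected to search at a position before inserting it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 15
#define HASH_SIZE (1u << HASH_BITS)
#define WSIZE     32768          /* maximum backward distance */
#define MAX_MATCH 258

typedef struct {
    int32_t head[HASH_SIZE];     /* most recent position per hash, -1 if none */
    int32_t prev[WSIZE];         /* older position with the same hash */
} hash_chains;

/* Illustrative hash of a 3-byte sequence (not zlib's hash function). */
static unsigned hash3(const uint8_t *p)
{
    return (unsigned)((p[0] << 10) ^ (p[1] << 5) ^ p[2]) & (HASH_SIZE - 1u);
}

static void chains_init(hash_chains *hc)
{
    for (size_t i = 0; i < HASH_SIZE; i++) hc->head[i] = -1;
}

/* Record that a 3-byte string starts at `pos` (needs pos + 3 <= n). */
static void chains_insert(hash_chains *hc, const uint8_t *buf, size_t n, size_t pos)
{
    assert(pos + 3 <= n);
    unsigned h = hash3(buf + pos);
    hc->prev[pos & (WSIZE - 1)] = hc->head[h];
    hc->head[h] = (int32_t)pos;
}

/* Walk the chain for the 3 bytes at `pos`, most recent candidate first,
 * examining at most `max_chain` entries. Returns the longest match length
 * (0 if shorter than 3) and stores its distance in *dist_out. */
static size_t longest_match(const hash_chains *hc, const uint8_t *buf, size_t n,
                            size_t pos, int max_chain, size_t *dist_out)
{
    size_t best = 0;
    if (pos + 3 > n) return 0;
    int32_t cand = hc->head[hash3(buf + pos)];
    while (cand >= 0 && max_chain-- > 0) {
        size_t dist = pos - (size_t)cand;
        if (dist == 0 || dist > WSIZE) break;      /* too old: discard */
        size_t len = 0;
        while (len < MAX_MATCH && pos + len < n &&
               buf[(size_t)cand + len] == buf[pos + len]) len++;
        if (len > best) { best = len; *dist_out = dist; }
        int32_t next = hc->prev[cand & (WSIZE - 1)];
        if (next >= cand) break;                   /* stale chain entry */
        cand = next;
    }
    return best >= 3 ? best : 0;
}
```

The `max_chain` argument plays the role of the run-time parameter that truncates very long chains. The structure is about 256 KB, so it is better allocated statically or on the heap than on the stack.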
| 824 | |
| 825 | The compressor searches the hash chains starting with the most recent |
| 826 | strings, to favor small distances and thus take advantage of the |
| 827 | Huffman encoding. The hash chains are singly linked. There are no |
| 828 | deletions from the hash chains; the algorithm simply discards matches |
| 829 | that are too old. To avoid a worst-case situation, very long hash |
| 830 | chains are arbitrarily truncated at a certain length, determined by a |
| 831 | run-time parameter. |
| 832 | |
| 833 | To improve overall compression, the compressor optionally defers the |
| 834 | selection of matches ("lazy matching"): after a match of length N has |
| 835 | been found, the compressor searches for a longer match starting at |
| 836 | the next input byte. If it finds a longer match, it truncates the |
| 837 | previous match to a length of one (thus producing a single literal |
| 838 | byte) and then emits the longer match. Otherwise, it emits the |
| 839 | original match, and, as described above, advances N bytes before |
| 840 | continuing. |
| 841 | |
| 842 | Run-time parameters also control this "lazy match" procedure. If |
| 843 | compression ratio is most important, the compressor attempts a |
| 844 | complete second search regardless of the length of the first match. |
| 845 | In the normal case, if the current match is "long enough", the |
| 846 | compressor reduces the search for a longer match, thus speeding up |
| 847 | the process. If speed is most important, the compressor inserts new |
| 848 | strings in the hash table only when no match was found, or when the |
| 849 | match is not "too long". This degrades the compression ratio but |
| 850 | saves time since there are both fewer insertions and fewer searches. |
| 851 | |
| 862 | 5. References |
| 863 | |
| 864 | [1] Huffman, D. A., "A Method for the Construction of Minimum |
| 865 | Redundancy Codes", Proceedings of the Institute of Radio |
| 866 | Engineers, September 1952, Volume 40, Number 9, pp. 1098-1101. |
| 867 | |
| 868 | [2] Ziv J., Lempel A., "A Universal Algorithm for Sequential Data |
| 869 | Compression", IEEE Transactions on Information Theory, Vol. 23, |
| 870 | No. 3, pp. 337-343. |
| 871 | |
| 872 | [3] Gailly, J.-L., and Adler, M., ZLIB documentation and sources, |
| 873 | available in ftp://ftp.uu.net/pub/archiving/zip/doc/ |
| 874 | |
| 875 | [4] Gailly, J.-L., and Adler, M., GZIP documentation and sources, |
| 876 | available as gzip-*.tar in ftp://prep.ai.mit.edu/pub/gnu/ |
| 877 | |
| 878 | [5] Schwartz, E. S., and Kallick, B. "Generating a canonical prefix |
| 879 | encoding." Comm. ACM, 7,3 (Mar. 1964), pp. 166-169. |
| 880 | |
| 881 | [6] Hirschberg and Lelewer, "Efficient decoding of prefix codes," |
| 882 | Comm. ACM, 33,4, April 1990, pp. 449-459. |
| 883 | |
| 884 | 6. Security Considerations |
| 885 | |
| 886 | Any data compression method involves the reduction of redundancy in |
| 887 | the data. Consequently, any corruption of the data is likely to have |
| 888 | severe effects and be difficult to correct. Uncompressed text, on |
| 889 | the other hand, will probably still be readable despite the presence |
| 890 | of some corrupted bytes. |
| 891 | |
| 892 | It is recommended that systems using this data format provide some |
| 893 | means of validating the integrity of the compressed data. See |
| 894 | reference [3], for example. |
| 895 | |
| 896 | 7. Source code |
| 897 | |
| 898 | Source code for a C language implementation of a "deflate" compliant |
| 899 | compressor and decompressor is available within the zlib package at |
| 900 | ftp://ftp.uu.net/pub/archiving/zip/zlib/. |
| 901 | |
| 902 | 8. Acknowledgements |
| 903 | |
| 904 | Trademarks cited in this document are the property of their |
| 905 | respective owners. |
| 906 | |
| 907 | Phil Katz designed the deflate format. Jean-Loup Gailly and Mark |
| 908 | Adler wrote the related software described in this specification. |
| 909 | Glenn Randers-Pehrson converted this document to RFC and HTML format. |
| 910 | |
| 919 | 9. Author's Address |
| 920 | |
| 921 | L. Peter Deutsch |
| 922 | Aladdin Enterprises |
| 923 | 203 Santa Margarita Ave. |
| 924 | Menlo Park, CA 94025 |
| 925 | |
| 926 | Phone: (415) 322-0103 (AM only) |
| 927 | FAX: (415) 322-1734 |
| 928 | EMail: <[email protected]> |
| 929 | |
| 930 | Questions about the technical content of this specification can be |
| 931 | sent by email to: |
| 932 | |
| 933 | Jean-Loup Gailly <[email protected]> and |
| 934 | Mark Adler <[email protected]> |
| 935 | |
| 936 | Editorial comments on this specification can be sent by email to: |
| 937 | |
| 938 | L. Peter Deutsch <[email protected]> and |
| 939 | Glenn Randers-Pehrson <[email protected]> |
| 940 | |
| 972 |
| --- a/compat/zlib/doc/rfc1951.txt | |
| +++ b/compat/zlib/doc/rfc1951.txt | |
| @@ -1,972 +0,0 @@ | |
D
compat/zlib/doc/rfc1952.txt
-687
| --- a/compat/zlib/doc/rfc1952.txt | ||
| +++ b/compat/zlib/doc/rfc1952.txt | ||
| @@ -1,687 +0,0 @@ | ||
| 1 | - | |
| 2 | - | |
| 3 | - | |
| 4 | - | |
| 5 | - | |
| 6 | - | |
| 7 | -Network Working Group P. Deutsch | |
| 8 | -Request for Comments: 1952 Aladdin Enterprises | |
| 9 | -Category: Informational May 1996 | |
| 10 | - | |
| 11 | - | |
| 12 | - GZIP file format specification version 4.3 | |
| 13 | - | |
| 14 | -Status of This Memo | |
| 15 | - | |
| 16 | - This memo provides information for the Internet community. This memo | |
| 17 | - does not specify an Internet standard of any kind. Distribution of | |
| 18 | - this memo is unlimited. | |
| 19 | - | |
| 20 | -IESG Note: | |
| 21 | - | |
| 22 | - The IESG takes no position on the validity of any Intellectual | |
| 23 | - Property Rights statements contained in this document. | |
| 24 | - | |
| 25 | -Notices | |
| 26 | - | |
| 27 | - Copyright (c) 1996 L. Peter Deutsch | |
| 28 | - | |
| 29 | - Permission is granted to copy and distribute this document for any | |
| 30 | - purpose and without charge, including translations into other | |
| 31 | - languages and incorporation into compilations, provided that the | |
| 32 | - copyright notice and this notice are preserved, and that any | |
| 33 | - substantive changes or deletions from the original are clearly | |
| 34 | - marked. | |
| 35 | - | |
| 36 | - A pointer to the latest version of this and related documentation in | |
| 37 | - HTML format can be found at the URL | |
| 38 | - <ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html>. | |
| 39 | - | |
| 40 | -Abstract | |
| 41 | - | |
| 42 | - This specification defines a lossless compressed data format that is | |
| 43 | - compatible with the widely used GZIP utility. The format includes a | |
| 44 | - cyclic redundancy check value for detecting data corruption. The | |
| 45 | - format presently uses the DEFLATE method of compression but can be | |
| 46 | - easily extended to use other compression methods. The format can be | |
| 47 | - implemented readily in a manner not covered by patents. | |
| 48 | - | |
| 49 | - | |
| 50 | - | |
| 51 | - | |
| 52 | - | |
| 53 | - | |
| 54 | - | |
| 55 | - | |
| 56 | - | |
| 57 | - | |
| 58 | -Deutsch Informational [Page 1] | |
| 59 | - | |
| 60 | - | |
| 61 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 62 | - | |
| 63 | - | |
| 64 | -Table of Contents | |
| 65 | - | |
| 66 | - 1. Introduction ................................................... 2 | |
| 67 | - 1.1. Purpose ................................................... 2 | |
| 68 | - 1.2. Intended audience ......................................... 3 | |
| 69 | - 1.3. Scope ..................................................... 3 | |
| 70 | - 1.4. Compliance ................................................ 3 | |
| 71 | - 1.5. Definitions of terms and conventions used ................. 3 | |
| 72 | - 1.6. Changes from previous versions ............................ 3 | |
| 73 | - 2. Detailed specification ......................................... 4 | |
| 74 | - 2.1. Overall conventions ....................................... 4 | |
| 75 | - 2.2. File format ............................................... 5 | |
| 76 | - 2.3. Member format ............................................. 5 | |
| 77 | - 2.3.1. Member header and trailer ........................... 6 | |
| 78 | - 2.3.1.1. Extra field ................................... 8 | |
| 79 | - 2.3.1.2. Compliance .................................... 9 | |
| 80 | - 3. References .................................................. 9 | |
| 81 | - 4. Security Considerations .................................... 10 | |
| 82 | - 5. Acknowledgements ........................................... 10 | |
| 83 | - 6. Author's Address ........................................... 10 | |
| 84 | - 7. Appendix: Jean-Loup Gailly's gzip utility .................. 11 | |
| 85 | - 8. Appendix: Sample CRC Code .................................. 11 | |
| 86 | - | |
| 87 | -1. Introduction | |
| 88 | - | |
| 89 | - 1.1. Purpose | |
| 90 | - | |
| 91 | - The purpose of this specification is to define a lossless | |
| 92 | - compressed data format that: | |
| 93 | - | |
| 94 | - * Is independent of CPU type, operating system, file system, | |
| 95 | - and character set, and hence can be used for interchange; | |
| 96 | - * Can compress or decompress a data stream (as opposed to a | |
| 97 | - randomly accessible file) to produce another data stream, | |
| 98 | - using only an a priori bounded amount of intermediate | |
| 99 | - storage, and hence can be used in data communications or | |
| 100 | - similar structures such as Unix filters; | |
| 101 | - * Compresses data with efficiency comparable to the best | |
| 102 | - currently available general-purpose compression methods, | |
| 103 | - and in particular considerably better than the "compress" | |
| 104 | - program; | |
| 105 | - * Can be implemented readily in a manner not covered by | |
| 106 | - patents, and hence can be practiced freely; | |
| 107 | - * Is compatible with the file format produced by the current | |
| 108 | - widely used gzip utility, in that conforming decompressors | |
| 109 | - will be able to read data produced by the existing gzip | |
| 110 | - compressor. | |
| 111 | - | |
| 112 | - | |
| 113 | - | |
| 114 | - | |
| 115 | -Deutsch Informational [Page 2] | |
| 116 | - | |
| 117 | - | |
| 118 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 119 | - | |
| 120 | - | |
| 121 | - The data format defined by this specification does not attempt to: | |
| 122 | - | |
| 123 | - * Provide random access to compressed data; | |
| 124 | - * Compress specialized data (e.g., raster graphics) as well as | |
| 125 | - the best currently available specialized algorithms. | |
| 126 | - | |
| 127 | - 1.2. Intended audience | |
| 128 | - | |
| 129 | - This specification is intended for use by implementors of software | |
| 130 | - to compress data into gzip format and/or decompress data from gzip | |
| 131 | - format. | |
| 132 | - | |
| 133 | - The text of the specification assumes a basic background in | |
| 134 | - programming at the level of bits and other primitive data | |
| 135 | - representations. | |
| 136 | - | |
| 137 | - 1.3. Scope | |
| 138 | - | |
| 139 | - The specification specifies a compression method and a file format | |
| 140 | - (the latter assuming only that a file can store a sequence of | |
| 141 | - arbitrary bytes). It does not specify any particular interface to | |
| 142 | - a file system or anything about character sets or encodings | |
| 143 | - (except for file names and comments, which are optional). | |
| 144 | - | |
| 145 | - 1.4. Compliance | |
| 146 | - | |
| 147 | - Unless otherwise indicated below, a compliant decompressor must be | |
| 148 | - able to accept and decompress any file that conforms to all the | |
| 149 | - specifications presented here; a compliant compressor must produce | |
| 150 | - files that conform to all the specifications presented here. The | |
| 151 | - material in the appendices is not part of the specification per se | |
| 152 | - and is not relevant to compliance. | |
| 153 | - | |
| 154 | - 1.5. Definitions of terms and conventions used | |
| 155 | - | |
| 156 | - byte: 8 bits stored or transmitted as a unit (same as an octet). | |
| 157 | - (For this specification, a byte is exactly 8 bits, even on | |
| 158 | - machines which store a character on a number of bits different | |
| 159 | - from 8.) See below for the numbering of bits within a byte. | |
| 160 | - | |
| 161 | - 1.6. Changes from previous versions | |
| 162 | - | |
| 163 | - There have been no technical changes to the gzip format since | |
| 164 | - version 4.1 of this specification. In version 4.2, some | |
| 165 | - terminology was changed, and the sample CRC code was rewritten for | |
| 166 | - clarity and to eliminate the requirement for the caller to do pre- | |
| 167 | - and post-conditioning. Version 4.3 is a conversion of the | |
| 168 | - specification to RFC style. | |
| 169 | - | |
| 170 | - | |
| 171 | - | |
| 172 | -Deutsch Informational [Page 3] | |
| 173 | - | |
| 174 | - | |
| 175 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 176 | - | |
| 177 | - | |
| 178 | -2. Detailed specification | |
| 179 | - | |
| 180 | - 2.1. Overall conventions | |
| 181 | - | |
| 182 | - In the diagrams below, a box like this: | |
| 183 | - | |
| 184 | - +---+ | |
| 185 | - | | <-- the vertical bars might be missing | |
| 186 | - +---+ | |
| 187 | - | |
| 188 | - represents one byte; a box like this: | |
| 189 | - | |
| 190 | - +==============+ | |
| 191 | - | | | |
| 192 | - +==============+ | |
| 193 | - | |
| 194 | - represents a variable number of bytes. | |
| 195 | - | |
| 196 | - Bytes stored within a computer do not have a "bit order", since | |
| 197 | - they are always treated as a unit. However, a byte considered as | |
| 198 | - an integer between 0 and 255 does have a most- and least- | |
| 199 | - significant bit, and since we write numbers with the most- | |
| 200 | - significant digit on the left, we also write bytes with the most- | |
| 201 | - significant bit on the left. In the diagrams below, we number the | |
| 202 | - bits of a byte so that bit 0 is the least-significant bit, i.e., | |
| 203 | - the bits are numbered: | |
| 204 | - | |
| 205 | - +--------+ | |
| 206 | - |76543210| | |
| 207 | - +--------+ | |
| 208 | - | |
| 209 | - This document does not address the issue of the order in which | |
| 210 | - bits of a byte are transmitted on a bit-sequential medium, since | |
| 211 | - the data format described here is byte- rather than bit-oriented. | |
| 212 | - | |
| 213 | - Within a computer, a number may occupy multiple bytes. All | |
| 214 | - multi-byte numbers in the format described here are stored with | |
| 215 | - the least-significant byte first (at the lower memory address). | |
| 216 | - For example, the decimal number 520 is stored as: | |
| 217 | - | |
| 218 | - 0 1 | |
| 219 | - +--------+--------+ | |
| 220 | - |00001000|00000010| | |
| 221 | - +--------+--------+ | |
| 222 | - ^ ^ | |
| 223 | - | | | |
| 224 | - | + more significant byte = 2 x 256 | |
| 225 | - + less significant byte = 8 | |
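The byte-order rule above can be sketched in C as follows. This is a minimal illustration, not part of the deleted file; the helper names `read_le16` and `read_le32` are hypothetical.

```c
#include <stdint.h>

/* Read a 16-bit value stored least-significant byte first. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit value stored least-significant byte first. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Applied to the two bytes in the diagram (0x08 followed by 0x02), `read_le16` yields 8 + 2 x 256 = 520.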
| 226 | - | |
| 227 | - | |
| 228 | - | |
| 229 | -Deutsch Informational [Page 4] | |
| 230 | - | |
| 231 | - | |
| 232 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 233 | - | |
| 234 | - | |
| 235 | - 2.2. File format | |
| 236 | - | |
| 237 | - A gzip file consists of a series of "members" (compressed data | |
| 238 | - sets). The format of each member is specified in the following | |
| 239 | - section. The members simply appear one after another in the file, | |
| 240 | - with no additional information before, between, or after them. | |
| 241 | - | |
| 242 | - 2.3. Member format | |
| 243 | - | |
| 244 | - Each member has the following structure: | |
| 245 | - | |
| 246 | - +---+---+---+---+---+---+---+---+---+---+ | |
| 247 | - |ID1|ID2|CM |FLG| MTIME |XFL|OS | (more-->) | |
| 248 | - +---+---+---+---+---+---+---+---+---+---+ | |
| 249 | - | |
| 250 | - (if FLG.FEXTRA set) | |
| 251 | - | |
| 252 | - +---+---+=================================+ | |
| 253 | - | XLEN |...XLEN bytes of "extra field"...| (more-->) | |
| 254 | - +---+---+=================================+ | |
| 255 | - | |
| 256 | - (if FLG.FNAME set) | |
| 257 | - | |
| 258 | - +=========================================+ | |
| 259 | - |...original file name, zero-terminated...| (more-->) | |
| 260 | - +=========================================+ | |
| 261 | - | |
| 262 | - (if FLG.FCOMMENT set) | |
| 263 | - | |
| 264 | - +===================================+ | |
| 265 | - |...file comment, zero-terminated...| (more-->) | |
| 266 | - +===================================+ | |
| 267 | - | |
| 268 | - (if FLG.FHCRC set) | |
| 269 | - | |
| 270 | - +---+---+ | |
| 271 | - | CRC16 | | |
| 272 | - +---+---+ | |
| 273 | - | |
| 274 | - +=======================+ | |
| 275 | - |...compressed blocks...| (more-->) | |
| 276 | - +=======================+ | |
| 277 | - | |
| 278 | - 0 1 2 3 4 5 6 7 | |
| 279 | - +---+---+---+---+---+---+---+---+ | |
| 280 | - | CRC32 | ISIZE | | |
| 281 | - +---+---+---+---+---+---+---+---+ | |
| 282 | - | |
| 283 | - | |
| 284 | - | |
| 285 | - | |
| 286 | -Deutsch Informational [Page 5] | |
| 287 | - | |
| 288 | - | |
| 289 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 290 | - | |
| 291 | - | |
| 292 | - 2.3.1. Member header and trailer | |
| 293 | - | |
| 294 | - ID1 (IDentification 1) | |
| 295 | - ID2 (IDentification 2) | |
| 296 | - These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139 | |
| 297 | - (0x8b, \213), to identify the file as being in gzip format. | |
| 298 | - | |
| 299 | - CM (Compression Method) | |
| 300 | - This identifies the compression method used in the file. CM | |
| 301 | - = 0-7 are reserved. CM = 8 denotes the "deflate" | |
| 302 | - compression method, which is the one customarily used by | |
| 303 | - gzip and which is documented elsewhere. | |
| 304 | - | |
| 305 | - FLG (FLaGs) | |
| 306 | - This flag byte is divided into individual bits as follows: | |
| 307 | - | |
| 308 | - bit 0 FTEXT | |
| 309 | - bit 1 FHCRC | |
| 310 | - bit 2 FEXTRA | |
| 311 | - bit 3 FNAME | |
| 312 | - bit 4 FCOMMENT | |
| 313 | - bit 5 reserved | |
| 314 | - bit 6 reserved | |
| 315 | - bit 7 reserved | |
| 316 | - | |
| 317 | - If FTEXT is set, the file is probably ASCII text. This is | |
| 318 | - an optional indication, which the compressor may set by | |
| 319 | - checking a small amount of the input data to see whether any | |
| 320 | - non-ASCII characters are present. In case of doubt, FTEXT | |
| 321 | - is cleared, indicating binary data. For systems which have | |
| 322 | - different file formats for ascii text and binary data, the | |
| 323 | - decompressor can use FTEXT to choose the appropriate format. | |
| 324 | - We deliberately do not specify the algorithm used to set | |
| 325 | - this bit, since a compressor always has the option of | |
| 326 | - leaving it cleared and a decompressor always has the option | |
| 327 | - of ignoring it and letting some other program handle issues | |
| 328 | - of data conversion. | |
| 329 | - | |
| 330 | - If FHCRC is set, a CRC16 for the gzip header is present, | |
| 331 | - immediately before the compressed data. The CRC16 consists | |
| 332 | - of the two least significant bytes of the CRC32 for all | |
| 333 | - bytes of the gzip header up to and not including the CRC16. | |
| 334 | - [The FHCRC bit was never set by versions of gzip up to | |
| 335 | - 1.2.4, even though it was documented with a different | |
| 336 | - meaning in gzip 1.2.4.] | |
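As a rough sketch of the FHCRC rule above, the header CRC16 can be taken as the low 16 bits of a CRC-32 computed over the header bytes that precede it. The function names `crc32_bitwise` and `header_crc16` are hypothetical, and the bit-at-a-time loop is a table-free variant chosen for brevity rather than the table-driven code in the appendix.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32 using the reflected polynomial 0xedb88320.
   Pre- and post-conditioning (one's complement) happen inside. */
static uint32_t crc32_bitwise(const unsigned char *buf, size_t len)
{
    uint32_t c = 0xffffffffu;
    size_t i;
    int k;

    for (i = 0; i < len; i++) {
        c ^= buf[i];
        for (k = 0; k < 8; k++) {
            c = (c & 1) ? (0xedb88320u ^ (c >> 1)) : (c >> 1);
        }
    }
    return c ^ 0xffffffffu;
}

/* CRC16 for the header: the two least significant bytes of the CRC-32
   of all header bytes up to, but not including, the CRC16 field.
   `len` is assumed to be the header length without the CRC16. */
static uint16_t header_crc16(const unsigned char *hdr, size_t len)
{
    return (uint16_t)(crc32_bitwise(hdr, len) & 0xffffu);
}
```

The CRC-32 of the nine ASCII bytes "123456789" is the commonly quoted check value 0xCBF43926, so the corresponding 16-bit value is 0x3926.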
| 337 | - | |
| 338 | - If FEXTRA is set, optional extra fields are present, as | |
| 339 | - described in a following section. | |
| 340 | - | |
| 341 | - | |
| 342 | - | |
| 343 | -Deutsch Informational [Page 6] | |
| 344 | - | |
| 345 | - | |
| 346 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 347 | - | |
| 348 | - | |
| 349 | - If FNAME is set, an original file name is present, | |
| 350 | - terminated by a zero byte. The name must consist of ISO | |
| 351 | - 8859-1 (LATIN-1) characters; on operating systems using | |
| 352 | - EBCDIC or any other character set for file names, the name | |
| 353 | - must be translated to the ISO LATIN-1 character set. This | |
| 354 | - is the original name of the file being compressed, with any | |
| 355 | - directory components removed, and, if the file being | |
| 356 | - compressed is on a file system with case insensitive names, | |
| 357 | - forced to lower case. There is no original file name if the | |
| 358 | - data was compressed from a source other than a named file; | |
| 359 | - for example, if the source was stdin on a Unix system, there | |
| 360 | - is no file name. | |
| 361 | - | |
| 362 | - If FCOMMENT is set, a zero-terminated file comment is | |
| 363 | - present. This comment is not interpreted; it is only | |
| 364 | - intended for human consumption. The comment must consist of | |
| 365 | - ISO 8859-1 (LATIN-1) characters. Line breaks should be | |
| 366 | - denoted by a single line feed character (10 decimal). | |
| 367 | - | |
| 368 | - Reserved FLG bits must be zero. | |
| 369 | - | |
| 370 | - MTIME (Modification TIME) | |
| 371 | - This gives the most recent modification time of the original | |
| 372 | - file being compressed. The time is in Unix format, i.e., | |
| 373 | - seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this | |
| 374 | - may cause problems for MS-DOS and other systems that use | |
| 375 | - local rather than Universal time.) If the compressed data | |
| 376 | - did not come from a file, MTIME is set to the time at which | |
| 377 | - compression started. MTIME = 0 means no time stamp is | |
| 378 | - available. | |
| 379 | - | |
| 380 | - XFL (eXtra FLags) | |
| 381 | - These flags are available for use by specific compression | |
| 382 | - methods. The "deflate" method (CM = 8) sets these flags as | |
| 383 | - follows: | |
| 384 | - | |
| 385 | - XFL = 2 - compressor used maximum compression, | |
| 386 | - slowest algorithm | |
| 387 | - XFL = 4 - compressor used fastest algorithm | |
| 388 | - | |
| 389 | - OS (Operating System) | |
| 390 | - This identifies the type of file system on which compression | |
| 391 | - took place. This may be useful in determining end-of-line | |
| 392 | - convention for text files. The currently defined values are | |
| 393 | - as follows: | |
| 394 | - | |
| 395 | - | |
| 396 | - | |
| 397 | - | |
| 398 | - | |
| 399 | - | |
| 400 | -Deutsch Informational [Page 7] | |
| 401 | - | |
| 402 | - | |
| 403 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 404 | - | |
| 405 | - | |
| 406 | - 0 - FAT filesystem (MS-DOS, OS/2, NT/Win32) | |
| 407 | - 1 - Amiga | |
| 408 | - 2 - VMS (or OpenVMS) | |
| 409 | - 3 - Unix | |
| 410 | - 4 - VM/CMS | |
| 411 | - 5 - Atari TOS | |
| 412 | - 6 - HPFS filesystem (OS/2, NT) | |
| 413 | - 7 - Macintosh | |
| 414 | - 8 - Z-System | |
| 415 | - 9 - CP/M | |
| 416 | - 10 - TOPS-20 | |
| 417 | - 11 - NTFS filesystem (NT) | |
| 418 | - 12 - QDOS | |
| 419 | - 13 - Acorn RISCOS | |
| 420 | - 255 - unknown | |
| 421 | - | |
| 422 | - XLEN (eXtra LENgth) | |
| 423 | - If FLG.FEXTRA is set, this gives the length of the optional | |
| 424 | - extra field. See below for details. | |
| 425 | - | |
| 426 | - CRC32 (CRC-32) | |
| 427 | - This contains a Cyclic Redundancy Check value of the | |
| 428 | - uncompressed data computed according to CRC-32 algorithm | |
| 429 | - used in the ISO 3309 standard and in section 8.1.1.6.2 of | |
| 430 | - ITU-T recommendation V.42. (See http://www.iso.ch for | |
| 431 | - ordering ISO documents. See gopher://info.itu.ch for an | |
| 432 | - online version of ITU-T V.42.) | |
| 433 | - | |
| 434 | - ISIZE (Input SIZE) | |
| 435 | - This contains the size of the original (uncompressed) input | |
| 436 | - data modulo 2^32. | |
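One way a decompressor might check the 8-byte trailer is sketched below. It assumes the caller has already computed the CRC-32 and the byte count of the decompressed output; `trailer_matches` and `le32` are hypothetical names.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 if the trailer (CRC32 followed by ISIZE) agrees with the
   values computed while decompressing, 0 otherwise. ISIZE holds the
   uncompressed length modulo 2^32, so only the low 32 bits of the
   local byte count are compared. */
static int trailer_matches(const unsigned char trailer[8],
                           uint32_t computed_crc,
                           uint64_t uncompressed_len)
{
    uint32_t crc32 = le32(trailer);
    uint32_t isize = le32(trailer + 4);

    return crc32 == computed_crc
        && isize == (uint32_t)(uncompressed_len & 0xffffffffu);
}
```

Because ISIZE wraps at 2^32, it cannot by itself distinguish a 9-byte input from one that is 9 + 2^32 bytes long; the CRC32 comparison carries the real integrity check.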
| 437 | - | |
| 438 | - 2.3.1.1. Extra field | |
| 439 | - | |
| 440 | - If the FLG.FEXTRA bit is set, an "extra field" is present in | |
| 441 | - the header, with total length XLEN bytes. It consists of a | |
| 442 | - series of subfields, each of the form: | |
| 443 | - | |
| 444 | - +---+---+---+---+==================================+ | |
| 445 | - |SI1|SI2| LEN |... LEN bytes of subfield data ...| | |
| 446 | - +---+---+---+---+==================================+ | |
| 447 | - | |
| 448 | - SI1 and SI2 provide a subfield ID, typically two ASCII letters | |
| 449 | - with some mnemonic value. Jean-Loup Gailly | |
| 450 | - <gzip@prep.ai.mit.edu> is maintaining a registry of subfield | |
| 451 | - IDs; please send him any subfield ID you wish to use. Subfield | |
| 452 | - IDs with SI2 = 0 are reserved for future use. The following | |
| 453 | - IDs are currently defined: | |
| 454 | - | |
| 455 | - | |
| 456 | - | |
| 457 | -Deutsch Informational [Page 8] | |
| 458 | - | |
| 459 | - | |
| 460 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 461 | - | |
| 462 | - | |
| 463 | - SI1 SI2 Data | |
| 464 | - ---------- ---------- ---- | |
| 465 | - 0x41 ('A') 0x70 ('P') Apollo file type information | |
| 466 | - | |
| 467 | - LEN gives the length of the subfield data, excluding the 4 | |
| 468 | - initial bytes. | |
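The subfield layout above could be walked roughly as follows. This is a sketch under assumptions: `walk_extra` and the callback type `subfield_fn` are hypothetical, and treating a truncated subfield as an error is a design choice the text does not mandate.

```c
#include <stddef.h>

/* Hypothetical callback invoked once per subfield. */
typedef void (*subfield_fn)(unsigned char si1, unsigned char si2,
                            const unsigned char *data, size_t len,
                            void *ctx);

/* Walk the XLEN bytes of the extra field. Returns the number of
   subfields seen, or -1 if a subfield runs past the end. */
static int walk_extra(const unsigned char *extra, size_t xlen,
                      subfield_fn fn, void *ctx)
{
    size_t pos = 0;
    int count = 0;

    while (pos < xlen) {
        size_t len;

        if (xlen - pos < 4)
            return -1;              /* no room for SI1, SI2, LEN */
        len = (size_t)extra[pos + 2] | ((size_t)extra[pos + 3] << 8);
        if (len > xlen - pos - 4)
            return -1;              /* data would overrun XLEN */
        if (fn)
            fn(extra[pos], extra[pos + 1], extra + pos + 4, len, ctx);
        pos += 4 + len;             /* LEN excludes the 4 initial bytes */
        count++;
    }
    return count;
}
```

Passing a null callback simply counts the subfields, which is enough for a decompressor that only needs to validate the field before skipping it.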
| 469 | - | |
| 470 | - 2.3.1.2. Compliance | |
| 471 | - | |
| 472 | - A compliant compressor must produce files with correct ID1, | |
| 473 | - ID2, CM, CRC32, and ISIZE, but may set all the other fields in | |
| 474 | - the fixed-length part of the header to default values (255 for | |
| 475 | - OS, 0 for all others). The compressor must set all reserved | |
| 476 | - bits to zero. | |
| 477 | - | |
| 478 | - A compliant decompressor must check ID1, ID2, and CM, and | |
| 479 | - provide an error indication if any of these have incorrect | |
| 480 | - values. It must examine FEXTRA/XLEN, FNAME, FCOMMENT and FHCRC | |
| 481 | - at least so it can skip over the optional fields if they are | |
| 482 | - present. It need not examine any other part of the header or | |
| 483 | - trailer; in particular, a decompressor may ignore FTEXT and OS | |
| 484 | - and always produce binary output, and still be compliant. A | |
| 485 | - compliant decompressor must give an error indication if any | |
| 486 | - reserved bit is non-zero, since such a bit could indicate the | |
| 487 | - presence of a new field that would cause subsequent data to be | |
| 488 | - interpreted incorrectly. | |
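The decompressor requirements above can be sketched as a single header scan. The function `gzip_header_len` and the enum names are hypothetical, and accepting only CM = 8 is an assumption based on "deflate" being the one method the text defines.

```c
#include <stddef.h>

enum {
    FTEXT = 0x01, FHCRC = 0x02, FEXTRA = 0x04,
    FNAME = 0x08, FCOMMENT = 0x10, FRESERVED = 0xe0
};

/* Skip a zero-terminated string starting at pos; return the offset
   just past the terminator, or 0 if no terminator is found. */
static size_t skip_zstring(const unsigned char *buf, size_t n, size_t pos)
{
    while (pos < n && buf[pos] != 0)
        pos++;
    return pos < n ? pos + 1 : 0;
}

/* Return the offset of the first compressed byte, or 0 if the header
   is invalid or truncated. A valid header is at least 10 bytes long,
   so 0 is never a legitimate result. */
static size_t gzip_header_len(const unsigned char *buf, size_t n)
{
    size_t pos = 10;
    unsigned flg;

    if (n < 10 || buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8)
        return 0;                   /* bad ID1, ID2, or CM */
    flg = buf[3];
    if (flg & FRESERVED)
        return 0;                   /* reserved bits must be zero */
    if (flg & FEXTRA) {
        size_t xlen;
        if (n - pos < 2)
            return 0;
        xlen = (size_t)buf[pos] | ((size_t)buf[pos + 1] << 8);
        pos += 2;
        if (n - pos < xlen)
            return 0;
        pos += xlen;
    }
    if ((flg & FNAME) && (pos = skip_zstring(buf, n, pos)) == 0)
        return 0;
    if ((flg & FCOMMENT) && (pos = skip_zstring(buf, n, pos)) == 0)
        return 0;
    if (flg & FHCRC) {
        if (n - pos < 2)
            return 0;
        pos += 2;                   /* CRC16 is skipped, not verified */
    }
    return pos;
}
```

The optional fields are skipped in the order they appear in the member diagram: extra field, file name, comment, then header CRC16. FTEXT, MTIME, XFL, and OS are ignored, which the compliance text permits.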
| 489 | - | |
| 490 | -3. References | |
| 491 | - | |
| 492 | - [1] "Information Processing - 8-bit single-byte coded graphic | |
| 493 | - character sets - Part 1: Latin alphabet No.1" (ISO 8859-1:1987). | |
| 494 | - The ISO 8859-1 (Latin-1) character set is a superset of 7-bit | |
| 495 | - ASCII. Files defining this character set are available as | |
| 496 | - iso_8859-1.* in ftp://ftp.uu.net/graphics/png/documents/ | |
| 497 | - | |
| 498 | - [2] ISO 3309 | |
| 499 | - | |
| 500 | - [3] ITU-T recommendation V.42 | |
| 501 | - | |
| 502 | - [4] Deutsch, L.P.,"DEFLATE Compressed Data Format Specification", | |
| 503 | - available in ftp://ftp.uu.net/pub/archiving/zip/doc/ | |
| 504 | - | |
| 505 | - [5] Gailly, J.-L., GZIP documentation, available as gzip-*.tar in | |
| 506 | - ftp://prep.ai.mit.edu/pub/gnu/ | |
| 507 | - | |
| 508 | - [6] Sarwate, D.V., "Computation of Cyclic Redundancy Checks via Table | |
| 509 | - Look-Up", Communications of the ACM, 31(8), pp.1008-1013. | |
| 510 | - | |
| 511 | - | |
| 512 | - | |
| 513 | - | |
| 514 | -Deutsch Informational [Page 9] | |
| 515 | - | |
| 516 | - | |
| 517 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 518 | - | |
| 519 | - | |
| 520 | - [7] Schwaderer, W.D., "CRC Calculation", April 85 PC Tech Journal, | |
| 521 | - pp.118-133. | |
| 522 | - | |
| 523 | - [8] ftp://ftp.adelaide.edu.au/pub/rocksoft/papers/crc_v3.txt, | |
| 524 | - describing the CRC concept. | |
| 525 | - | |
| 526 | -4. Security Considerations | |
| 527 | - | |
| 528 | - Any data compression method involves the reduction of redundancy in | |
| 529 | - the data. Consequently, any corruption of the data is likely to have | |
| 530 | - severe effects and be difficult to correct. Uncompressed text, on | |
| 531 | - the other hand, will probably still be readable despite the presence | |
| 532 | - of some corrupted bytes. | |
| 533 | - | |
| 534 | - It is recommended that systems using this data format provide some | |
| 535 | - means of validating the integrity of the compressed data, such as by | |
| 536 | - setting and checking the CRC-32 check value. | |
| 537 | - | |
| 538 | -5. Acknowledgements | |
| 539 | - | |
| 540 | - Trademarks cited in this document are the property of their | |
| 541 | - respective owners. | |
| 542 | - | |
| 543 | - Jean-Loup Gailly designed the gzip format and wrote, with Mark Adler, | |
| 544 | - the related software described in this specification. Glenn | |
| 545 | - Randers-Pehrson converted this document to RFC and HTML format. | |
| 546 | - | |
| 547 | -6. Author's Address | |
| 548 | - | |
| 549 | - L. Peter Deutsch | |
| 550 | - Aladdin Enterprises | |
| 551 | - 203 Santa Margarita Ave. | |
| 552 | - Menlo Park, CA 94025 | |
| 553 | - | |
| 554 | - Phone: (415) 322-0103 (AM only) | |
| 555 | - FAX: (415) 322-1734 | |
| 556 | - EMail: <ghost@aladdin.com> | |
| 557 | - | |
| 558 | - Questions about the technical content of this specification can be | |
| 559 | - sent by email to: | |
| 560 | - | |
| 561 | - Jean-Loup Gailly <gzip@prep.ai.mit.edu> and | |
| 562 | - Mark Adler <madler@alumni.caltech.edu> | |
| 563 | - | |
| 564 | - Editorial comments on this specification can be sent by email to: | |
| 565 | - | |
| 566 | - L. Peter Deutsch <ghost@aladdin.com> and | |
| 567 | - Glenn Randers-Pehrson <randeg@alumni.rpi.edu> | |
| 568 | - | |
| 569 | - | |
| 570 | - | |
| 571 | -Deutsch Informational [Page 10] | |
| 572 | - | |
| 573 | - | |
| 574 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 575 | - | |
| 576 | - | |
| 577 | -7. Appendix: Jean-Loup Gailly's gzip utility | |
| 578 | - | |
| 579 | - The most widely used implementation of gzip compression, and the | |
| 580 | - original documentation on which this specification is based, were | |
| 581 | - created by Jean-Loup Gailly <gzip@prep.ai.mit.edu>. Since this | |
| 582 | - implementation is a de facto standard, we mention some more of its | |
| 583 | - features here. Again, the material in this section is not part of | |
| 584 | - the specification per se, and implementations need not follow it to | |
| 585 | - be compliant. | |
| 586 | - | |
| 587 | - When compressing or decompressing a file, gzip preserves the | |
| 588 | - protection, ownership, and modification time attributes on the local | |
| 589 | - file system, since there is no provision for representing protection | |
| 590 | - attributes in the gzip file format itself. Since the file format | |
| 591 | - includes a modification time, the gzip decompressor provides a | |
| 592 | - command line switch that assigns the modification time from the file, | |
| 593 | - rather than the local modification time of the compressed input, to | |
| 594 | - the decompressed output. | |
| 595 | - | |
| 596 | -8. Appendix: Sample CRC Code | |
| 597 | - | |
| 598 | - The following sample code represents a practical implementation of | |
| 599 | - the CRC (Cyclic Redundancy Check). (See also ISO 3309 and ITU-T V.42 | |
| 600 | - for a formal specification.) | |
| 601 | - | |
| 602 | - The sample code is in the ANSI C programming language. Non C users | |
| 603 | - may find it easier to read with these hints: | |
| 604 | - | |
| 605 | - & Bitwise AND operator. | |
| 606 | - ^ Bitwise exclusive-OR operator. | |
| 607 | - >> Bitwise right shift operator. When applied to an | |
| 608 | - unsigned quantity, as here, right shift inserts zero | |
| 609 | - bit(s) at the left. | |
| 610 | - ! Logical NOT operator. | |
| 611 | - ++ "n++" increments the variable n. | |
| 612 | - 0xNNN 0x introduces a hexadecimal (base 16) constant. | |
| 613 | - Suffix L indicates a long value (at least 32 bits). | |
| 614 | - | |
| 615 | - /* Table of CRCs of all 8-bit messages. */ | |
| 616 | - unsigned long crc_table[256]; | |
| 617 | - | |
| 618 | - /* Flag: has the table been computed? Initially false. */ | |
| 619 | - int crc_table_computed = 0; | |
| 620 | - | |
| 621 | - /* Make the table for a fast CRC. */ | |
| 622 | - void make_crc_table(void) | |
| 623 | - { | |
| 624 | - unsigned long c; | |
| 625 | - | |
| 626 | - | |
| 627 | - | |
| 628 | -Deutsch Informational [Page 11] | |
| 629 | - | |
| 630 | - | |
| 631 | -RFC 1952 GZIP File Format Specification May 1996 | |
| 632 | - | |
| 633 | - | |
| 634 | - int n, k; | |
| 635 | - for (n = 0; n < 256; n++) { | |
| 636 | - c = (unsigned long) n; | |
| 637 | - for (k = 0; k < 8; k++) { | |
| 638 | - if (c & 1) { | |
| 639 | - c = 0xedb88320L ^ (c >> 1); | |
| 640 | - } else { | |
| 641 | - c = c >> 1; | |
| 642 | - } | |
| 643 | - } | |
| 644 | - crc_table[n] = c; | |
| 645 | - } | |
| 646 | - crc_table_computed = 1; | |
| 647 | - } | |
| 648 | - | |
| 649 | - /* | |
| 650 | - Update a running crc with the bytes buf[0..len-1] and return | |
| 651 | - the updated crc. The crc should be initialized to zero. Pre- and | |
| 652 | - post-conditioning (one's complement) is performed within this | |
| 653 | - function so it shouldn't be done by the caller. Usage example: | |
| 654 | - | |
| 655 | - unsigned long crc = 0L; | |
| 656 | - | |
| 657 | - while (read_buffer(buffer, length) != EOF) { | |
| 658 | - crc = update_crc(crc, buffer, length); | |
| 659 | - } | |
| 660 | - if (crc != original_crc) error(); | |
| 661 | - */ | |
| 662 | - unsigned long update_crc(unsigned long crc, | |
| 663 | - unsigned char *buf, int len) | |
| 664 | - { | |
| 665 | - unsigned long c = crc ^ 0xffffffffL; | |
| 666 | - int n; | |
| 667 | - | |
| 668 | - if (!crc_table_computed) | |
| 669 | - make_crc_table(); | |
| 670 | - for (n = 0; n < len; n++) { | |
| 671 | - c = crc_table[(c ^ buf[n]) & 0xff] ^ (c >> 8); | |
| 672 | - } | |
| 673 | - return c ^ 0xffffffffL; | |
| 674 | - } | |
| 675 | - | |
| 676 | - /* Return the CRC of the bytes buf[0..len-1]. */ | |
| 677 | - unsigned long crc(unsigned char *buf, int len) | |
| 678 | - { | |
| 679 | - return update_crc(0L, buf, len); | |
| 680 | - } | |
| 681 | - | |
| 682 | - | |
| 683 | - | |
| 684 | - | |
| 685 | -Deutsch Informational [Page 12] | |
| 686 | - | |
| 687 | - |
| --- a/compat/zlib/doc/rfc1952.txt | |
| +++ b/compat/zlib/doc/rfc1952.txt | |
| @@ -1,687 +0,0 @@ | |
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | Network Working Group P. Deutsch |
| 8 | Request for Comments: 1952 Aladdin Enterprises |
| 9 | Category: Informational May 1996 |
| 10 | |
| 11 | |
| 12 | GZIP file format specification version 4.3 |
| 13 | |
| 14 | Status of This Memo |
| 15 | |
| 16 | This memo provides information for the Internet community. This memo |
| 17 | does not specify an Internet standard of any kind. Distribution of |
| 18 | this memo is unlimited. |
| 19 | |
| 20 | IESG Note: |
| 21 | |
| 22 | The IESG takes no position on the validity of any Intellectual |
| 23 | Property Rights statements contained in this document. |
| 24 | |
| 25 | Notices |
| 26 | |
| 27 | Copyright (c) 1996 L. Peter Deutsch |
| 28 | |
| 29 | Permission is granted to copy and distribute this document for any |
| 30 | purpose and without charge, including translations into other |
| 31 | languages and incorporation into compilations, provided that the |
| 32 | copyright notice and this notice are preserved, and that any |
| 33 | substantive changes or deletions from the original are clearly |
| 34 | marked. |
| 35 | |
| 36 | A pointer to the latest version of this and related documentation in |
| 37 | HTML format can be found at the URL |
| 38 | <ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html>. |
| 39 | |
| 40 | Abstract |
| 41 | |
| 42 | This specification defines a lossless compressed data format that is |
| 43 | compatible with the widely used GZIP utility. The format includes a |
| 44 | cyclic redundancy check value for detecting data corruption. The |
| 45 | format presently uses the DEFLATE method of compression but can be |
| 46 | easily extended to use other compression methods. The format can be |
| 47 | implemented readily in a manner not covered by patents. |
| 48 | |
| 49 | |
| 50 | |
| 51 | |
| 52 | |
| 53 | |
| 54 | |
| 55 | |
| 56 | |
| 57 | |
| 58 | Deutsch Informational [Page 1] |
| 59 | |
| 60 | |
| 61 | RFC 1952 GZIP File Format Specification May 1996 |
| 62 | |
| 63 | |
| 64 | Table of Contents |
| 65 | |
| 66 | 1. Introduction ................................................... 2 |
| 67 | 1.1. Purpose ................................................... 2 |
| 68 | 1.2. Intended audience ......................................... 3 |
| 69 | 1.3. Scope ..................................................... 3 |
| 70 | 1.4. Compliance ................................................ 3 |
| 71 | 1.5. Definitions of terms and conventions used ................. 3 |
| 72 | 1.6. Changes from previous versions ............................ 3 |
| 73 | 2. Detailed specification ......................................... 4 |
| 74 | 2.1. Overall conventions ....................................... 4 |
| 75 | 2.2. File format ............................................... 5 |
| 76 | 2.3. Member format ............................................. 5 |
| 77 | 2.3.1. Member header and trailer ........................... 6 |
| 78 | 2.3.1.1. Extra field ................................... 8 |
| 79 | 2.3.1.2. Compliance .................................... 9 |
| 80 | 3. References .................................................. 9 |
| 81 | 4. Security Considerations .................................... 10 |
| 82 | 5. Acknowledgements ........................................... 10 |
| 83 | 6. Author's Address ........................................... 10 |
| 84 | 7. Appendix: Jean-Loup Gailly's gzip utility .................. 11 |
| 85 | 8. Appendix: Sample CRC Code .................................. 11 |
| 86 | |
| 87 | 1. Introduction |
| 88 | |
| 89 | 1.1. Purpose |
| 90 | |
| 91 | The purpose of this specification is to define a lossless |
| 92 | compressed data format that: |
| 93 | |
| 94 | * Is independent of CPU type, operating system, file system, |
| 95 | and character set, and hence can be used for interchange; |
| 96 | * Can compress or decompress a data stream (as opposed to a |
| 97 | randomly accessible file) to produce another data stream, |
| 98 | using only an a priori bounded amount of intermediate |
| 99 | storage, and hence can be used in data communications or |
| 100 | similar structures such as Unix filters; |
| 101 | * Compresses data with efficiency comparable to the best |
| 102 | currently available general-purpose compression methods, |
| 103 | and in particular considerably better than the "compress" |
| 104 | program; |
| 105 | * Can be implemented readily in a manner not covered by |
| 106 | patents, and hence can be practiced freely; |
| 107 | * Is compatible with the file format produced by the current |
| 108 | widely used gzip utility, in that conforming decompressors |
| 109 | will be able to read data produced by the existing gzip |
| 110 | compressor. |
| 111 | |
| 112 | |
| 113 | |
| 114 | |
| 115 | Deutsch Informational [Page 2] |
| 116 | |
| 117 | |
| 118 | RFC 1952 GZIP File Format Specification May 1996 |
| 119 | |
| 120 | |
| 121 | The data format defined by this specification does not attempt to: |
| 122 | |
| 123 | * Provide random access to compressed data; |
| 124 | * Compress specialized data (e.g., raster graphics) as well as |
| 125 | the best currently available specialized algorithms. |
| 126 | |
| 127 | 1.2. Intended audience |
| 128 | |
| 129 | This specification is intended for use by implementors of software |
| 130 | to compress data into gzip format and/or decompress data from gzip |
| 131 | format. |
| 132 | |
| 133 | The text of the specification assumes a basic background in |
| 134 | programming at the level of bits and other primitive data |
| 135 | representations. |
| 136 | |
| 137 | 1.3. Scope |
| 138 | |
| 139 | The specification specifies a compression method and a file format |
| 140 | (the latter assuming only that a file can store a sequence of |
| 141 | arbitrary bytes). It does not specify any particular interface to |
| 142 | a file system or anything about character sets or encodings |
| 143 | (except for file names and comments, which are optional). |
| 144 | |
| 145 | 1.4. Compliance |
| 146 | |
| 147 | Unless otherwise indicated below, a compliant decompressor must be |
| 148 | able to accept and decompress any file that conforms to all the |
| 149 | specifications presented here; a compliant compressor must produce |
| 150 | files that conform to all the specifications presented here. The |
| 151 | material in the appendices is not part of the specification per se |
| 152 | and is not relevant to compliance. |
| 153 | |
| 154 | 1.5. Definitions of terms and conventions used |
| 155 | |
| 156 | byte: 8 bits stored or transmitted as a unit (same as an octet). |
| 157 | (For this specification, a byte is exactly 8 bits, even on |
| 158 | machines which store a character on a number of bits different |
| 159 | from 8.) See below for the numbering of bits within a byte. |
| 160 | |
| 161 | 1.6. Changes from previous versions |
| 162 | |
| 163 | There have been no technical changes to the gzip format since |
| 164 | version 4.1 of this specification. In version 4.2, some |
| 165 | terminology was changed, and the sample CRC code was rewritten for |
| 166 | clarity and to eliminate the requirement for the caller to do pre- |
| 167 | and post-conditioning. Version 4.3 is a conversion of the |
| 168 | specification to RFC style. |
| 169 | |
| 170 | |
| 171 | |
| 172 | Deutsch Informational [Page 3] |
| 173 | |
| 174 | |
| 175 | RFC 1952 GZIP File Format Specification May 1996 |
| 176 | |
| 177 | |
| 178 | 2. Detailed specification |
| 179 | |
| 180 | 2.1. Overall conventions |
| 181 | |
| 182 | In the diagrams below, a box like this: |
| 183 | |
| 184 | +---+ |
| 185 | | | <-- the vertical bars might be missing |
| 186 | +---+ |
| 187 | |
| 188 | represents one byte; a box like this: |
| 189 | |
| 190 | +==============+ |
| 191 | | | |
| 192 | +==============+ |
| 193 | |
| 194 | represents a variable number of bytes. |
| 195 | |
| 196 | Bytes stored within a computer do not have a "bit order", since |
| 197 | they are always treated as a unit. However, a byte considered as |
| 198 | an integer between 0 and 255 does have a most- and least- |
| 199 | significant bit, and since we write numbers with the most- |
| 200 | significant digit on the left, we also write bytes with the most- |
| 201 | significant bit on the left. In the diagrams below, we number the |
| 202 | bits of a byte so that bit 0 is the least-significant bit, i.e., |
| 203 | the bits are numbered: |
| 204 | |
| 205 | +--------+ |
| 206 | |76543210| |
| 207 | +--------+ |
| 208 | |
| 209 | This document does not address the issue of the order in which |
| 210 | bits of a byte are transmitted on a bit-sequential medium, since |
| 211 | the data format described here is byte- rather than bit-oriented. |
| 212 | |
| 213 | Within a computer, a number may occupy multiple bytes. All |
| 214 | multi-byte numbers in the format described here are stored with |
| 215 | the least-significant byte first (at the lower memory address). |
| 216 | For example, the decimal number 520 is stored as: |
| 217 | |
| 218 | 0 1 |
| 219 | +--------+--------+ |
| 220 | |00001000|00000010| |
| 221 | +--------+--------+ |
| 222 | ^ ^ |
| 223 | | | |
| 224 | | + more significant byte = 2 x 256 |
| 225 | + less significant byte = 8 |
| 226 | |
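The byte-order rule in section 2.1 can be illustrated with a small C sketch. The helper names `read_le16` and `read_le32` are hypothetical (they do not appear in the RFC or in zlib); they only show how a multi-byte field such as XLEN, MTIME, CRC32, or ISIZE might be assembled from bytes stored least-significant byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit value stored least-significant byte first. */
uint16_t read_le16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit value stored least-significant byte first. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With the two bytes from the diagram above, 0x08 followed by 0x02, `read_le16` yields 8 + 2 x 256 = 520. Assembling the value with shifts, rather than casting the buffer to an integer pointer, keeps the result independent of the host's own byte order and alignment rules.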
| 227 | |
| 228 | |
| 229 | Deutsch Informational [Page 4] |
| 230 | |
| 231 | |
| 232 | RFC 1952 GZIP File Format Specification May 1996 |
| 233 | |
| 234 | |
| 235 | 2.2. File format |
| 236 | |
| 237 | A gzip file consists of a series of "members" (compressed data |
| 238 | sets). The format of each member is specified in the following |
| 239 | section. The members simply appear one after another in the file, |
| 240 | with no additional information before, between, or after them. |
| 241 | |
| 242 | 2.3. Member format |
| 243 | |
| 244 | Each member has the following structure: |
| 245 | |
| 246 | +---+---+---+---+---+---+---+---+---+---+ |
| 247 | |ID1|ID2|CM |FLG| MTIME |XFL|OS | (more-->) |
| 248 | +---+---+---+---+---+---+---+---+---+---+ |
| 249 | |
| 250 | (if FLG.FEXTRA set) |
| 251 | |
| 252 | +---+---+=================================+ |
| 253 | | XLEN |...XLEN bytes of "extra field"...| (more-->) |
| 254 | +---+---+=================================+ |
| 255 | |
| 256 | (if FLG.FNAME set) |
| 257 | |
| 258 | +=========================================+ |
| 259 | |...original file name, zero-terminated...| (more-->) |
| 260 | +=========================================+ |
| 261 | |
| 262 | (if FLG.FCOMMENT set) |
| 263 | |
| 264 | +===================================+ |
| 265 | |...file comment, zero-terminated...| (more-->) |
| 266 | +===================================+ |
| 267 | |
| 268 | (if FLG.FHCRC set) |
| 269 | |
| 270 | +---+---+ |
| 271 | | CRC16 | |
| 272 | +---+---+ |
| 273 | |
| 274 | +=======================+ |
| 275 | |...compressed blocks...| (more-->) |
| 276 | +=======================+ |
| 277 | |
| 278 | 0 1 2 3 4 5 6 7 |
| 279 | +---+---+---+---+---+---+---+---+ |
| 280 | | CRC32 | ISIZE | |
| 281 | +---+---+---+---+---+---+---+---+ |
| 282 | |
| 283 | |
| 284 | |
| 285 | |
| 286 | Deutsch Informational [Page 5] |
| 287 | |
| 288 | |
| 289 | RFC 1952 GZIP File Format Specification May 1996 |
| 290 | |
| 291 | |
| 292 | 2.3.1. Member header and trailer |
| 293 | |
| 294 | ID1 (IDentification 1) |
| 295 | ID2 (IDentification 2) |
| 296 | These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139 |
| 297 | (0x8b, \213), to identify the file as being in gzip format. |
| 298 | |
| 299 | CM (Compression Method) |
| 300 | This identifies the compression method used in the file. CM |
| 301 | = 0-7 are reserved. CM = 8 denotes the "deflate" |
| 302 | compression method, which is the one customarily used by |
| 303 | gzip and which is documented elsewhere. |
| 304 | |
| 305 | FLG (FLaGs) |
| 306 | This flag byte is divided into individual bits as follows: |
| 307 | |
| 308 | bit 0 FTEXT |
| 309 | bit 1 FHCRC |
| 310 | bit 2 FEXTRA |
| 311 | bit 3 FNAME |
| 312 | bit 4 FCOMMENT |
| 313 | bit 5 reserved |
| 314 | bit 6 reserved |
| 315 | bit 7 reserved |
| 316 | |
| 317 | If FTEXT is set, the file is probably ASCII text. This is |
| 318 | an optional indication, which the compressor may set by |
| 319 | checking a small amount of the input data to see whether any |
| 320 | non-ASCII characters are present. In case of doubt, FTEXT |
| 321 | is cleared, indicating binary data. For systems which have |
| 322 | different file formats for ascii text and binary data, the |
| 323 | decompressor can use FTEXT to choose the appropriate format. |
| 324 | We deliberately do not specify the algorithm used to set |
| 325 | this bit, since a compressor always has the option of |
| 326 | leaving it cleared and a decompressor always has the option |
| 327 | of ignoring it and letting some other program handle issues |
| 328 | of data conversion. |
| 329 | |
| 330 | If FHCRC is set, a CRC16 for the gzip header is present, |
| 331 | immediately before the compressed data. The CRC16 consists |
| 332 | of the two least significant bytes of the CRC32 for all |
| 333 | bytes of the gzip header up to and not including the CRC16. |
| 334 | [The FHCRC bit was never set by versions of gzip up to |
| 335 | 1.2.4, even though it was documented with a different |
| 336 | meaning in gzip 1.2.4.] |
| 337 | |
| 338 | If FEXTRA is set, optional extra fields are present, as |
| 339 | described in a following section. |
| 340 | |
| 349 | If FNAME is set, an original file name is present, |
| 350 | terminated by a zero byte. The name must consist of ISO |
| 351 | 8859-1 (LATIN-1) characters; on operating systems using |
| 352 | EBCDIC or any other character set for file names, the name |
| 353 | must be translated to the ISO LATIN-1 character set. This |
| 354 | is the original name of the file being compressed, with any |
| 355 | directory components removed, and, if the file being |
| 356 | compressed is on a file system with case insensitive names, |
| 357 | forced to lower case. There is no original file name if the |
| 358 | data was compressed from a source other than a named file; |
| 359 | for example, if the source was stdin on a Unix system, there |
| 360 | is no file name. |
| 361 | |
| 362 | If FCOMMENT is set, a zero-terminated file comment is |
| 363 | present. This comment is not interpreted; it is only |
| 364 | intended for human consumption. The comment must consist of |
| 365 | ISO 8859-1 (LATIN-1) characters. Line breaks should be |
| 366 | denoted by a single line feed character (10 decimal). |
| 367 | |
| 368 | Reserved FLG bits must be zero. |
| 369 | |
| 370 | MTIME (Modification TIME) |
| 371 | This gives the most recent modification time of the original |
| 372 | file being compressed. The time is in Unix format, i.e., |
| 373 | seconds since 00:00:00 GMT, Jan. 1, 1970. (Note that this |
| 374 | may cause problems for MS-DOS and other systems that use |
| 375 | local rather than Universal time.) If the compressed data |
| 376 | did not come from a file, MTIME is set to the time at which |
| 377 | compression started. MTIME = 0 means no time stamp is |
| 378 | available. |
| 379 | |
| 380 | XFL (eXtra FLags) |
| 381 | These flags are available for use by specific compression |
| 382 | methods. The "deflate" method (CM = 8) sets these flags as |
| 383 | follows: |
| 384 | |
| 385 | XFL = 2 - compressor used maximum compression, |
| 386 | slowest algorithm |
| 387 | XFL = 4 - compressor used fastest algorithm |
| 388 | |
| 389 | OS (Operating System) |
| 390 | This identifies the type of file system on which compression |
| 391 | took place. This may be useful in determining end-of-line |
| 392 | convention for text files. The currently defined values are |
| 393 | as follows: |
| 394 | |
| 406 | 0 - FAT filesystem (MS-DOS, OS/2, NT/Win32) |
| 407 | 1 - Amiga |
| 408 | 2 - VMS (or OpenVMS) |
| 409 | 3 - Unix |
| 410 | 4 - VM/CMS |
| 411 | 5 - Atari TOS |
| 412 | 6 - HPFS filesystem (OS/2, NT) |
| 413 | 7 - Macintosh |
| 414 | 8 - Z-System |
| 415 | 9 - CP/M |
| 416 | 10 - TOPS-20 |
| 417 | 11 - NTFS filesystem (NT) |
| 418 | 12 - QDOS |
| 419 | 13 - Acorn RISCOS |
| 420 | 255 - unknown |
| 421 | |
| 422 | XLEN (eXtra LENgth) |
| 423 | If FLG.FEXTRA is set, this gives the length of the optional |
| 424 | extra field. See below for details. |
| 425 | |
| 426 | CRC32 (CRC-32) |
| 427 | This contains a Cyclic Redundancy Check value of the |
| 428 | uncompressed data computed according to CRC-32 algorithm |
| 429 | used in the ISO 3309 standard and in section 8.1.1.6.2 of |
| 430 | ITU-T recommendation V.42. (See http://www.iso.ch for |
| 431 | ordering ISO documents. See gopher://info.itu.ch for an |
| 432 | online version of ITU-T V.42.) |
| 433 | |
| 434 | ISIZE (Input SIZE) |
| 435 | This contains the size of the original (uncompressed) input |
| 436 | data modulo 2^32. |
| 437 | |
| 438 | 2.3.1.1. Extra field |
| 439 | |
| 440 | If the FLG.FEXTRA bit is set, an "extra field" is present in |
| 441 | the header, with total length XLEN bytes. It consists of a |
| 442 | series of subfields, each of the form: |
| 443 | |
| 444 | +---+---+---+---+==================================+ |
| 445 | |SI1|SI2| LEN |... LEN bytes of subfield data ...| |
| 446 | +---+---+---+---+==================================+ |
| 447 | |
| 448 | SI1 and SI2 provide a subfield ID, typically two ASCII letters |
| 449 | with some mnemonic value. Jean-Loup Gailly |
| 450 | <[email protected]> is maintaining a registry of subfield |
| 451 | IDs; please send him any subfield ID you wish to use. Subfield |
| 452 | IDs with SI2 = 0 are reserved for future use. The following |
| 453 | IDs are currently defined: |
| 454 | |
| 463 | SI1 SI2 Data |
| 464 | ---------- ---------- ---- |
| 465 | 0x41 ('A') 0x70 ('P') Apollo file type information |
| 466 | |
| 467 | LEN gives the length of the subfield data, excluding the 4 |
| 468 | initial bytes. |
| 469 | |
| 470 | 2.3.1.2. Compliance |
| 471 | |
| 472 | A compliant compressor must produce files with correct ID1, |
| 473 | ID2, CM, CRC32, and ISIZE, but may set all the other fields in |
| 474 | the fixed-length part of the header to default values (255 for |
| 475 | OS, 0 for all others). The compressor must set all reserved |
| 476 | bits to zero. |
| 477 | |
| 478 | A compliant decompressor must check ID1, ID2, and CM, and |
| 479 | provide an error indication if any of these have incorrect |
| 480 | values. It must examine FEXTRA/XLEN, FNAME, FCOMMENT and FHCRC |
| 481 | at least so it can skip over the optional fields if they are |
| 482 | present. It need not examine any other part of the header or |
| 483 | trailer; in particular, a decompressor may ignore FTEXT and OS |
| 484 | and always produce binary output, and still be compliant. A |
| 485 | compliant decompressor must give an error indication if any |
| 486 | reserved bit is non-zero, since such a bit could indicate the |
| 487 | presence of a new field that would cause subsequent data to be |
| 488 | interpreted incorrectly. |
| 489 | |
| 490 | 3. References |
| 491 | |
| 492 | [1] "Information Processing - 8-bit single-byte coded graphic |
| 493 | character sets - Part 1: Latin alphabet No.1" (ISO 8859-1:1987). |
| 494 | The ISO 8859-1 (Latin-1) character set is a superset of 7-bit |
| 495 | ASCII. Files defining this character set are available as |
| 496 | iso_8859-1.* in ftp://ftp.uu.net/graphics/png/documents/ |
| 497 | |
| 498 | [2] ISO 3309 |
| 499 | |
| 500 | [3] ITU-T recommendation V.42 |
| 501 | |
| 502 | [4] Deutsch, L.P.,"DEFLATE Compressed Data Format Specification", |
| 503 | available in ftp://ftp.uu.net/pub/archiving/zip/doc/ |
| 504 | |
| 505 | [5] Gailly, J.-L., GZIP documentation, available as gzip-*.tar in |
| 506 | ftp://prep.ai.mit.edu/pub/gnu/ |
| 507 | |
| 508 | [6] Sarwate, D.V., "Computation of Cyclic Redundancy Checks via Table |
| 509 | Look-Up", Communications of the ACM, 31(8), pp.1008-1013. |
| 510 | |
| 520 | [7] Schwaderer, W.D., "CRC Calculation", April 85 PC Tech Journal, |
| 521 | pp.118-133. |
| 522 | |
| 523 | [8] ftp://ftp.adelaide.edu.au/pub/rocksoft/papers/crc_v3.txt, |
| 524 | describing the CRC concept. |
| 525 | |
| 526 | 4. Security Considerations |
| 527 | |
| 528 | Any data compression method involves the reduction of redundancy in |
| 529 | the data. Consequently, any corruption of the data is likely to have |
| 530 | severe effects and be difficult to correct. Uncompressed text, on |
| 531 | the other hand, will probably still be readable despite the presence |
| 532 | of some corrupted bytes. |
| 533 | |
| 534 | It is recommended that systems using this data format provide some |
| 535 | means of validating the integrity of the compressed data, such as by |
| 536 | setting and checking the CRC-32 check value. |
| 537 | |
| 538 | 5. Acknowledgements |
| 539 | |
| 540 | Trademarks cited in this document are the property of their |
| 541 | respective owners. |
| 542 | |
| 543 | Jean-Loup Gailly designed the gzip format and wrote, with Mark Adler, |
| 544 | the related software described in this specification. Glenn |
| 545 | Randers-Pehrson converted this document to RFC and HTML format. |
| 546 | |
| 547 | 6. Author's Address |
| 548 | |
| 549 | L. Peter Deutsch |
| 550 | Aladdin Enterprises |
| 551 | 203 Santa Margarita Ave. |
| 552 | Menlo Park, CA 94025 |
| 553 | |
| 554 | Phone: (415) 322-0103 (AM only) |
| 555 | FAX: (415) 322-1734 |
| 556 | EMail: <[email protected]> |
| 557 | |
| 558 | Questions about the technical content of this specification can be |
| 559 | sent by email to: |
| 560 | |
| 561 | Jean-Loup Gailly <[email protected]> and |
| 562 | Mark Adler <[email protected]> |
| 563 | |
| 564 | Editorial comments on this specification can be sent by email to: |
| 565 | |
| 566 | L. Peter Deutsch <[email protected]> and |
| 567 | Glenn Randers-Pehrson <[email protected]> |
| 568 | |
| 569 | |
| 570 | |
| 571 | Deutsch Informational [Page 10] |
| 572 | |
| 573 | |
| 574 | RFC 1952 GZIP File Format Specification May 1996 |
| 575 | |
| 576 | |
| 577 | 7. Appendix: Jean-Loup Gailly's gzip utility |
| 578 | |
| 579 | The most widely used implementation of gzip compression, and the |
| 580 | original documentation on which this specification is based, were |
| 581 | created by Jean-Loup Gailly <[email protected]>. Since this |
| 582 | implementation is a de facto standard, we mention some more of its |
| 583 | features here. Again, the material in this section is not part of |
| 584 | the specification per se, and implementations need not follow it to |
| 585 | be compliant. |
| 586 | |
| 587 | When compressing or decompressing a file, gzip preserves the |
| 588 | protection, ownership, and modification time attributes on the local |
| 589 | file system, since there is no provision for representing protection |
| 590 | attributes in the gzip file format itself. Since the file format |
| 591 | includes a modification time, the gzip decompressor provides a |
| 592 | command line switch that assigns the modification time from the file, |
| 593 | rather than the local modification time of the compressed input, to |
| 594 | the decompressed output. |
| 595 | |
| 596 | 8. Appendix: Sample CRC Code |
| 597 | |
| 598 | The following sample code represents a practical implementation of |
| 599 | the CRC (Cyclic Redundancy Check). (See also ISO 3309 and ITU-T V.42 |
| 600 | for a formal specification.) |
| 601 | |
| 602 | The sample code is in the ANSI C programming language. Non C users |
| 603 | may find it easier to read with these hints: |
| 604 | |
| 605 | & Bitwise AND operator. |
| 606 | ^ Bitwise exclusive-OR operator. |
| 607 | >> Bitwise right shift operator. When applied to an |
| 608 | unsigned quantity, as here, right shift inserts zero |
| 609 | bit(s) at the left. |
| 610 | ! Logical NOT operator. |
| 611 | ++ "n++" increments the variable n. |
| 612 | 0xNNN 0x introduces a hexadecimal (base 16) constant. |
| 613 | Suffix L indicates a long value (at least 32 bits). |
| 614 | |
| 615 | /* Table of CRCs of all 8-bit messages. */ |
| 616 | unsigned long crc_table[256]; |
| 617 | |
| 618 | /* Flag: has the table been computed? Initially false. */ |
| 619 | int crc_table_computed = 0; |
| 620 | |
| 621 | /* Make the table for a fast CRC. */ |
| 622 | void make_crc_table(void) |
| 623 | { |
| 624 | unsigned long c; |
| 634 | int n, k; |
| 635 | for (n = 0; n < 256; n++) { |
| 636 | c = (unsigned long) n; |
| 637 | for (k = 0; k < 8; k++) { |
| 638 | if (c & 1) { |
| 639 | c = 0xedb88320L ^ (c >> 1); |
| 640 | } else { |
| 641 | c = c >> 1; |
| 642 | } |
| 643 | } |
| 644 | crc_table[n] = c; |
| 645 | } |
| 646 | crc_table_computed = 1; |
| 647 | } |
| 648 | |
| 649 | /* |
| 650 | Update a running crc with the bytes buf[0..len-1] and return |
| 651 | the updated crc. The crc should be initialized to zero. Pre- and |
| 652 | post-conditioning (one's complement) is performed within this |
| 653 | function so it shouldn't be done by the caller. Usage example: |
| 654 | |
| 655 | unsigned long crc = 0L; |
| 656 | |
| 657 | while (read_buffer(buffer, length) != EOF) { |
| 658 | crc = update_crc(crc, buffer, length); |
| 659 | } |
| 660 | if (crc != original_crc) error(); |
| 661 | */ |
| 662 | unsigned long update_crc(unsigned long crc, |
| 663 | unsigned char *buf, int len) |
| 664 | { |
| 665 | unsigned long c = crc ^ 0xffffffffL; |
| 666 | int n; |
| 667 | |
| 668 | if (!crc_table_computed) |
| 669 | make_crc_table(); |
| 670 | for (n = 0; n < len; n++) { |
| 671 | c = crc_table[(c ^ buf[n]) & 0xff] ^ (c >> 8); |
| 672 | } |
| 673 | return c ^ 0xffffffffL; |
| 674 | } |
| 675 | |
| 676 | /* Return the CRC of the bytes buf[0..len-1]. */ |
| 677 | unsigned long crc(unsigned char *buf, int len) |
| 678 | { |
| 679 | return update_crc(0L, buf, len); |
| 680 | } |
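One way to sanity-check a port of the sample code is to compare it against an independent, table-free computation. The sketch below is not the RFC's code: `crc32_bitwise` and `gzip_header_crc16` are hypothetical names, and the loop simply applies the same polynomial constant (0xedb88320) one bit at a time, with the pre- and post-conditioning done inside the function. The second helper shows how the FHCRC value described in section 2.3.1 could be derived, assuming the caller passes exactly the header bytes that precede the CRC16 field.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-32 of buf[0..len-1], computed bit by bit (no lookup table). */
uint32_t crc32_bitwise(const unsigned char *buf, size_t len)
{
    uint32_t c = 0xffffffffu;            /* pre-conditioning */
    size_t i;
    int k;

    for (i = 0; i < len; i++) {
        c ^= buf[i];
        for (k = 0; k < 8; k++) {
            c = (c & 1u) ? (0xedb88320u ^ (c >> 1)) : (c >> 1);
        }
    }
    return c ^ 0xffffffffu;              /* post-conditioning */
}

/* CRC16 for FHCRC: the two least significant bytes of the CRC-32
   of the header bytes up to, but not including, the CRC16 field. */
uint16_t gzip_header_crc16(const unsigned char *hdr, size_t len)
{
    return (uint16_t)(crc32_bitwise(hdr, len) & 0xffffu);
}
```

For the nine ASCII bytes "123456789", this variant of CRC-32 gives the commonly cited check value 0xCBF43926; a table-driven implementation such as the one above should agree with it byte for byte.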
| 681 | |
| 682 | |
| 683 | |
| 684 | |
| 685 | Deutsch Informational [Page 12] |
| 686 | |
| 687 |
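The decompressor requirements in section 2.3.1.2 (check ID1, ID2, and CM; reject non-zero reserved FLG bits; skip the optional fields in order) can be sketched as a single header-walking function. This is a minimal illustration under stated assumptions, not code from gzip or zlib: the function name `gzip_header_length`, the `GZ_*` constants, and the convention of returning -1 on any error are all hypothetical.

```c
#include <stddef.h>

enum {
    GZ_FHCRC    = 0x02,   /* FLG bit 1 */
    GZ_FEXTRA   = 0x04,   /* FLG bit 2 */
    GZ_FNAME    = 0x08,   /* FLG bit 3 */
    GZ_FCOMMENT = 0x10,   /* FLG bit 4 */
    GZ_RESERVED = 0xE0    /* FLG bits 5..7, must be zero */
};

/*
 * Return the offset of the first compressed byte within one gzip
 * member, or -1 if the header is truncated or not acceptable.
 */
long gzip_header_length(const unsigned char *p, size_t n)
{
    size_t pos = 10;                  /* ID1 ID2 CM FLG MTIME(4) XFL OS */
    unsigned flg;

    if (n < 10) return -1;
    if (p[0] != 0x1f || p[1] != 0x8b) return -1;   /* ID1, ID2 */
    if (p[2] != 8) return -1;                      /* CM: deflate only */
    flg = p[3];
    if (flg & GZ_RESERVED) return -1;

    if (flg & GZ_FEXTRA) {            /* XLEN, then XLEN bytes */
        size_t xlen;
        if (n - pos < 2) return -1;
        xlen = (size_t)p[pos] | ((size_t)p[pos + 1] << 8);
        pos += 2;
        if (n - pos < xlen) return -1;
        pos += xlen;
    }
    if (flg & GZ_FNAME) {             /* zero-terminated file name */
        while (pos < n && p[pos] != 0) pos++;
        if (pos == n) return -1;
        pos++;
    }
    if (flg & GZ_FCOMMENT) {          /* zero-terminated comment */
        while (pos < n && p[pos] != 0) pos++;
        if (pos == n) return -1;
        pos++;
    }
    if (flg & GZ_FHCRC) {             /* two-byte header CRC16 */
        if (n - pos < 2) return -1;
        pos += 2;
    }
    return (long)pos;
}
```

The sketch deliberately ignores FTEXT, MTIME, XFL, and OS, which the compliance text allows, and it does not verify the CRC16 or the trailer; a fuller reader would also check CRC32 and ISIZE after inflating the compressed blocks.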
D
compat/zlib/doc/txtvsbin.txt
-103
| --- a/compat/zlib/doc/txtvsbin.txt | ||
| +++ b/compat/zlib/doc/txtvsbin.txt | ||
| @@ -1,107 +0,0 @@ | ||
| 1 | -A Fast Method for Identifying Plain Text Files | |
| 2 | -============================================== | |
| 3 | - | |
| 4 | - | |
| 5 | -Introduction | |
| ------------- | ||
| 6 | - | |
| 7 | -Given a file coming from an unknown source, it is sometimes desirable | |
| 8 | -to find out whether the format of that file is plain text. Although | |
| 9 | -this may appear like a simple task, a fully accurate detection of the | |
| 10 | -file type requires heavy-duty semantic analysis on the file contents. | |
| 11 | -It is, however, possible to obtain satisfactory results by employing | |
| 12 | -various heuristics. | |
| 13 | - | |
| 14 | -Previous versions of PKZip and other zip-compatible compression tools | |
| 15 | -were using a crude detection scheme: if more than 80% (4/5) of the bytes | |
| 16 | -found in a certain buffer are within the range [7..127], the file is | |
| 17 | -labeled as plain text, otherwise it is labeled as binary. A prominent | |
| 18 | -limitation of this scheme is the restriction to Latin-based alphabets. | |
| 19 | -Other alphabets, like Greek, Cyrillic or Asian, make extensive use of | |
| 20 | -the bytes within the range [128..255], and texts using these alphabets | |
| 21 | -are most often misidentified by this scheme; in other words, the rate | |
| 22 | -of false negatives is sometimes too high, which means that the recall | |
| 23 | -is low. Another weakness of this scheme is a reduced precision, due to | |
| 24 | -the false positives that may occur when binary files containing large | |
| 25 | -amounts of textual characters are misidentified as plain text. | |
| 26 | - | |
| 27 | -In this article we propose a new, simple detection scheme that features | |
| 28 | -a much increased precision and a near-100% recall. This scheme is | |
| 29 | -designed to work on ASCII, Unicode and other ASCII-derived alphabets, | |
| 30 | -and it handles single-byte encodings (ISO-8859, MacRoman, KOI8, etc.) | |
| 31 | -and variable-sized encodings (ISO-2022, UTF-8, etc.). Wider encodings | |
| 32 | -(UCS-2/UTF-16 and UCS-4/UTF-32) are not handled, however. | |
| 33 | - | |
| 34 | - | |
| 35 | -The Algorithm | |
| -------------- | ||
| 36 | - | |
| 37 | -The algorithm works by dividing the set of bytecodes [0..255] into three | |
| 38 | -categories: | |
| 39 | -- The white list of textual bytecodes: | |
| 40 | - 9 (TAB), 10 (LF), 13 (CR), 32 (SPACE) to 255. | |
| 41 | -- The gray list of tolerated bytecodes: | |
| 42 | - 7 (BEL), 8 (BS), 11 (VT), 12 (FF), 26 (SUB), 27 (ESC). | |
| 43 | -- The black list of undesired, non-textual bytecodes: | |
| 44 | - 0 (NUL) to 6, 14 to 31. | |
| 45 | - | |
| 46 | -If a file contains at least one byte that belongs to the white list and | |
| 47 | -no byte that belongs to the black list, then the file is categorized as | |
| 48 | -plain text; otherwise, it is categorized as binary. (The boundary case, | |
| 49 | -when the file is empty, automatically falls into the latter category.) | |
| 50 | - | |
| 51 | - | |
| 52 | -Rationale | |
| ---------- | ||
| 53 | - | |
| 54 | -The idea behind this algorithm relies on two observations. | |
| 55 | - | |
| 56 | -The first observation is that, although the full range of 7-bit codes | |
| 57 | -[0..127] is properly specified by the ASCII standard, most control | |
| 58 | -characters in the range [0..31] are not used in practice. The only | |
| 59 | -widely-used, almost universally-portable control codes are 9 (TAB), | |
| 60 | -10 (LF) and 13 (CR). There are a few more control codes that are | |
| 61 | -recognized on a reduced range of platforms and text viewers/editors: | |
| 62 | -7 (BEL), 8 (BS), 11 (VT), 12 (FF), 26 (SUB) and 27 (ESC); but these | |
| 63 | -codes are rarely (if ever) used alone, without being accompanied by | |
| 64 | -some printable text. Even the newer, portable text formats such as | |
| 65 | -XML avoid using control characters outside the list mentioned here. | |
| 66 | - | |
| 67 | -The second observation is that most of the binary files tend to contain | |
| 68 | -control characters, especially 0 (NUL). Even though the older text | |
| 69 | -detection schemes observe the presence of non-ASCII codes from the range | |
| 70 | -[128..255], the precision rarely has to suffer if this upper range is | |
| 71 | -labeled as textual, because the files that are genuinely binary tend to | |
| 72 | -contain both control characters and codes from the upper range. On the | |
| 73 | -other hand, the upper range needs to be labeled as textual, because it | |
| 74 | -is used by virtually all ASCII extensions. In particular, this range is | |
| 75 | -used for encoding non-Latin scripts. | |
| 76 | - | |
| 77 | -Since there is no counting involved, other than simply observing the | |
| 78 | -presence or the absence of some byte values, the algorithm produces | |
| 79 | -consistent results, regardless of what alphabet encoding is being used. | |
| 80 | -(If counting were involved, it could be possible to obtain different | |
| 81 | -results on a text encoded, say, using ISO-8859-16 versus UTF-8.) | |
| 82 | - | |
| 83 | -There is an extra category of plain text files that are "polluted" with | |
| 84 | -one or more black-listed codes, either by mistake or by peculiar design | |
| 85 | -considerations. In such cases, a scheme that tolerates a small fraction | |
| 86 | -of black-listed codes would provide an increased recall (i.e. more true | |
| 87 | -positives). This, however, incurs a reduced precision overall, since | |
| 88 | -false positives are more likely to appear in binary files that contain | |
| 89 | -large chunks of textual data. Furthermore, "polluted" plain text should | |
| 90 | -be regarded as binary by general-purpose text detection schemes, because | |
| 91 | -general-purpose text processing algorithms might not be applicable. | |
| 92 | -Under this premise, it is safe to say that our detection method provides | |
| 93 | -a near-100% recall. | |
| 94 | - | |
| 95 | -Experiments have been run on many files coming from various platforms | |
| 96 | -and applications. We tried plain text files, system logs, source code, | |
| 97 | -formatted office documents, compiled object code, etc. The results | |
| 98 | -confirm the optimistic assumptions about the capabilities of this | |
| 99 | -algorithm. | |
| 100 | - | |
| 101 | - | |
| --- | ||
| 102 | -Cosmin Truta | |
| 103 | -Last updated: 2006-May-28 |
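The white/gray/black list rule described in txtvsbin.txt can be sketched in a few lines of C. This is an illustration of the rule as the text states it, not the code zlib ships: the function name `looks_like_text` is hypothetical, and the sketch assumes that codes 26 (SUB) and 27 (ESC) are treated as gray-listed, as the gray list says, even though they fall numerically inside the "14 to 31" black-list range.

```c
#include <stddef.h>

/*
 * Return 1 if buf is classified as plain text: at least one
 * white-listed byte and no black-listed byte. Return 0 otherwise,
 * which includes the empty buffer.
 */
int looks_like_text(const unsigned char *buf, size_t len)
{
    int seen_white = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = buf[i];

        if (c == 9 || c == 10 || c == 13 || c >= 32) {
            seen_white = 1;           /* white list: TAB, LF, CR, 32..255 */
        } else if (c == 7 || c == 8 || c == 11 || c == 12 ||
                   c == 26 || c == 27) {
            /* gray list: tolerated, but does not count as text */
        } else {
            return 0;                 /* black list: 0..6, 14..25, 28..31 */
        }
    }
    return seen_white;
}
```

Because the function only records presence or absence and never counts, a UTF-8 or ISO-8859 encoding of the same text is classified the same way, which is the consistency property the rationale section argues for.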
| --- a/compat/zlib/doc/txtvsbin.txt | |
| +++ b/compat/zlib/doc/txtvsbin.txt | |
| @@ -1,107 +0,0 @@ | |
| 1 | A Fast Method for Identifying Plain Text Files |
| 2 | ============================================== |
| 3 | |
| 4 | |
| 5 | Introduction |
| ------------- | |
| 6 | |
| 7 | Given a file coming from an unknown source, it is sometimes desirable |
| 8 | to find out whether the format of that file is plain text. Although |
| 9 | this may appear like a simple task, a fully accurate detection of the |
| 10 | file type requires heavy-duty semantic analysis on the file contents. |
| 11 | It is, however, possible to obtain satisfactory results by employing |
| 12 | various heuristics. |
| 13 | |
| 14 | Previous versions of PKZip and other zip-compatible compression tools |
| 15 | were using a crude detection scheme: if more than 80% (4/5) of the bytes |
| 16 | found in a certain buffer are within the range [7..127], the file is |
| 17 | labeled as plain text, otherwise it is labeled as binary. A prominent |
| 18 | limitation of this scheme is the restriction to Latin-based alphabets. |
| 19 | Other alphabets, like Greek, Cyrillic or Asian, make extensive use of |
| 20 | the bytes within the range [128..255], and texts using these alphabets |
| 21 | are most often misidentified by this scheme; in other words, the rate |
| 22 | of false negatives is sometimes too high, which means that the recall |
| 23 | is low. Another weakness of this scheme is a reduced precision, due to |
| 24 | the false positives that may occur when binary files containing large |
| 25 | amounts of textual characters are misidentified as plain text. |
| 26 | |
| 27 | In this article we propose a new, simple detection scheme that features |
| 28 | a much increased precision and a near-100% recall. This scheme is |
| 29 | designed to work on ASCII, Unicode and other ASCII-derived alphabets, |
| 30 | and it handles single-byte encodings (ISO-8859, MacRoman, KOI8, etc.) |
| 31 | and variable-sized encodings (ISO-2022, UTF-8, etc.). Wider encodings |
| 32 | (UCS-2/UTF-16 and UCS-4/UTF-32) are not handled, however. |
| 33 | |
| 34 | |
| 35 | The Algorithm |
| -------------- | |
| 36 | |
| 37 | The algorithm works by dividing the set of bytecodes [0..255] into three |
| 38 | categories: |
| 39 | - The white list of textual bytecodes: |
| 40 | 9 (TAB), 10 (LF), 13 (CR), 32 (SPACE) to 255. |
| 41 | - The gray list of tolerated bytecodes: |
| 42 | 7 (BEL), 8 (BS), 11 (VT), 12 (FF), 26 (SUB), 27 (ESC). |
| 43 | - The black list of undesired, non-textual bytecodes: |
| 44 | 0 (NUL) to 6, 14 to 31. |
| 45 | |
| 46 | If a file contains at least one byte that belongs to the white list and |
| 47 | no byte that belongs to the black list, then the file is categorized as |
| 48 | plain text; otherwise, it is categorized as binary. (The boundary case, |
| 49 | when the file is empty, automatically falls into the latter category.) |
| 50 | |
| 51 | |
| 52 | Rationale |
| ---------- | |
| 53 | |
| 54 | The idea behind this algorithm relies on two observations. |
| 55 | |
| 56 | The first observation is that, although the full range of 7-bit codes |
| 57 | [0..127] is properly specified by the ASCII standard, most control |
| 58 | characters in the range [0..31] are not used in practice. The only |
| 59 | widely-used, almost universally-portable control codes are 9 (TAB), |
| 60 | 10 (LF) and 13 (CR). There are a few more control codes that are |
| 61 | recognized on a reduced range of platforms and text viewers/editors: |
| 62 | 7 (BEL), 8 (BS), 11 (VT), 12 (FF), 26 (SUB) and 27 (ESC); but these |
| 63 | codes are rarely (if ever) used alone, without being accompanied by |
| 64 | some printable text. Even the newer, portable text formats such as |
| 65 | XML avoid using control characters outside the list mentioned here. |
| 66 | |
| 67 | The second observation is that most of the binary files tend to contain |
| 68 | control characters, especially 0 (NUL). Even though the older text |
| 69 | detection schemes observe the presence of non-ASCII codes from the range |
| 70 | [128..255], the precision rarely has to suffer if this upper range is |
| 71 | labeled as textual, because the files that are genuinely binary tend to |
| 72 | contain both control characters and codes from the upper range. On the |
| 73 | other hand, the upper range needs to be labeled as textual, because it |
| 74 | is used by virtually all ASCII extensions. In particular, this range is |
| 75 | used for encoding non-Latin scripts. |
| 76 | |
| 77 | Since there is no counting involved, other than simply observing the |
| 78 | presence or the absence of some byte values, the algorithm produces |
| 79 | consistent results, regardless what alphabet encoding is being used. |
| 80 | (If counting were involved, it could be possible to obtain different |
| 81 | results on a text encoded, say, using ISO-8859-16 versus UTF-8.) |
| 82 | |
| 83 | There is an extra category of plain text files that are "polluted" with |
| 84 | one or more black-listed codes, either by mistake or by peculiar design |
| 85 | considerations. In such cases, a scheme that tolerates a small fraction |
| 86 | of black-listed codes would provide an increased recall (i.e. more true |
| 87 | positives). This, however, incurs a reduced precision overall, since |
| 88 | false positives are more likely to appear in binary files that contain |
| 89 | large chunks of textual data. Furthermore, "polluted" plain text should |
| 90 | be regarded as binary by general-purpose text detection schemes, because |
| 91 | general-purpose text processing algorithms might not be applicable. |
| 92 | Under this premise, it is safe to say that our detection method provides |
| 93 | a near-100% recall. |
| 94 | |
| 95 | Experiments have been run on many files coming from various platforms |
| 96 | and applications. We tried plain text files, system logs, source code, |
| 97 | formatted office documents, compiled object code, etc. The results |
| 98 | confirm the optimistic assumptions about the capabilities of this |
| 99 | algorithm. |
| 100 | |
| 101 | |
| --- | |
| 102 | Cosmin Truta |
| 103 | Last updated: 2006-May-28 |
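
The detection rule described in the deleted `txtvsbin.txt` can be sketched in a few lines of C. This is a minimal illustration, not zlib's or Fossil's actual code: the names `byte_class` and `looks_like_text` are hypothetical, and the white list (9, 10, 13, and 32 to 255) is assumed from the part of the file that precedes the excerpt shown here.

```c
#include <stddef.h>

/* Byte classes, following the three lists in txtvsbin.txt. */
enum { CLASS_WHITE, CLASS_GRAY, CLASS_BLACK };

/* Classify one byte value. The white list (9, 10, 13, 32..255) is an
** assumption taken from the portion of the file above this excerpt. */
static int byte_class(unsigned char c){
  if( c==9 || c==10 || c==13 || c>=32 ) return CLASS_WHITE;
  if( c==7 || c==8 || c==11 || c==12 || c==26 || c==27 ) return CLASS_GRAY;
  return CLASS_BLACK;   /* 0..6 and 14..31, minus the gray codes */
}

/* Return 1 if the buffer is categorized as plain text, 0 if binary.
** An empty buffer is binary because it holds no white-listed byte. */
int looks_like_text(const unsigned char *p, size_t n){
  int sawWhite = 0;
  size_t i;
  for(i=0; i<n; i++){
    int k = byte_class(p[i]);
    if( k==CLASS_BLACK ) return 0;    /* one black byte settles it */
    if( k==CLASS_WHITE ) sawWhite = 1;
  }
  return sawWhite;
}
```

Because the function only records presence or absence and never counts, the result does not depend on how the text is encoded, which is the property the rationale emphasizes. A buffer made up solely of gray-listed bytes is reported as binary, since it contains no white-listed byte.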
+1
| --- src/clone.c | ||
| +++ src/clone.c | ||
| @@ -175,10 +175,11 @@ | ||
| 175 | 175 | db_initial_setup(0, 0, zDefaultUser); |
| 176 | 176 | user_select(); |
| 177 | 177 | db_set("content-schema", CONTENT_SCHEMA, 0); |
| 178 | 178 | db_set("aux-schema", AUX_SCHEMA_MAX, 0); |
| 179 | 179 | db_set("rebuilt", get_version(), 0); |
| 180 | + db_unset("hash-policy", 0); | |
| 180 | 181 | remember_or_get_http_auth(zHttpAuth, urlFlags & URL_REMEMBER, g.argv[2]); |
| 181 | 182 | url_remember(); |
| 182 | 183 | if( g.zSSLIdentity!=0 ){ |
| 183 | 184 | /* If the --ssl-identity option was specified, store it as a setting */ |
| 184 | 185 | Blob fn; |
| 185 | 186 |
| --- src/clone.c | |
| +++ src/clone.c | |
| @@ -175,10 +175,11 @@ | |
| 175 | db_initial_setup(0, 0, zDefaultUser); |
| 176 | user_select(); |
| 177 | db_set("content-schema", CONTENT_SCHEMA, 0); |
| 178 | db_set("aux-schema", AUX_SCHEMA_MAX, 0); |
| 179 | db_set("rebuilt", get_version(), 0); |
| 180 | remember_or_get_http_auth(zHttpAuth, urlFlags & URL_REMEMBER, g.argv[2]); |
| 181 | url_remember(); |
| 182 | if( g.zSSLIdentity!=0 ){ |
| 183 | /* If the --ssl-identity option was specified, store it as a setting */ |
| 184 | Blob fn; |
| 185 |
| --- src/clone.c | |
| +++ src/clone.c | |
| @@ -175,10 +175,11 @@ | |
| 175 | db_initial_setup(0, 0, zDefaultUser); |
| 176 | user_select(); |
| 177 | db_set("content-schema", CONTENT_SCHEMA, 0); |
| 178 | db_set("aux-schema", AUX_SCHEMA_MAX, 0); |
| 179 | db_set("rebuilt", get_version(), 0); |
| 180 | db_unset("hash-policy", 0); |
| 181 | remember_or_get_http_auth(zHttpAuth, urlFlags & URL_REMEMBER, g.argv[2]); |
| 182 | url_remember(); |
| 183 | if( g.zSSLIdentity!=0 ){ |
| 184 | /* If the --ssl-identity option was specified, store it as a setting */ |
| 185 | Blob fn; |
| 186 |
+1
| --- src/configure.c | ||
| +++ src/configure.c | ||
| @@ -129,10 +129,11 @@ | ||
| 129 | 129 | { "empty-dirs", CONFIGSET_PROJ }, |
| 130 | 130 | { "allow-symlinks", CONFIGSET_PROJ }, |
| 131 | 131 | { "dotfiles", CONFIGSET_PROJ }, |
| 132 | 132 | { "parent-project-code", CONFIGSET_PROJ }, |
| 133 | 133 | { "parent-project-name", CONFIGSET_PROJ }, |
| 134 | + { "hash-policy", CONFIGSET_PROJ }, | |
| 134 | 135 | |
| 135 | 136 | #ifdef FOSSIL_ENABLE_LEGACY_MV_RM |
| 136 | 137 | { "mv-rm-files", CONFIGSET_PROJ }, |
| 137 | 138 | #endif |
| 138 | 139 | |
| 139 | 140 |
| --- src/configure.c | |
| +++ src/configure.c | |
| @@ -129,10 +129,11 @@ | |
| 129 | { "empty-dirs", CONFIGSET_PROJ }, |
| 130 | { "allow-symlinks", CONFIGSET_PROJ }, |
| 131 | { "dotfiles", CONFIGSET_PROJ }, |
| 132 | { "parent-project-code", CONFIGSET_PROJ }, |
| 133 | { "parent-project-name", CONFIGSET_PROJ }, |
| 134 | |
| 135 | #ifdef FOSSIL_ENABLE_LEGACY_MV_RM |
| 136 | { "mv-rm-files", CONFIGSET_PROJ }, |
| 137 | #endif |
| 138 | |
| 139 |
| --- src/configure.c | |
| +++ src/configure.c | |
| @@ -129,10 +129,11 @@ | |
| 129 | { "empty-dirs", CONFIGSET_PROJ }, |
| 130 | { "allow-symlinks", CONFIGSET_PROJ }, |
| 131 | { "dotfiles", CONFIGSET_PROJ }, |
| 132 | { "parent-project-code", CONFIGSET_PROJ }, |
| 133 | { "parent-project-name", CONFIGSET_PROJ }, |
| 134 | { "hash-policy", CONFIGSET_PROJ }, |
| 135 | |
| 136 | #ifdef FOSSIL_ENABLE_LEGACY_MV_RM |
| 137 | { "mv-rm-files", CONFIGSET_PROJ }, |
| 138 | #endif |
| 139 | |
| 140 |
+4
| --- src/content.c | ||
| +++ src/content.c | ||
| @@ -528,10 +528,14 @@ | ||
| 528 | 528 | blob_reset(&hash); |
| 529 | 529 | hname_hash(pBlob, 0, &hash); |
| 530 | 530 | } |
| 531 | 531 | }else{ |
| 532 | 532 | blob_init(&hash, zUuid, -1); |
| 533 | + } | |
| 534 | + if( g.eHashPolicy==HPOLICY_AUTO && blob_size(&hash)>HNAME_LEN_SHA1 ){ | |
| 535 | + g.eHashPolicy = HPOLICY_SHA3; | |
| 536 | + db_set_int("hash-policy", HPOLICY_SHA3, 0); | |
| 533 | 537 | } |
| 534 | 538 | if( nBlob ){ |
| 535 | 539 | size = nBlob; |
| 536 | 540 | }else{ |
| 537 | 541 | size = blob_size(pBlob); |
| 538 | 542 |
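
The four lines added to `src/content.c` above implement a one-way upgrade: while the policy is "auto", storing the first artifact whose name is longer than a SHA1 name switches the repository to SHA3. The decision can be sketched in isolation as below. The function name `policy_after_artifact` is hypothetical, and the numeric values given to the `HPOLICY_*` constants and to `HNAME_LEN_SHA1` are assumptions made only so that the sketch compiles on its own; the diff does not show their definitions.

```c
#include <string.h>

/* Hypothetical stand-ins for Fossil's constants. The values are
** assumptions; only the names appear in the diff. */
enum { HPOLICY_SHA1 = 0, HPOLICY_AUTO = 1, HPOLICY_SHA3 = 2 };
#define HNAME_LEN_SHA1 40   /* hex digits in a SHA1 artifact name */

/* Return the policy that should be in force after an artifact named
** zHash has been stored while ePolicy was active. */
int policy_after_artifact(int ePolicy, const char *zHash){
  if( ePolicy==HPOLICY_AUTO && strlen(zHash)>HNAME_LEN_SHA1 ){
    return HPOLICY_SHA3;   /* a longer-than-SHA1 name was seen: upgrade */
  }
  return ePolicy;          /* every other case leaves the policy alone */
}
```

In the real change the new value is also written back with `db_set_int("hash-policy", HPOLICY_SHA3, 0)`, so the upgrade persists across invocations; the sketch leaves persistence to the caller.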
| --- src/content.c | |
| +++ src/content.c | |
| @@ -528,10 +528,14 @@ | |
| 528 | blob_reset(&hash); |
| 529 | hname_hash(pBlob, 0, &hash); |
| 530 | } |
| 531 | }else{ |
| 532 | blob_init(&hash, zUuid, -1); |
| 533 | } |
| 534 | if( nBlob ){ |
| 535 | size = nBlob; |
| 536 | }else{ |
| 537 | size = blob_size(pBlob); |
| 538 |
| --- src/content.c | |
| +++ src/content.c | |
| @@ -528,10 +528,14 @@ | |
| 528 | blob_reset(&hash); |
| 529 | hname_hash(pBlob, 0, &hash); |
| 530 | } |
| 531 | }else{ |
| 532 | blob_init(&hash, zUuid, -1); |
| 533 | } |
| 534 | if( g.eHashPolicy==HPOLICY_AUTO && blob_size(&hash)>HNAME_LEN_SHA1 ){ |
| 535 | g.eHashPolicy = HPOLICY_SHA3; |
| 536 | db_set_int("hash-policy", HPOLICY_SHA3, 0); |
| 537 | } |
| 538 | if( nBlob ){ |
| 539 | size = nBlob; |
| 540 | }else{ |
| 541 | size = blob_size(pBlob); |
| 542 |
M
src/db.c
+17
-3
| --- src/db.c | ||
| +++ src/db.c | ||
| @@ -1485,10 +1485,15 @@ | ||
| 1485 | 1485 | g.repositoryOpen = 1; |
| 1486 | 1486 | /* Cache "allow-symlinks" option, because we'll need it on every stat call */ |
| 1487 | 1487 | g.allowSymlinks = db_get_boolean("allow-symlinks", |
| 1488 | 1488 | db_allow_symlinks_by_default()); |
| 1489 | 1489 | g.zAuxSchema = db_get("aux-schema",""); |
| 1490 | + g.eHashPolicy = db_get_int("hash-policy",-1); | |
| 1491 | + if( g.eHashPolicy<0 ){ | |
| 1492 | + g.eHashPolicy = hname_default_policy(); | |
| 1493 | + db_set_int("hash-policy", g.eHashPolicy, 0); | |
| 1494 | + } | |
| 1490 | 1495 | |
| 1491 | 1496 | /* If the ALIAS table is not present, then some on-the-fly schema |
| 1492 | 1497 | ** updates might be required. |
| 1493 | 1498 | */ |
| 1494 | 1499 | rebuild_schema_update_2_0(); /* Do the Fossil-2.0 schema updates */ |
| @@ -1828,10 +1833,11 @@ | ||
| 1828 | 1833 | " AND name NOT GLOB 'project-*'" |
| 1829 | 1834 | " AND name NOT GLOB 'short-project-*';", |
| 1830 | 1835 | configure_inop_rhs(CONFIGSET_ALL), |
| 1831 | 1836 | db_setting_inop_rhs() |
| 1832 | 1837 | ); |
| 1838 | + g.eHashPolicy = db_get_int("hash-policy", g.eHashPolicy); | |
| 1833 | 1839 | db_multi_exec( |
| 1834 | 1840 | "REPLACE INTO reportfmt SELECT * FROM settingSrc.reportfmt;" |
| 1835 | 1841 | ); |
| 1836 | 1842 | |
| 1837 | 1843 | /* |
| @@ -1900,13 +1906,14 @@ | ||
| 1900 | 1906 | ** their associated permissions will not be copied; however, the system |
| 1901 | 1907 | ** default users "anonymous", "nobody", "reader", "developer", and their |
| 1902 | 1908 | ** associated permissions will be copied. |
| 1903 | 1909 | ** |
| 1904 | 1910 | ** Options: |
| 1905 | -** --template FILE copy settings from repository file | |
| 1906 | -** --admin-user|-A USERNAME select given USERNAME as admin user | |
| 1907 | -** --date-override DATETIME use DATETIME as time of the initial check-in | |
| 1911 | +** --template FILE Copy settings from repository file | |
| 1912 | +** --admin-user|-A USERNAME Select given USERNAME as admin user | |
| 1913 | +** --date-override DATETIME Use DATETIME as time of the initial check-in | |
| 1914 | +** --sha1 Use a initial hash policy of "sha1" | |
| 1908 | 1915 | ** |
| 1909 | 1916 | ** DATETIME may be "now" or "YYYY-MM-DDTHH:MM:SS.SSS". If in |
| 1910 | 1917 | ** year-month-day form, it may be truncated, the "T" may be replaced by |
| 1911 | 1918 | ** a space, and it may also name a timezone offset from UTC as "-HH:MM" |
| 1912 | 1919 | ** (westward) or "+HH:MM" (eastward). Either no timezone suffix or "Z" |
| @@ -1917,14 +1924,17 @@ | ||
| 1917 | 1924 | void create_repository_cmd(void){ |
| 1918 | 1925 | char *zPassword; |
| 1919 | 1926 | const char *zTemplate; /* Repository from which to copy settings */ |
| 1920 | 1927 | const char *zDate; /* Date of the initial check-in */ |
| 1921 | 1928 | const char *zDefaultUser; /* Optional name of the default user */ |
| 1929 | + int bUseSha1 = 0; /* True to set the hash-policy to sha1 */ | |
| 1930 | + | |
| 1922 | 1931 | |
| 1923 | 1932 | zTemplate = find_option("template",0,1); |
| 1924 | 1933 | zDate = find_option("date-override",0,1); |
| 1925 | 1934 | zDefaultUser = find_option("admin-user","A",1); |
| 1935 | + bUseSha1 = find_option("sha1",0,0)!=0; | |
| 1926 | 1936 | /* We should be done with options.. */ |
| 1927 | 1937 | verify_all_options(); |
| 1928 | 1938 | |
| 1929 | 1939 | if( g.argc!=3 ){ |
| 1930 | 1940 | usage("REPOSITORY-NAME"); |
| @@ -1937,10 +1947,14 @@ | ||
| 1937 | 1947 | db_create_repository(g.argv[2]); |
| 1938 | 1948 | db_open_repository(g.argv[2]); |
| 1939 | 1949 | db_open_config(0, 0); |
| 1940 | 1950 | if( zTemplate ) db_attach(zTemplate, "settingSrc"); |
| 1941 | 1951 | db_begin_transaction(); |
| 1952 | + if( bUseSha1 ){ | |
| 1953 | + g.eHashPolicy = HPOLICY_SHA1; | |
| 1954 | + db_set_int("hash-policy", HPOLICY_SHA1, 0); | |
| 1955 | + } | |
| 1942 | 1956 | if( zDate==0 ) zDate = "now"; |
| 1943 | 1957 | db_initial_setup(zTemplate, zDate, zDefaultUser); |
| 1944 | 1958 | db_end_transaction(0); |
| 1945 | 1959 | if( zTemplate ) db_detach("settingSrc"); |
| 1946 | 1960 | fossil_print("project-id: %s\n", db_get("project-code", 0)); |
| 1947 | 1961 |
| --- src/db.c | |
| +++ src/db.c | |
| @@ -1485,10 +1485,15 @@ | |
| 1485 | g.repositoryOpen = 1; |
| 1486 | /* Cache "allow-symlinks" option, because we'll need it on every stat call */ |
| 1487 | g.allowSymlinks = db_get_boolean("allow-symlinks", |
| 1488 | db_allow_symlinks_by_default()); |
| 1489 | g.zAuxSchema = db_get("aux-schema",""); |
| 1490 | |
| 1491 | /* If the ALIAS table is not present, then some on-the-fly schema |
| 1492 | ** updates might be required. |
| 1493 | */ |
| 1494 | rebuild_schema_update_2_0(); /* Do the Fossil-2.0 schema updates */ |
| @@ -1828,10 +1833,11 @@ | |
| 1828 | " AND name NOT GLOB 'project-*'" |
| 1829 | " AND name NOT GLOB 'short-project-*';", |
| 1830 | configure_inop_rhs(CONFIGSET_ALL), |
| 1831 | db_setting_inop_rhs() |
| 1832 | ); |
| 1833 | db_multi_exec( |
| 1834 | "REPLACE INTO reportfmt SELECT * FROM settingSrc.reportfmt;" |
| 1835 | ); |
| 1836 | |
| 1837 | /* |
| @@ -1900,13 +1906,14 @@ | |
| 1900 | ** their associated permissions will not be copied; however, the system |
| 1901 | ** default users "anonymous", "nobody", "reader", "developer", and their |
| 1902 | ** associated permissions will be copied. |
| 1903 | ** |
| 1904 | ** Options: |
| 1905 | ** --template FILE copy settings from repository file |
| 1906 | ** --admin-user|-A USERNAME select given USERNAME as admin user |
| 1907 | ** --date-override DATETIME use DATETIME as time of the initial check-in |
| 1908 | ** |
| 1909 | ** DATETIME may be "now" or "YYYY-MM-DDTHH:MM:SS.SSS". If in |
| 1910 | ** year-month-day form, it may be truncated, the "T" may be replaced by |
| 1911 | ** a space, and it may also name a timezone offset from UTC as "-HH:MM" |
| 1912 | ** (westward) or "+HH:MM" (eastward). Either no timezone suffix or "Z" |
| @@ -1917,14 +1924,17 @@ | |
| 1917 | void create_repository_cmd(void){ |
| 1918 | char *zPassword; |
| 1919 | const char *zTemplate; /* Repository from which to copy settings */ |
| 1920 | const char *zDate; /* Date of the initial check-in */ |
| 1921 | const char *zDefaultUser; /* Optional name of the default user */ |
| 1922 | |
| 1923 | zTemplate = find_option("template",0,1); |
| 1924 | zDate = find_option("date-override",0,1); |
| 1925 | zDefaultUser = find_option("admin-user","A",1); |
| 1926 | /* We should be done with options.. */ |
| 1927 | verify_all_options(); |
| 1928 | |
| 1929 | if( g.argc!=3 ){ |
| 1930 | usage("REPOSITORY-NAME"); |
| @@ -1937,10 +1947,14 @@ | |
| 1937 | db_create_repository(g.argv[2]); |
| 1938 | db_open_repository(g.argv[2]); |
| 1939 | db_open_config(0, 0); |
| 1940 | if( zTemplate ) db_attach(zTemplate, "settingSrc"); |
| 1941 | db_begin_transaction(); |
| 1942 | if( zDate==0 ) zDate = "now"; |
| 1943 | db_initial_setup(zTemplate, zDate, zDefaultUser); |
| 1944 | db_end_transaction(0); |
| 1945 | if( zTemplate ) db_detach("settingSrc"); |
| 1946 | fossil_print("project-id: %s\n", db_get("project-code", 0)); |
| 1947 |
| --- src/db.c | |
| +++ src/db.c | |
| @@ -1485,10 +1485,15 @@ | |
| 1485 | g.repositoryOpen = 1; |
| 1486 | /* Cache "allow-symlinks" option, because we'll need it on every stat call */ |
| 1487 | g.allowSymlinks = db_get_boolean("allow-symlinks", |
| 1488 | db_allow_symlinks_by_default()); |
| 1489 | g.zAuxSchema = db_get("aux-schema",""); |
| 1490 | g.eHashPolicy = db_get_int("hash-policy",-1); |
| 1491 | if( g.eHashPolicy<0 ){ |
| 1492 | g.eHashPolicy = hname_default_policy(); |
| 1493 | db_set_int("hash-policy", g.eHashPolicy, 0); |
| 1494 | } |
| 1495 | |
| 1496 | /* If the ALIAS table is not present, then some on-the-fly schema |
| 1497 | ** updates might be required. |
| 1498 | */ |
| 1499 | rebuild_schema_update_2_0(); /* Do the Fossil-2.0 schema updates */ |
| @@ -1828,10 +1833,11 @@ | |
| 1833 | " AND name NOT GLOB 'project-*'" |
| 1834 | " AND name NOT GLOB 'short-project-*';", |
| 1835 | configure_inop_rhs(CONFIGSET_ALL), |
| 1836 | db_setting_inop_rhs() |
| 1837 | ); |
| 1838 | g.eHashPolicy = db_get_int("hash-policy", g.eHashPolicy); |
| 1839 | db_multi_exec( |
| 1840 | "REPLACE INTO reportfmt SELECT * FROM settingSrc.reportfmt;" |
| 1841 | ); |
| 1842 | |
| 1843 | /* |
| @@ -1900,13 +1906,14 @@ | |
| 1906 | ** their associated permissions will not be copied; however, the system |
| 1907 | ** default users "anonymous", "nobody", "reader", "developer", and their |
| 1908 | ** associated permissions will be copied. |
| 1909 | ** |
| 1910 | ** Options: |
| 1911 | ** --template FILE Copy settings from repository file |
| 1912 | ** --admin-user|-A USERNAME Select given USERNAME as admin user |
| 1913 | ** --date-override DATETIME Use DATETIME as time of the initial check-in |
| 1914 | ** --sha1 Use a initial hash policy of "sha1" |
| 1915 | ** |
| 1916 | ** DATETIME may be "now" or "YYYY-MM-DDTHH:MM:SS.SSS". If in |
| 1917 | ** year-month-day form, it may be truncated, the "T" may be replaced by |
| 1918 | ** a space, and it may also name a timezone offset from UTC as "-HH:MM" |
| 1919 | ** (westward) or "+HH:MM" (eastward). Either no timezone suffix or "Z" |
| @@ -1917,14 +1924,17 @@ | |
| 1924 | void create_repository_cmd(void){ |
| 1925 | char *zPassword; |
| 1926 | const char *zTemplate; /* Repository from which to copy settings */ |
| 1927 | const char *zDate; /* Date of the initial check-in */ |
| 1928 | const char *zDefaultUser; /* Optional name of the default user */ |
| 1929 | int bUseSha1 = 0; /* True to set the hash-policy to sha1 */ |
| 1930 | |
| 1931 | |
| 1932 | zTemplate = find_option("template",0,1); |
| 1933 | zDate = find_option("date-override",0,1); |
| 1934 | zDefaultUser = find_option("admin-user","A",1); |
| 1935 | bUseSha1 = find_option("sha1",0,0)!=0; |
| 1936 | /* We should be done with options.. */ |
| 1937 | verify_all_options(); |
| 1938 | |
| 1939 | if( g.argc!=3 ){ |
| 1940 | usage("REPOSITORY-NAME"); |
| @@ -1937,10 +1947,14 @@ | |
| 1947 | db_create_repository(g.argv[2]); |
| 1948 | db_open_repository(g.argv[2]); |
| 1949 | db_open_config(0, 0); |
| 1950 | if( zTemplate ) db_attach(zTemplate, "settingSrc"); |
| 1951 | db_begin_transaction(); |
| 1952 | if( bUseSha1 ){ |
| 1953 | g.eHashPolicy = HPOLICY_SHA1; |
| 1954 | db_set_int("hash-policy", HPOLICY_SHA1, 0); |
| 1955 | } |
| 1956 | if( zDate==0 ) zDate = "now"; |
| 1957 | db_initial_setup(zTemplate, zDate, zDefaultUser); |
| 1958 | db_end_transaction(0); |
| 1959 | if( zTemplate ) db_detach("settingSrc"); |
| 1960 | fossil_print("project-id: %s\n", db_get("project-code", 0)); |
| 1961 |
+21
-7
| --- src/diffcmd.c | ||
| +++ src/diffcmd.c | ||
| @@ -151,10 +151,13 @@ | ||
| 151 | 151 | /* |
| 152 | 152 | ** Show the difference between two files, one in memory and one on disk. |
| 153 | 153 | ** |
| 154 | 154 | ** The difference is the set of edits needed to transform pFile1 into |
| 155 | 155 | ** zFile2. The content of pFile1 is in memory. zFile2 exists on disk. |
| 156 | +** | |
| 157 | +** If fSwapDiff is 1, show the set of edits to transform zFile2 into pFile1 | |
| 158 | +** instead of the opposite. | |
| 156 | 159 | ** |
| 157 | 160 | ** Use the internal diff logic if zDiffCmd is NULL. Otherwise call the |
| 158 | 161 | ** command zDiffCmd to do the diffing. |
| 159 | 162 | ** |
| 160 | 163 | ** When using an external diff program, zBinGlob contains the GLOB patterns |
| @@ -167,11 +170,12 @@ | ||
| 167 | 170 | const char *zFile2, /* On disk content to compare to */ |
| 168 | 171 | const char *zName, /* Display name of the file */ |
| 169 | 172 | const char *zDiffCmd, /* Command for comparison */ |
| 170 | 173 | const char *zBinGlob, /* Treat file names matching this as binary */ |
| 171 | 174 | int fIncludeBinary, /* Include binary files for external diff */ |
| 172 | - u64 diffFlags /* Flags to control the diff */ | |
| 175 | + u64 diffFlags, /* Flags to control the diff */ | |
| 176 | + int fSwapDiff /* Diff from Zfile2 to Pfile1 */ | |
| 173 | 177 | ){ |
| 174 | 178 | if( zDiffCmd==0 ){ |
| 175 | 179 | Blob out; /* Diff output text */ |
| 176 | 180 | Blob file2; /* Content of zFile2 */ |
| 177 | 181 | const char *zName2; /* Name of zFile2 for display */ |
| @@ -194,11 +198,15 @@ | ||
| 194 | 198 | if( blob_compare(pFile1, &file2) ){ |
| 195 | 199 | fossil_print("CHANGED %s\n", zName); |
| 196 | 200 | } |
| 197 | 201 | }else{ |
| 198 | 202 | blob_zero(&out); |
| 199 | - text_diff(pFile1, &file2, &out, 0, diffFlags); | |
| 203 | + if( fSwapDiff ){ | |
| 204 | + text_diff(&file2, pFile1, &out, 0, diffFlags); | |
| 205 | + }else{ | |
| 206 | + text_diff(pFile1, &file2, &out, 0, diffFlags); | |
| 207 | + } | |
| 200 | 208 | if( blob_size(&out) ){ |
| 201 | 209 | diff_print_filenames(zName, zName2, diffFlags); |
| 202 | 210 | fossil_print("%s\n", blob_str(&out)); |
| 203 | 211 | } |
| 204 | 212 | blob_reset(&out); |
| @@ -252,13 +260,19 @@ | ||
| 252 | 260 | blob_write_to_file(pFile1, blob_str(&nameFile1)); |
| 253 | 261 | |
| 254 | 262 | /* Construct the external diff command */ |
| 255 | 263 | blob_zero(&cmd); |
| 256 | 264 | blob_appendf(&cmd, "%s ", zDiffCmd); |
| 257 | - shell_escape(&cmd, blob_str(&nameFile1)); | |
| 258 | - blob_append(&cmd, " ", 1); | |
| 259 | - shell_escape(&cmd, zFile2); | |
| 265 | + if( fSwapDiff ){ | |
| 266 | + shell_escape(&cmd, zFile2); | |
| 267 | + blob_append(&cmd, " ", 1); | |
| 268 | + shell_escape(&cmd, blob_str(&nameFile1)); | |
| 269 | + }else{ | |
| 270 | + shell_escape(&cmd, blob_str(&nameFile1)); | |
| 271 | + blob_append(&cmd, " ", 1); | |
| 272 | + shell_escape(&cmd, zFile2); | |
| 273 | + } | |
| 260 | 274 | |
| 261 | 275 | /* Run the external diff command */ |
| 262 | 276 | fossil_system(blob_str(&cmd)); |
| 263 | 277 | |
| 264 | 278 | /* Delete the temporary file and clean up memory used */ |
| @@ -482,11 +496,11 @@ | ||
| 482 | 496 | blob_zero(&content); |
| 483 | 497 | } |
| 484 | 498 | isBin = fIncludeBinary ? 0 : looks_like_binary(&content); |
| 485 | 499 | diff_print_index(zPathname, diffFlags); |
| 486 | 500 | diff_file(&content, isBin, zFullName, zPathname, zDiffCmd, |
| 487 | - zBinGlob, fIncludeBinary, diffFlags); | |
| 501 | + zBinGlob, fIncludeBinary, diffFlags, 0); | |
| 488 | 502 | blob_reset(&content); |
| 489 | 503 | } |
| 490 | 504 | blob_reset(&fname); |
| 491 | 505 | } |
| 492 | 506 | db_finalize(&q); |
| @@ -519,11 +533,11 @@ | ||
| 519 | 533 | const char *zFile = (const char*)db_column_text(&q, 0); |
| 520 | 534 | if( !file_dir_match(pFileDir, zFile) ) continue; |
| 521 | 535 | zFullName = mprintf("%s%s", g.zLocalRoot, zFile); |
| 522 | 536 | db_column_blob(&q, 1, &content); |
| 523 | 537 | diff_file(&content, 0, zFullName, zFile, |
| 524 | - zDiffCmd, zBinGlob, fIncludeBinary, diffFlags); | |
| 538 | + zDiffCmd, zBinGlob, fIncludeBinary, diffFlags, 0); | |
| 525 | 539 | fossil_free(zFullName); |
| 526 | 540 | blob_reset(&content); |
| 527 | 541 | } |
| 528 | 542 | db_finalize(&q); |
| 529 | 543 | } |
| 530 | 544 |
| --- src/diffcmd.c | |
| +++ src/diffcmd.c | |
| @@ -151,10 +151,13 @@ | |
| 151 | /* |
| 152 | ** Show the difference between two files, one in memory and one on disk. |
| 153 | ** |
| 154 | ** The difference is the set of edits needed to transform pFile1 into |
| 155 | ** zFile2. The content of pFile1 is in memory. zFile2 exists on disk. |
| 156 | ** |
| 157 | ** Use the internal diff logic if zDiffCmd is NULL. Otherwise call the |
| 158 | ** command zDiffCmd to do the diffing. |
| 159 | ** |
| 160 | ** When using an external diff program, zBinGlob contains the GLOB patterns |
| @@ -167,11 +170,12 @@ | |
| 167 | const char *zFile2, /* On disk content to compare to */ |
| 168 | const char *zName, /* Display name of the file */ |
| 169 | const char *zDiffCmd, /* Command for comparison */ |
| 170 | const char *zBinGlob, /* Treat file names matching this as binary */ |
| 171 | int fIncludeBinary, /* Include binary files for external diff */ |
| 172 | u64 diffFlags /* Flags to control the diff */ |
| 173 | ){ |
| 174 | if( zDiffCmd==0 ){ |
| 175 | Blob out; /* Diff output text */ |
| 176 | Blob file2; /* Content of zFile2 */ |
| 177 | const char *zName2; /* Name of zFile2 for display */ |
| @@ -194,11 +198,15 @@ | |
| 194 | if( blob_compare(pFile1, &file2) ){ |
| 195 | fossil_print("CHANGED %s\n", zName); |
| 196 | } |
| 197 | }else{ |
| 198 | blob_zero(&out); |
| 199 | text_diff(pFile1, &file2, &out, 0, diffFlags); |
| 200 | if( blob_size(&out) ){ |
| 201 | diff_print_filenames(zName, zName2, diffFlags); |
| 202 | fossil_print("%s\n", blob_str(&out)); |
| 203 | } |
| 204 | blob_reset(&out); |
| @@ -252,13 +260,19 @@ | |
| 252 | blob_write_to_file(pFile1, blob_str(&nameFile1)); |
| 253 | |
| 254 | /* Construct the external diff command */ |
| 255 | blob_zero(&cmd); |
| 256 | blob_appendf(&cmd, "%s ", zDiffCmd); |
| 257 | shell_escape(&cmd, blob_str(&nameFile1)); |
| 258 | blob_append(&cmd, " ", 1); |
| 259 | shell_escape(&cmd, zFile2); |
| 260 | |
| 261 | /* Run the external diff command */ |
| 262 | fossil_system(blob_str(&cmd)); |
| 263 | |
| 264 | /* Delete the temporary file and clean up memory used */ |
| @@ -482,11 +496,11 @@ | |
| 482 | blob_zero(&content); |
| 483 | } |
| 484 | isBin = fIncludeBinary ? 0 : looks_like_binary(&content); |
| 485 | diff_print_index(zPathname, diffFlags); |
| 486 | diff_file(&content, isBin, zFullName, zPathname, zDiffCmd, |
| 487 | zBinGlob, fIncludeBinary, diffFlags); |
| 488 | blob_reset(&content); |
| 489 | } |
| 490 | blob_reset(&fname); |
| 491 | } |
| 492 | db_finalize(&q); |
| @@ -519,11 +533,11 @@ | |
| 519 | const char *zFile = (const char*)db_column_text(&q, 0); |
| 520 | if( !file_dir_match(pFileDir, zFile) ) continue; |
| 521 | zFullName = mprintf("%s%s", g.zLocalRoot, zFile); |
| 522 | db_column_blob(&q, 1, &content); |
| 523 | diff_file(&content, 0, zFullName, zFile, |
| 524 | zDiffCmd, zBinGlob, fIncludeBinary, diffFlags); |
| 525 | fossil_free(zFullName); |
| 526 | blob_reset(&content); |
| 527 | } |
| 528 | db_finalize(&q); |
| 529 | } |
| 530 |
| --- src/diffcmd.c | |
| +++ src/diffcmd.c | |
| @@ -151,10 +151,13 @@ | |
| 151 | /* |
| 152 | ** Show the difference between two files, one in memory and one on disk. |
| 153 | ** |
| 154 | ** The difference is the set of edits needed to transform pFile1 into |
| 155 | ** zFile2. The content of pFile1 is in memory. zFile2 exists on disk. |
| 156 | ** |
| 157 | ** If fSwapDiff is 1, show the set of edits to transform zFile2 into pFile1 |
| 158 | ** instead of the opposite. |
| 159 | ** |
| 160 | ** Use the internal diff logic if zDiffCmd is NULL. Otherwise call the |
| 161 | ** command zDiffCmd to do the diffing. |
| 162 | ** |
| 163 | ** When using an external diff program, zBinGlob contains the GLOB patterns |
| @@ -167,11 +170,12 @@ | |
| 170 | const char *zFile2, /* On disk content to compare to */ |
| 171 | const char *zName, /* Display name of the file */ |
| 172 | const char *zDiffCmd, /* Command for comparison */ |
| 173 | const char *zBinGlob, /* Treat file names matching this as binary */ |
| 174 | int fIncludeBinary, /* Include binary files for external diff */ |
| 175 | u64 diffFlags, /* Flags to control the diff */ |
| 176 | int fSwapDiff /* Diff from Zfile2 to Pfile1 */ |
| 177 | ){ |
| 178 | if( zDiffCmd==0 ){ |
| 179 | Blob out; /* Diff output text */ |
| 180 | Blob file2; /* Content of zFile2 */ |
| 181 | const char *zName2; /* Name of zFile2 for display */ |
| @@ -194,11 +198,15 @@ | |
| 198 | if( blob_compare(pFile1, &file2) ){ |
| 199 | fossil_print("CHANGED %s\n", zName); |
| 200 | } |
| 201 | }else{ |
| 202 | blob_zero(&out); |
| 203 | if( fSwapDiff ){ |
| 204 | text_diff(&file2, pFile1, &out, 0, diffFlags); |
| 205 | }else{ |
| 206 | text_diff(pFile1, &file2, &out, 0, diffFlags); |
| 207 | } |
| 208 | if( blob_size(&out) ){ |
| 209 | diff_print_filenames(zName, zName2, diffFlags); |
| 210 | fossil_print("%s\n", blob_str(&out)); |
| 211 | } |
| 212 | blob_reset(&out); |
| @@ -252,13 +260,19 @@ | |
| 260 | blob_write_to_file(pFile1, blob_str(&nameFile1)); |
| 261 | |
| 262 | /* Construct the external diff command */ |
| 263 | blob_zero(&cmd); |
| 264 | blob_appendf(&cmd, "%s ", zDiffCmd); |
| 265 | if( fSwapDiff ){ |
| 266 | shell_escape(&cmd, zFile2); |
| 267 | blob_append(&cmd, " ", 1); |
| 268 | shell_escape(&cmd, blob_str(&nameFile1)); |
| 269 | }else{ |
| 270 | shell_escape(&cmd, blob_str(&nameFile1)); |
| 271 | blob_append(&cmd, " ", 1); |
| 272 | shell_escape(&cmd, zFile2); |
| 273 | } |
| 274 | |
| 275 | /* Run the external diff command */ |
| 276 | fossil_system(blob_str(&cmd)); |
| 277 | |
| 278 | /* Delete the temporary file and clean up memory used */ |
| @@ -482,11 +496,11 @@ | |
| 496 | blob_zero(&content); |
| 497 | } |
| 498 | isBin = fIncludeBinary ? 0 : looks_like_binary(&content); |
| 499 | diff_print_index(zPathname, diffFlags); |
| 500 | diff_file(&content, isBin, zFullName, zPathname, zDiffCmd, |
| 501 | zBinGlob, fIncludeBinary, diffFlags, 0); |
| 502 | blob_reset(&content); |
| 503 | } |
| 504 | blob_reset(&fname); |
| 505 | } |
| 506 | db_finalize(&q); |
| @@ -519,11 +533,11 @@ | |
| 533 | const char *zFile = (const char*)db_column_text(&q, 0); |
| 534 | if( !file_dir_match(pFileDir, zFile) ) continue; |
| 535 | zFullName = mprintf("%s%s", g.zLocalRoot, zFile); |
| 536 | db_column_blob(&q, 1, &content); |
| 537 | diff_file(&content, 0, zFullName, zFile, |
| 538 | zDiffCmd, zBinGlob, fIncludeBinary, diffFlags, 0); |
| 539 | fossil_free(zFullName); |
| 540 | blob_reset(&content); |
| 541 | } |
| 542 | db_finalize(&q); |
| 543 | } |
| 544 |
+1
-1
| --- src/doc.c | ||
| +++ src/doc.c | ||
| @@ -735,11 +735,11 @@ | ||
| 735 | 735 | |
| 736 | 736 | /* Jump here when unable to locate the document */ |
| 737 | 737 | doc_not_found: |
| 738 | 738 | db_end_transaction(0); |
| 739 | 739 | if( isUV && P("name")==0 ){ |
| 740 | - uvstat_page(); | |
| 740 | + uvlist_page(); | |
| 741 | 741 | return; |
| 742 | 742 | } |
| 743 | 743 | cgi_set_status(404, "Not Found"); |
| 744 | 744 | style_header("Not Found"); |
| 745 | 745 | @ <p>Document %h(zOrigName) not found |
| 746 | 746 |
| --- src/doc.c | |
| +++ src/doc.c | |
| @@ -735,11 +735,11 @@ | |
| 735 | |
| 736 | /* Jump here when unable to locate the document */ |
| 737 | doc_not_found: |
| 738 | db_end_transaction(0); |
| 739 | if( isUV && P("name")==0 ){ |
| 740 | uvstat_page(); |
| 741 | return; |
| 742 | } |
| 743 | cgi_set_status(404, "Not Found"); |
| 744 | style_header("Not Found"); |
| 745 | @ <p>Document %h(zOrigName) not found |
| 746 |
| --- src/doc.c | |
| +++ src/doc.c | |
| @@ -735,11 +735,11 @@ | |
| 735 | |
| 736 | /* Jump here when unable to locate the document */ |
| 737 | doc_not_found: |
| 738 | db_end_transaction(0); |
| 739 | if( isUV && P("name")==0 ){ |
| 740 | uvlist_page(); |
| 741 | return; |
| 742 | } |
| 743 | cgi_set_status(404, "Not Found"); |
| 744 | style_header("Not Found"); |
| 745 | @ <p>Document %h(zOrigName) not found |
| 746 |
+90
| --- src/encode.c | ||
| +++ src/encode.c | ||
| @@ -336,10 +336,100 @@ | ||
| 336 | 336 | z[j++] = c; |
| 337 | 337 | } |
| 338 | 338 | if( z[j] ) z[j] = 0; |
| 339 | 339 | } |
| 340 | 340 | |
| 341 | + | |
| 342 | +/* | |
| 343 | +** The *pz variable points to a UTF8 string. Read the next character | |
| 344 | +** off of that string and return its codepoint value. Advance *pz to the | |
| 345 | +** next character | |
| 346 | +*/ | |
| 347 | +u32 fossil_utf8_read( | |
| 348 | + const unsigned char **pz /* Pointer to string from which to read char */ | |
| 349 | +){ | |
| 350 | + unsigned int c; | |
| 351 | + | |
| 352 | + /* | |
| 353 | + ** This lookup table is used to help decode the first byte of | |
| 354 | + ** a multi-byte UTF8 character. | |
| 355 | + */ | |
| 356 | + static const unsigned char utf8Trans1[] = { | |
| 357 | + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, | |
| 358 | + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, | |
| 359 | + 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, | |
| 360 | + 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, | |
| 361 | + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, | |
| 362 | + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, | |
| 363 | + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, | |
| 364 | + 0x00, 0x01, 0x02, 0x03, 0x00, 0x01, 0x00, 0x00, | |
| 365 | + }; | |
| 366 | + | |
| 367 | + c = *((*pz)++); | |
| 368 | + if( c>=0xc0 ){ | |
| 369 | + c = utf8Trans1[c-0xc0]; | |
| 370 | + while( (*(*pz) & 0xc0)==0x80 ){ | |
| 371 | + c = (c<<6) + (0x3f & *((*pz)++)); | |
| 372 | + } | |
| 373 | + if( c<0x80 | |
| 374 | + || (c&0xFFFFF800)==0xD800 | |
| 375 | + || (c&0xFFFFFFFE)==0xFFFE ){ c = 0xFFFD; } | |
| 376 | + } | |
| 377 | + return c; | |
| 378 | +} | |
| 379 | + | |
| 380 | +/* | |
| 381 | +** Encode a UTF8 string for JSON. All special characters are escaped. | |
| 382 | +*/ | |
| 383 | +void blob_append_json_string(Blob *pBlob, const char *zStr){ | |
| 384 | + const unsigned char *z; | |
| 385 | + char *zOut; | |
| 386 | + u32 c; | |
| 387 | + int n, i, j; | |
| 388 | + z = (const unsigned char*)zStr; | |
| 389 | + n = 0; | |
| 390 | + while( (c = fossil_utf8_read(&z))!=0 ){ | |
| 391 | + if( c=='\\' || c=='"' ){ | |
| 392 | + n += 2; | |
| 393 | + }else if( c<' ' || c>=0x7f ){ | |
| 394 | + if( c=='\n' || c=='\r' ){ | |
| 395 | + n += 2; | |
| 396 | + }else{ | |
| 397 | + n += 6; | |
| 398 | + } | |
| 399 | + }else{ | |
| 400 | + n++; | |
| 401 | + } | |
| 402 | + } | |
| 403 | + i = blob_size(pBlob); | |
| 404 | + blob_resize(pBlob, i+n); | |
| 405 | + zOut = blob_buffer(pBlob); | |
| 406 | + z = (const unsigned char*)zStr; | |
| 407 | + while( (c = fossil_utf8_read(&z))!=0 ){ | |
| 408 | + if( c=='\\' || c=='"' ){ | |
| 409 | + zOut[i++] = '\\'; | |
| 410 | + zOut[i++] = c; | |
| 411 | + }else if( c<' ' || c>=0x7f ){ | |
| 412 | + zOut[i++] = '\\'; | |
| 413 | + if( c=='\n' ){ | |
| 414 | + zOut[i++] = 'n'; | |
| 415 | + }else if( c=='\r' ){ | |
| 416 | + zOut[i++] = 'r'; | |
| 417 | + }else{ | |
| 418 | + zOut[i++] = 'u'; | |
| 419 | + for(j=3; j>=0; j--){ | |
| 420 | + zOut[i+j] = "0123456789abcdef"[c&0xf]; | |
| 421 | + c >>= 4; | |
| 422 | + } | |
| 423 | + i += 4; | |
| 424 | + } | |
| 425 | + }else{ | |
| 426 | + zOut[i++] = c; | |
| 427 | + } | |
| 428 | + } | |
| 429 | + zOut[i] = 0; | |
| 430 | +} | |
| 341 | 431 | |
| 342 | 432 | /* |
| 343 | 433 | ** The characters used for HTTP base64 encoding. |
| 344 | 434 | */ |
| 345 | 435 | static unsigned char zBase[] = |
| 346 | 436 |
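The two-pass shape of `blob_append_json_string` (measure first, then write) can be sketched outside of Fossil as follows. This is a minimal illustration, not the committed code: `utf8_next`, `json_width`, `put_u16` and `json_escape` are hypothetical names, a `malloc`'d buffer stands in for Fossil's `Blob`, and the surrogate-pair branch for code points above U+FFFF is an addition that the diffed routine does not have (it always emits four hex digits).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Fossil: decode one UTF-8 code point
** and advance *pz. Handling of malformed input is deliberately minimal. */
static unsigned utf8_next(const unsigned char **pz){
  unsigned c = *((*pz)++);
  if( c>=0xc0 ){
    /* Keep only the payload bits of the lead byte */
    if( c>=0xf0 ) c &= 0x07;
    else if( c>=0xe0 ) c &= 0x0f;
    else c &= 0x1f;
    /* Fold in 6 bits from each continuation byte */
    while( (**pz & 0xc0)==0x80 ){
      c = (c<<6) | (0x3f & *((*pz)++));
    }
  }
  return c;
}

/* Number of output bytes needed for code point c */
static size_t json_width(unsigned c){
  if( c=='\\' || c=='"' || c=='\n' || c=='\r' ) return 2;
  if( c<' ' || c>=0x7f ) return c>0xffff ? 12 : 6;
  return 1;
}

/* Write a \uXXXX escape for the low 16 bits of v; returns bytes written */
static size_t put_u16(char *z, unsigned v){
  static const char hex[] = "0123456789abcdef";
  z[0] = '\\'; z[1] = 'u';
  z[2] = hex[(v>>12)&0xf]; z[3] = hex[(v>>8)&0xf];
  z[4] = hex[(v>>4)&0xf];  z[5] = hex[v&0xf];
  return 6;
}

/* Escape zStr for use inside a JSON string literal. Caller frees. */
char *json_escape(const char *zStr){
  const unsigned char *z;
  unsigned c;
  size_t n = 0, i = 0;
  char *zOut;
  assert( zStr!=0 );
  /* Pass 1: measure the escaped length */
  z = (const unsigned char*)zStr;
  while( (c = utf8_next(&z))!=0 ) n += json_width(c);
  zOut = malloc(n+1);
  if( zOut==0 ) return 0;
  /* Pass 2: write the escaped text; branches mirror json_width() */
  z = (const unsigned char*)zStr;
  while( (c = utf8_next(&z))!=0 ){
    if( c=='\\' || c=='"' ){
      zOut[i++] = '\\'; zOut[i++] = (char)c;
    }else if( c=='\n' ){
      zOut[i++] = '\\'; zOut[i++] = 'n';
    }else if( c=='\r' ){
      zOut[i++] = '\\'; zOut[i++] = 'r';
    }else if( c<' ' || c>=0x7f ){
      if( c>0xffff ){
        /* Assumption beyond the diff: split into a surrogate pair */
        c -= 0x10000;
        i += put_u16(zOut+i, 0xd800 + (c>>10));
        i += put_u16(zOut+i, 0xdc00 + (c&0x3ff));
      }else{
        i += put_u16(zOut+i, c);
      }
    }else{
      zOut[i++] = (char)c;
    }
  }
  zOut[i] = 0;
  return zOut;
}
```

Keeping the measuring pass and the writing pass in agreement is the main design constraint here: every branch that counts two bytes in the first loop needs a matching branch that writes two bytes in the second.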
+137
-20
| --- src/hname.c | ||
| +++ src/hname.c | ||
| @@ -16,11 +16,13 @@ | ||
| 16 | 16 | ******************************************************************************* |
| 17 | 17 | ** |
| 18 | 18 | ** This file contains generic code for dealing with hashes used for |
| 19 | 19 | ** naming artifacts. Specific hash algorithms are implemented separately |
| 20 | 20 | ** (for example in sha1.c and sha3.c). This file contains the generic |
| 21 | -** interface code. | |
| 21 | +** interface logic. | |
| 22 | +** | |
| 23 | +** "hname" is intended to be an abbreviation of "hash name". | |
| 22 | 24 | */ |
| 23 | 25 | #include "config.h" |
| 24 | 26 | #include "hname.h" |
| 25 | 27 | |
| 26 | 28 | |
| @@ -47,10 +49,19 @@ | ||
| 47 | 49 | /* |
| 48 | 50 | ** The number of distinct hash algorithms: |
| 49 | 51 | */ |
| 50 | 52 | #define HNAME_COUNT 2 /* Just SHA1 and SHA3-256. Let's keep it that way! */ |
| 51 | 53 | |
| 54 | +/* | |
| 55 | +** Hash naming policies | |
| 56 | +*/ | |
| 57 | +#define HPOLICY_SHA1 0 /* Use SHA1 hashes */ | |
| 58 | +#define HPOLICY_AUTO 1 /* SHA1 but auto-promote to SHA3 */ | |
| 59 | +#define HPOLICY_SHA3 2 /* Use SHA3 hashes */ | |
| 60 | +#define HPOLICY_SHA3_ONLY 3 /* Use SHA3 hashes exclusively */ | |
| 61 | +#define HPOLICY_SHUN_SHA1 4 /* Shun all SHA1 objects */ | |
| 62 | + | |
| 52 | 63 | #endif /* INTERFACE */ |
| 53 | 64 | |
| 54 | 65 | /* |
| 55 | 66 | ** Return a human-readable name for the hash algorithm given a hash with |
| 56 | 67 | ** a length of nHash hexadecimal digits. |
| @@ -142,26 +153,132 @@ | ||
| 142 | 153 | |
| 143 | 154 | /* |
| 144 | 155 | ** Compute a hash on blob pContent. Write the hash into blob pHashOut. |
| 145 | 156 | ** This routine assumes that pHashOut is uninitialized. |
| 146 | 157 | ** |
| 147 | -** The preferred hash is used for iHType==0, and various alternative hashes | |
| 148 | -** are used for iHType>0 && iHType<NHAME_COUNT. | |
| 158 | +** The preferred hash is used for iHType==0 and the alternative hash is | |
| 159 | +** used if iHType==1. (The interface is designed to accommodate more than | |
| 160 | +** just two hashes, but HNAME_COUNT is currently fixed at 2.) | |
| 161 | +** | |
| 162 | +** Depending on the hash policy, the alternative hash may be disallowed. | |
| 163 | +** If the alternative hash is disallowed, the routine returns 0. This | |
| 164 | +** routine returns 1 if iHType>0 and the alternative hash is allowed, | |
| 165 | +** and it always returns 1 when iHType==0. | |
| 166 | +** | |
| 167 | +** Alternative hash is disallowed for all hash policies except auto, | |
| 168 | +** sha1 and sha3. | |
| 169 | +*/ | |
| 170 | +int hname_hash(const Blob *pContent, unsigned int iHType, Blob *pHashOut){ | |
| 171 | + assert( iHType==0 || iHType==1 ); | |
| 172 | + if( iHType==1 ){ | |
| 173 | + switch( g.eHashPolicy ){ | |
| 174 | + case HPOLICY_AUTO: | |
| 175 | + case HPOLICY_SHA1: | |
| 176 | + sha3sum_blob(pContent, 256, pHashOut); | |
| 177 | + return 1; | |
| 178 | + case HPOLICY_SHA3: | |
| 179 | + sha1sum_blob(pContent, pHashOut); | |
| 180 | + return 1; | |
| 181 | + } | |
| 182 | + } | |
| 183 | + if( iHType==0 ){ | |
| 184 | + switch( g.eHashPolicy ){ | |
| 185 | + case HPOLICY_SHA1: | |
| 186 | + case HPOLICY_AUTO: | |
| 187 | + sha1sum_blob(pContent, pHashOut); | |
| 188 | + return 1; | |
| 189 | + case HPOLICY_SHA3: | |
| 190 | + case HPOLICY_SHA3_ONLY: | |
| 191 | + case HPOLICY_SHUN_SHA1: | |
| 192 | + sha3sum_blob(pContent, 256, pHashOut); | |
| 193 | + return 1; | |
| 194 | + } | |
| 195 | + } | |
| 196 | + blob_init(pHashOut, 0, 0); | |
| 197 | + return 0; | |
| 198 | +} | |
| 199 | + | |
| 200 | +/* | |
| 201 | +** Return the default hash policy for repositories that do not currently | |
| 202 | +** have an assigned hash policy. | |
| 203 | +** | |
| 204 | +** Make the default HPOLICY_AUTO if there are SHA1 artifacts but no SHA3 | |
| 205 | +** artifacts in the repository. Make the default HPOLICY_SHA3 if there | |
| 206 | +** are one or more SHA3 artifacts or if the repository is initially empty. | |
| 207 | +*/ | |
| 208 | +int hname_default_policy(void){ | |
| 209 | + if( db_exists("SELECT 1 FROM blob WHERE length(uuid)>40") | |
| 210 | + || !db_exists("SELECT 1 FROM blob WHERE length(uuid)==40") | |
| 211 | + ){ | |
| 212 | + return HPOLICY_SHA3; | |
| 213 | + }else{ | |
| 214 | + return HPOLICY_AUTO; | |
| 215 | + } | |
| 216 | +} | |
| 217 | + | |
| 218 | +/* | |
| 219 | +** Names of the hash policies. | |
| 220 | +*/ | |
| 221 | +static const char *azPolicy[] = { | |
| 222 | + "sha1", "auto", "sha3", "sha3-only", "shun-sha1" | |
| 223 | +}; | |
| 224 | + | |
| 225 | +/* Return the name of the current hash policy. | |
| 226 | +*/ | |
| 227 | +const char *hpolicy_name(void){ | |
| 228 | + return azPolicy[g.eHashPolicy]; | |
| 229 | +} | |
| 230 | + | |
| 231 | + | |
| 232 | +/* | |
| 233 | +** COMMAND: hash-policy* | |
| 234 | +** | |
| 235 | +** Usage: fossil hash-policy ?NEW-POLICY? | |
| 236 | +** | |
| 237 | +** Query or set the hash policy for the current repository. Available hash | |
| 238 | +** policies are as follows: | |
| 239 | +** | |
| 240 | +** sha1 New artifact names are created using SHA1 | |
| 241 | +** | |
| 242 | +** auto New artifact names are created using SHA1, but | |
| 243 | +** automatically change the policy to "sha3" when | |
| 244 | +** any SHA3 artifact enters the repository. | |
| 245 | +** | |
| 246 | +** sha3 New artifact names are created using SHA3, but | |
| 247 | +** older artifacts with SHA1 names may be reused. | |
| 248 | +** | |
| 249 | +** sha3-only Use only SHA3 artifact names. Do not reuse legacy | |
| 250 | +** SHA1 names. | |
| 251 | +** | |
| 252 | +** shun-sha1 Shun any SHA1 artifacts received by sync operations | |
| 253 | +** other than clones. Older legacy SHA1 artifacts are | |
| 254 | +** allowed during a clone. | |
| 255 | +** | |
| 256 | +** The default hash policy for existing repositories is "auto", which will | |
| 257 | +** immediately promote to "sha3" if the repository contains one or more | |
| 258 | +** artifacts with SHA3 names. The default hash policy for new repositories | |
| 259 | +** is "shun-sha1". | |
| 149 | 260 | */ |
| 150 | -void hname_hash(const Blob *pContent, unsigned int iHType, Blob *pHashOut){ | |
| 151 | -#if RELEASE_VERSION_NUMBER>=20100 | |
| 152 | - /* For Fossil 2.1 and later, the preferred hash algorithm is SHA3-256 and | |
| 153 | - ** SHA1 is the secondary hash algorithm. */ | |
| 154 | - switch( iHType ){ | |
| 155 | - case 0: sha3sum_blob(pContent, 256, pHashOut); break; | |
| 156 | - case 1: sha1sum_blob(pContent, pHashOut); break; | |
| 157 | - } | |
| 158 | -#else | |
| 159 | - /* Prior to Fossil 2.1, the preferred hash algorithm is SHA1 (for backwards | |
| 160 | - ** compatibility with Fossil 1.x) and SHA3-256 is the only auxiliary | |
| 161 | - ** algorithm */ | |
| 162 | - switch( iHType ){ | |
| 163 | - case 0: sha1sum_blob(pContent, pHashOut); break; | |
| 164 | - case 1: sha3sum_blob(pContent, 256, pHashOut); break; | |
| 165 | - } | |
| 166 | -#endif | |
| 261 | +void hash_policy_command(void){ | |
| 262 | + int i; | |
| 263 | + db_find_and_open_repository(0, 0); | |
| 264 | + if( g.argc!=2 && g.argc!=3 ) usage("?NEW-POLICY?"); | |
| 265 | + if( g.argc==2 ){ | |
| 266 | + fossil_print("%s\n", azPolicy[g.eHashPolicy]); | |
| 267 | + return; | |
| 268 | + } | |
| 269 | + for(i=HPOLICY_SHA1; i<=HPOLICY_SHUN_SHA1; i++){ | |
| 270 | + if( fossil_strcmp(g.argv[2],azPolicy[i])==0 ){ | |
| 271 | + if( i==HPOLICY_AUTO | |
| 272 | + && db_exists("SELECT 1 FROM blob WHERE length(uuid)>40") | |
| 273 | + ){ | |
| 274 | + i = HPOLICY_SHA3; | |
| 275 | + } | |
| 276 | + g.eHashPolicy = i; | |
| 277 | + db_set_int("hash-policy", i, 0); | |
| 278 | + fossil_print("%s\n", azPolicy[i]); | |
| 279 | + return; | |
| 280 | + } | |
| 281 | + } | |
| 282 | + fossil_fatal("unknown hash policy \"%s\" - should be one of: sha1 auto" | |
| 283 | + " sha3 sha3-only shun-sha1", g.argv[2]); | |
| 167 | 284 | } |
| 168 | 285 |
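The policy-dependent choice made by the new `hname_hash` can be summarised as a small pure function. The sketch below is only a model of the `switch` statements in the diff: `hash_alg_for` and the `ALG_*` codes are hypothetical names introduced for illustration, and it returns an algorithm code instead of computing a hash.

```c
#include <assert.h>

/* Policy codes, with the same values as the HPOLICY_* macros in the diff */
enum {
  HPOLICY_SHA1, HPOLICY_AUTO, HPOLICY_SHA3,
  HPOLICY_SHA3_ONLY, HPOLICY_SHUN_SHA1
};

/* Hypothetical result codes used by this sketch only */
enum { ALG_NONE = 0, ALG_SHA1 = 1, ALG_SHA3_256 = 2 };

/* Which algorithm would be used for the preferred hash (iHType==0) or
** the alternative hash (iHType==1) under policy ePolicy? ALG_NONE means
** the alternative hash is disallowed. */
int hash_alg_for(int ePolicy, unsigned iHType){
  assert( iHType==0 || iHType==1 );
  switch( ePolicy ){
    case HPOLICY_SHA1:
    case HPOLICY_AUTO:
      return iHType==0 ? ALG_SHA1 : ALG_SHA3_256;
    case HPOLICY_SHA3:
      return iHType==0 ? ALG_SHA3_256 : ALG_SHA1;
    case HPOLICY_SHA3_ONLY:
    case HPOLICY_SHUN_SHA1:
      return iHType==0 ? ALG_SHA3_256 : ALG_NONE;
  }
  return ALG_NONE;
}
```

Read this way, "sha1" and "auto" prefer SHA1 with SHA3-256 as the fallback, "sha3" swaps the two, and "sha3-only" and "shun-sha1" have no fallback at all.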
+4
-1
| --- src/main.c | ||
| +++ src/main.c | ||
| @@ -140,10 +140,11 @@ | ||
| 140 | 140 | char *zLocalDbName; /* Name of the local database file */ |
| 141 | 141 | char *zOpenRevision; /* Check-in version to use during database open */ |
| 142 | 142 | int localOpen; /* True if the local database is open */ |
| 143 | 143 | char *zLocalRoot; /* The directory holding the local database */ |
| 144 | 144 | int minPrefix; /* Number of digits needed for a distinct UUID */ |
| 145 | + int eHashPolicy; /* Current hash policy. One of HPOLICY_* */ | |
| 145 | 146 | int fNoDirSymlinks; /* True if --no-dir-symlinks flag is present */ |
| 146 | 147 | int fSqlTrace; /* True if --sqltrace flag is present */ |
| 147 | 148 | int fSqlStats; /* True if --sqltrace or --sqlstats are present */ |
| 148 | 149 | int fSqlPrint; /* True if -sqlprint flag is present */ |
| 149 | 150 | int fQuiet; /* True if -quiet flag is present */ |
| @@ -2005,11 +2006,11 @@ | ||
| 2005 | 2006 | ** the name of that directory and the specific repository will be |
| 2006 | 2007 | ** opened later by process_one_web_page() based on the content of |
| 2007 | 2008 | ** the PATH_INFO variable. |
| 2008 | 2009 | ** |
| 2009 | 2010 | ** If the fCreate flag is set, then create the repository if it |
| 2010 | -** does not already exist. | |
| 2011 | +** does not already exist. Always use "auto" hash-policy in this case. | |
| 2011 | 2012 | */ |
| 2012 | 2013 | static void find_server_repository(int arg, int fCreate){ |
| 2013 | 2014 | if( g.argc<=arg ){ |
| 2014 | 2015 | db_must_be_within_tree(); |
| 2015 | 2016 | }else{ |
| @@ -2022,10 +2023,12 @@ | ||
| 2022 | 2023 | if( isDir==0 && fCreate ){ |
| 2023 | 2024 | const char *zPassword; |
| 2024 | 2025 | db_create_repository(zRepo); |
| 2025 | 2026 | db_open_repository(zRepo); |
| 2026 | 2027 | db_begin_transaction(); |
| 2028 | + g.eHashPolicy = HPOLICY_AUTO; | |
| 2029 | + db_set_int("hash-policy", HPOLICY_AUTO, 0); | |
| 2027 | 2030 | db_initial_setup(0, "now", g.zLogin); |
| 2028 | 2031 | db_end_transaction(0); |
| 2029 | 2032 | fossil_print("project-id: %s\n", db_get("project-code", 0)); |
| 2030 | 2033 | fossil_print("server-id: %s\n", db_get("server-code", 0)); |
| 2031 | 2034 | zPassword = db_text(0, "SELECT pw FROM user WHERE login=%Q", g.zLogin); |
| 2032 | 2035 |
+3
-3
| --- src/sha3.c | ||
| +++ src/sha3.c | ||
| @@ -378,18 +378,18 @@ | ||
| 378 | 378 | } |
| 379 | 379 | |
| 380 | 380 | /* |
| 381 | 381 | ** Initialize a new hash. iSize determines the size of the hash |
| 382 | 382 | ** in bits and should be one of 224, 256, 384, or 512. Or iSize |
| 383 | -** can be zero to use the default hash size of 224 bits. | |
| 383 | +** can be zero to use the default hash size of 256 bits. | |
| 384 | 384 | */ |
| 385 | 385 | static void SHA3Init(SHA3Context *p, int iSize){ |
| 386 | 386 | memset(p, 0, sizeof(*p)); |
| 387 | 387 | if( iSize>=128 && iSize<=512 ){ |
| 388 | 388 | p->nRate = (1600 - ((iSize + 31)&~31)*2)/8; |
| 389 | 389 | }else{ |
| 390 | - p->nRate = 144; | |
| 390 | + p->nRate = (1600 - 2*256)/8; | |
| 391 | 391 | } |
| 392 | 392 | #if SHA3_BYTEORDER==1234 |
| 393 | 393 | /* Known to be little-endian at compile-time. No-op */ |
| 394 | 394 | #elif SHA3_BYTEORDER==4321 |
| 395 | 395 | p->ixMask = 7; /* Big-endian */ |
| @@ -428,11 +428,11 @@ | ||
| 428 | 428 | } |
| 429 | 429 | } |
| 430 | 430 | } |
| 431 | 431 | #endif |
| 432 | 432 | for(; i<nData; i++){ |
| 433 | -#if SHA1_BYTEORDER==1234 | |
| 433 | +#if SHA3_BYTEORDER==1234 | |
| 434 | 434 | p->u.x[p->nLoaded] ^= aData[i]; |
| 435 | 435 | #elif SHA3_BYTEORDER==4321 |
| 436 | 436 | p->u.x[p->nLoaded^0x07] ^= aData[i]; |
| 437 | 437 | #else |
| 438 | 438 | p->u.x[p->nLoaded^p->ixMask] ^= aData[i]; |
| 439 | 439 |
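The change to the default rate in `SHA3Init` is easier to see with the arithmetic pulled out on its own. The sketch below reuses the expression from the diff in a hypothetical standalone function, `sha3_rate_bytes`; it is only a worked example of the formula (rate in bytes = (1600 - 2 * hash bits) / 8), not a hashing routine.

```c
#include <assert.h>

/* Hypothetical helper: the sponge rate in bytes for a hash of iSize bits,
** using the same expression as SHA3Init() in the diff. Sizes outside
** 128..512 fall back to the 256-bit default. */
int sha3_rate_bytes(int iSize){
  int r;
  if( iSize>=128 && iSize<=512 ){
    r = (1600 - ((iSize + 31)&~31)*2)/8;
  }else{
    r = (1600 - 2*256)/8;
  }
  assert( r%8==0 );  /* The rate is a whole number of 64-bit lanes */
  return r;
}
```

For the four standard sizes this gives 144 bytes (224 bits), 136 bytes (256 bits), 104 bytes (384 bits) and 72 bytes (512 bits). The old hard-coded default of 144 therefore corresponded to a 224-bit hash, which is why both the comment and the constant change together.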
+1
| --- src/shun.c | ||
| +++ src/shun.c | ||
| @@ -26,10 +26,11 @@ | ||
| 26 | 26 | */ |
| 27 | 27 | int uuid_is_shunned(const char *zUuid){ |
| 28 | 28 | static Stmt q; |
| 29 | 29 | int rc; |
| 30 | 30 | if( zUuid==0 || zUuid[0]==0 ) return 0; |
| 31 | + if( g.eHashPolicy==HPOLICY_SHUN_SHA1 && zUuid[HNAME_LEN_SHA1]==0 ) return 1; | |
| 31 | 32 | db_static_prepare(&q, "SELECT 1 FROM shun WHERE uuid=:uuid"); |
| 32 | 33 | db_bind_text(&q, ":uuid", zUuid); |
| 33 | 34 | rc = db_step(&q); |
| 34 | 35 | db_reset(&q); |
| 35 | 36 | return rc==SQLITE_ROW; |
| 36 | 37 |
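The new line in `uuid_is_shunned` treats any name that ends at offset `HNAME_LEN_SHA1` as a SHA1 name. A minimal sketch of that policy check is shown below, assuming `HNAME_LEN_SHA1` is 40 (consistent with the `length(uuid)==40` queries in hname.c) and using `strlen` so that the example does not read past the end of a short string. `shun_by_policy` is a hypothetical name, and the lookup in the `shun` table is left out.

```c
#include <assert.h>
#include <string.h>

#define HNAME_LEN_SHA1    40   /* Assumed: hex digits in a SHA1 name */
#define HPOLICY_SHUN_SHA1 4    /* Same value as in the hname.c diff */

/* Hypothetical helper: return 1 if zUuid should be shunned purely because
** of the hash policy, 0 otherwise. */
int shun_by_policy(int ePolicy, const char *zUuid){
  assert( ePolicy>=0 && ePolicy<=HPOLICY_SHUN_SHA1 );
  if( zUuid==0 || zUuid[0]==0 ) return 0;
  return ePolicy==HPOLICY_SHUN_SHA1 && strlen(zUuid)==HNAME_LEN_SHA1;
}
```

Under "shun-sha1" a 40-digit name is rejected while a 64-digit SHA3-256 name passes through to the ordinary shun-table lookup; under every other policy this check never fires.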
| --- src/shun.c | |
| +++ src/shun.c | |
| @@ -26,10 +26,11 @@ | |
| 26 | */ |
| 27 | int uuid_is_shunned(const char *zUuid){ |
| 28 | static Stmt q; |
| 29 | int rc; |
| 30 | if( zUuid==0 || zUuid[0]==0 ) return 0; |
| 31 | db_static_prepare(&q, "SELECT 1 FROM shun WHERE uuid=:uuid"); |
| 32 | db_bind_text(&q, ":uuid", zUuid); |
| 33 | rc = db_step(&q); |
| 34 | db_reset(&q); |
| 35 | return rc==SQLITE_ROW; |
| 36 |
| --- src/shun.c | |
| +++ src/shun.c | |
| @@ -26,10 +26,11 @@ | |
| 26 | */ |
| 27 | int uuid_is_shunned(const char *zUuid){ |
| 28 | static Stmt q; |
| 29 | int rc; |
| 30 | if( zUuid==0 || zUuid[0]==0 ) return 0; |
| 31 | if( g.eHashPolicy==HPOLICY_SHUN_SHA1 && zUuid[HNAME_LEN_SHA1]==0 ) return 1; |
| 32 | db_static_prepare(&q, "SELECT 1 FROM shun WHERE uuid=:uuid"); |
| 33 | db_bind_text(&q, ":uuid", zUuid); |
| 34 | rc = db_step(&q); |
| 35 | db_reset(&q); |
| 36 | return rc==SQLITE_ROW; |
| 37 |
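
The new early return in uuid_is_shunned() treats any name whose terminator sits at index HNAME_LEN_SHA1 as a SHA1 name. A minimal sketch of the same idea follows, assuming HNAME_LEN_SHA1 is 40 (the value is not shown in this diff) and using hypothetical names throughout. Unlike the one-line test in the diff, this version measures the string first, so it never reads index 40 of a shorter string; whether callers can ever pass a shorter string is not visible here.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_LEN_SHA1 40   /* assumed value of HNAME_LEN_SHA1 */

/* Hypothetical helper: returns 1 if a "shun SHA1" policy is active
** (fShunSha1 nonzero) and zUuid has exactly the length of a SHA1 name.
** Longer names, such as 64-character SHA3-256 names, are not affected. */
int sketch_policy_shuns(int fShunSha1, const char *zUuid){
  assert( zUuid!=0 );
  return fShunSha1 && strlen(zUuid)==SKETCH_LEN_SHA1;
}
```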
+3
| --- src/sqlcmd.c | ||
| +++ src/sqlcmd.c | ||
| @@ -212,10 +212,13 @@ | ||
| 212 | 212 | */ |
| 213 | 213 | void cmd_sqlite3(void){ |
| 214 | 214 | int noRepository; |
| 215 | 215 | const char *zConfigDb; |
| 216 | 216 | extern int sqlite3_shell(int, char**); |
| 217 | +#ifdef FOSSIL_ENABLE_TH1_HOOKS | |
| 218 | + g.fNoThHook = 1; | |
| 219 | +#endif | |
| 217 | 220 | noRepository = find_option("no-repository", 0, 0)!=0; |
| 218 | 221 | if( !noRepository ){ |
| 219 | 222 | db_find_and_open_repository(OPEN_ANY_SCHEMA, 0); |
| 220 | 223 | } |
| 221 | 224 | db_open_config(1,0); |
| 222 | 225 |
| --- src/sqlcmd.c | |
| +++ src/sqlcmd.c | |
| @@ -212,10 +212,13 @@ | |
| 212 | */ |
| 213 | void cmd_sqlite3(void){ |
| 214 | int noRepository; |
| 215 | const char *zConfigDb; |
| 216 | extern int sqlite3_shell(int, char**); |
| 217 | noRepository = find_option("no-repository", 0, 0)!=0; |
| 218 | if( !noRepository ){ |
| 219 | db_find_and_open_repository(OPEN_ANY_SCHEMA, 0); |
| 220 | } |
| 221 | db_open_config(1,0); |
| 222 |
| --- src/sqlcmd.c | |
| +++ src/sqlcmd.c | |
| @@ -212,10 +212,13 @@ | |
| 212 | */ |
| 213 | void cmd_sqlite3(void){ |
| 214 | int noRepository; |
| 215 | const char *zConfigDb; |
| 216 | extern int sqlite3_shell(int, char**); |
| 217 | #ifdef FOSSIL_ENABLE_TH1_HOOKS |
| 218 | g.fNoThHook = 1; |
| 219 | #endif |
| 220 | noRepository = find_option("no-repository", 0, 0)!=0; |
| 221 | if( !noRepository ){ |
| 222 | db_find_and_open_repository(OPEN_ANY_SCHEMA, 0); |
| 223 | } |
| 224 | db_open_config(1,0); |
| 225 |
+35
-32
| --- src/stash.c | ||
| +++ src/stash.c | ||
| @@ -332,52 +332,45 @@ | ||
| 332 | 332 | isBin2 = fIncludeBinary ? 0 : looks_like_binary(&a); |
| 333 | 333 | diff_file_mem(&empty, &a, isBin1, isBin2, zNew, zDiffCmd, |
| 334 | 334 | zBinGlob, fIncludeBinary, diffFlags); |
| 335 | 335 | }else if( isRemoved ){ |
| 336 | 336 | fossil_print("DELETE %s\n", zOrig); |
| 337 | - if( fBaseline==0 ){ | |
| 338 | - if( file_wd_islink(zOPath) ){ | |
| 339 | - blob_read_link(&a, zOPath); | |
| 340 | - }else{ | |
| 341 | - blob_read_from_file(&a, zOPath); | |
| 342 | - } | |
| 343 | - }else{ | |
| 344 | - content_get(rid, &a); | |
| 345 | - } | |
| 346 | - diff_print_index(zNew, diffFlags); | |
| 347 | - isBin1 = fIncludeBinary ? 0 : looks_like_binary(&a); | |
| 348 | - isBin2 = 0; | |
| 349 | - diff_file_mem(&a, &empty, isBin1, isBin2, zOrig, zDiffCmd, | |
| 350 | - zBinGlob, fIncludeBinary, diffFlags); | |
| 351 | - }else{ | |
| 352 | - Blob delta, disk; | |
| 337 | + diff_print_index(zNew, diffFlags); | |
| 338 | + isBin2 = 0; | |
| 339 | + if( fBaseline ){ | |
| 340 | + content_get(rid, &a); | |
| 341 | + isBin1 = fIncludeBinary ? 0 : looks_like_binary(&a); | |
| 342 | + diff_file_mem(&a, &empty, isBin1, isBin2, zOrig, zDiffCmd, | |
| 343 | + zBinGlob, fIncludeBinary, diffFlags); | |
| 344 | + }else{ | |
| 345 | + } | |
| 346 | + }else{ | |
| 347 | + Blob delta; | |
| 353 | 348 | int isOrigLink = file_wd_islink(zOPath); |
| 354 | 349 | db_ephemeral_blob(&q, 6, &delta); |
| 355 | - if( fBaseline==0 ){ | |
| 356 | - if( isOrigLink ){ | |
| 357 | - blob_read_link(&disk, zOPath); | |
| 358 | - }else{ | |
| 359 | - blob_read_from_file(&disk, zOPath); | |
| 360 | - } | |
| 361 | - } | |
| 362 | 350 | fossil_print("CHANGED %s\n", zNew); |
| 363 | 351 | if( !isOrigLink != !isLink ){ |
| 364 | 352 | diff_print_index(zNew, diffFlags); |
| 365 | 353 | diff_print_filenames(zOrig, zNew, diffFlags); |
| 366 | 354 | printf(DIFF_CANNOT_COMPUTE_SYMLINK); |
| 367 | 355 | }else{ |
| 368 | - Blob *pBase = fBaseline ? &a : &disk; | |
| 369 | 356 | content_get(rid, &a); |
| 370 | 357 | blob_delta_apply(&a, &delta, &b); |
| 371 | - isBin1 = fIncludeBinary ? 0 : looks_like_binary(pBase); | |
| 358 | + isBin1 = fIncludeBinary ? 0 : looks_like_binary(&a); | |
| 372 | 359 | isBin2 = fIncludeBinary ? 0 : looks_like_binary(&b); |
| 373 | - diff_file_mem(fBaseline? &a : &disk, &b, isBin1, isBin2, zNew, | |
| 374 | - zDiffCmd, zBinGlob, fIncludeBinary, diffFlags); | |
| 360 | + if( fBaseline ){ | |
| 361 | + diff_file_mem(&a, &b, isBin1, isBin2, zNew, | |
| 362 | + zDiffCmd, zBinGlob, fIncludeBinary, diffFlags); | |
| 363 | + }else{ | |
| 364 | + /*Diff with file on disk using fSwapDiff=1 to show the diff in the | |
| 365 | + same direction as if fBaseline=1.*/ | |
| 366 | + diff_file(&b, isBin2, zOPath, zNew, zDiffCmd, | |
| 367 | + zBinGlob, fIncludeBinary, diffFlags, 1); | |
| 368 | + } | |
| 375 | 369 | blob_reset(&a); |
| 376 | 370 | blob_reset(&b); |
| 377 | 371 | } |
| 378 | - if( !fBaseline ) blob_reset(&disk); | |
| 379 | 372 | blob_reset(&delta); |
| 380 | 373 | } |
| 381 | 374 | } |
| 382 | 375 | db_finalize(&q); |
| 383 | 376 | } |
| @@ -433,12 +426,15 @@ | ||
| 433 | 426 | ** |
| 434 | 427 | ** List all changes sets currently stashed. Show information about |
| 435 | 428 | ** individual files in each changeset if -v or --verbose is used. |
| 436 | 429 | ** |
| 437 | 430 | ** fossil stash show|cat ?STASHID? ?DIFF-OPTIONS? |
| 431 | +** fossil stash gshow|gcat ?STASHID? ?DIFF-OPTIONS? | |
| 438 | 432 | ** |
| 439 | -** Show the contents of a stash. | |
| 433 | +** Show the contents of a stash as a diff against it's baseline. | |
| 434 | +** With gshow and gcat, gdiff-command is used instead of internal | |
| 435 | +** diff logic. | |
| 440 | 436 | ** |
| 441 | 437 | ** fossil stash pop |
| 442 | 438 | ** fossil stash apply ?STASHID? |
| 443 | 439 | ** |
| 444 | 440 | ** Apply STASHID or the most recently create stash to the current |
| @@ -460,18 +456,20 @@ | ||
| 460 | 456 | ** |
| 461 | 457 | ** fossil stash diff ?STASHID? ?DIFF-OPTIONS? |
| 462 | 458 | ** fossil stash gdiff ?STASHID? ?DIFF-OPTIONS? |
| 463 | 459 | ** |
| 464 | 460 | ** Show diffs of the current working directory and what that |
| 465 | -** directory would be if STASHID were applied. | |
| 461 | +** directory would be if STASHID were applied. With gdiff, | |
| 462 | +** gdiff-command is used instead of internal diff logic. | |
| 466 | 463 | ** |
| 467 | 464 | ** SUMMARY: |
| 468 | 465 | ** fossil stash |
| 469 | 466 | ** fossil stash save ?-m|--comment COMMENT? ?FILES...? |
| 470 | 467 | ** fossil stash snapshot ?-m|--comment COMMENT? ?FILES...? |
| 471 | 468 | ** fossil stash list|ls ?-v|--verbose? ?-W|--width <num>? |
| 472 | 469 | ** fossil stash show|cat ?STASHID? ?DIFF-OPTIONS? |
| 470 | +** fossil stash gshow|gcat ?STASHID? ?DIFF-OPTIONS? | |
| 473 | 471 | ** fossil stash pop |
| 474 | 472 | ** fossil stash apply|goto ?STASHID? |
| 475 | 473 | ** fossil stash drop|rm ?STASHID? ?-a|--all? |
| 476 | 474 | ** fossil stash diff ?STASHID? ?DIFF-OPTIONS? |
| 477 | 475 | ** fossil stash gdiff ?STASHID? ?DIFF-OPTIONS? |
| @@ -654,25 +652,30 @@ | ||
| 654 | 652 | undo_finish(); |
| 655 | 653 | }else |
| 656 | 654 | if( memcmp(zCmd, "diff", nCmd)==0 |
| 657 | 655 | || memcmp(zCmd, "gdiff", nCmd)==0 |
| 658 | 656 | || memcmp(zCmd, "show", nCmd)==0 |
| 657 | + || memcmp(zCmd, "gshow", nCmd)==0 | |
| 659 | 658 | || memcmp(zCmd, "cat", nCmd)==0 |
| 659 | + || memcmp(zCmd, "gcat", nCmd)==0 | |
| 660 | 660 | ){ |
| 661 | 661 | const char *zDiffCmd = 0; |
| 662 | 662 | const char *zBinGlob = 0; |
| 663 | 663 | int fIncludeBinary = 0; |
| 664 | - int fBaseline = zCmd[0]=='s' || zCmd[0]=='c'; | |
| 664 | + int fBaseline = 0; | |
| 665 | 665 | u64 diffFlags; |
| 666 | 666 | |
| 667 | + if( strstr(zCmd,"show")!=0 || strstr(zCmd,"cat")!=0 ){ | |
| 668 | + fBaseline = 1; | |
| 669 | + } | |
| 667 | 670 | if( find_option("tk",0,0)!=0 ){ |
| 668 | 671 | db_close(0); |
| 669 | 672 | diff_tk(fBaseline ? "stash show" : "stash diff", 3); |
| 670 | 673 | return; |
| 671 | 674 | } |
| 672 | 675 | if( find_option("internal","i",0)==0 ){ |
| 673 | - zDiffCmd = diff_command_external(memcmp(zCmd, "gdiff", nCmd)==0); | |
| 676 | + zDiffCmd = diff_command_external(zCmd[0]=='g'); | |
| 674 | 677 | } |
| 675 | 678 | diffFlags = diff_options(); |
| 676 | 679 | if( find_option("verbose","v",0)!=0 ) diffFlags |= DIFF_VERBOSE; |
| 677 | 680 | if( g.argc>4 ) usage(mprintf("%s ?STASHID? ?DIFF-OPTIONS?", zCmd)); |
| 678 | 681 | if( zDiffCmd ){ |
| 679 | 682 |
| --- src/stash.c | |
| +++ src/stash.c | |
| @@ -332,52 +332,45 @@ | |
| 332 | isBin2 = fIncludeBinary ? 0 : looks_like_binary(&a); |
| 333 | diff_file_mem(&empty, &a, isBin1, isBin2, zNew, zDiffCmd, |
| 334 | zBinGlob, fIncludeBinary, diffFlags); |
| 335 | }else if( isRemoved ){ |
| 336 | fossil_print("DELETE %s\n", zOrig); |
| 337 | if( fBaseline==0 ){ |
| 338 | if( file_wd_islink(zOPath) ){ |
| 339 | blob_read_link(&a, zOPath); |
| 340 | }else{ |
| 341 | blob_read_from_file(&a, zOPath); |
| 342 | } |
| 343 | }else{ |
| 344 | content_get(rid, &a); |
| 345 | } |
| 346 | diff_print_index(zNew, diffFlags); |
| 347 | isBin1 = fIncludeBinary ? 0 : looks_like_binary(&a); |
| 348 | isBin2 = 0; |
| 349 | diff_file_mem(&a, &empty, isBin1, isBin2, zOrig, zDiffCmd, |
| 350 | zBinGlob, fIncludeBinary, diffFlags); |
| 351 | }else{ |
| 352 | Blob delta, disk; |
| 353 | int isOrigLink = file_wd_islink(zOPath); |
| 354 | db_ephemeral_blob(&q, 6, &delta); |
| 355 | if( fBaseline==0 ){ |
| 356 | if( isOrigLink ){ |
| 357 | blob_read_link(&disk, zOPath); |
| 358 | }else{ |
| 359 | blob_read_from_file(&disk, zOPath); |
| 360 | } |
| 361 | } |
| 362 | fossil_print("CHANGED %s\n", zNew); |
| 363 | if( !isOrigLink != !isLink ){ |
| 364 | diff_print_index(zNew, diffFlags); |
| 365 | diff_print_filenames(zOrig, zNew, diffFlags); |
| 366 | printf(DIFF_CANNOT_COMPUTE_SYMLINK); |
| 367 | }else{ |
| 368 | Blob *pBase = fBaseline ? &a : &disk; |
| 369 | content_get(rid, &a); |
| 370 | blob_delta_apply(&a, &delta, &b); |
| 371 | isBin1 = fIncludeBinary ? 0 : looks_like_binary(pBase); |
| 372 | isBin2 = fIncludeBinary ? 0 : looks_like_binary(&b); |
| 373 | diff_file_mem(fBaseline? &a : &disk, &b, isBin1, isBin2, zNew, |
| 374 | zDiffCmd, zBinGlob, fIncludeBinary, diffFlags); |
| 375 | blob_reset(&a); |
| 376 | blob_reset(&b); |
| 377 | } |
| 378 | if( !fBaseline ) blob_reset(&disk); |
| 379 | blob_reset(&delta); |
| 380 | } |
| 381 | } |
| 382 | db_finalize(&q); |
| 383 | } |
| @@ -433,12 +426,15 @@ | |
| 433 | ** |
| 434 | ** List all changes sets currently stashed. Show information about |
| 435 | ** individual files in each changeset if -v or --verbose is used. |
| 436 | ** |
| 437 | ** fossil stash show|cat ?STASHID? ?DIFF-OPTIONS? |
| 438 | ** |
| 439 | ** Show the contents of a stash. |
| 440 | ** |
| 441 | ** fossil stash pop |
| 442 | ** fossil stash apply ?STASHID? |
| 443 | ** |
| 444 | ** Apply STASHID or the most recently create stash to the current |
| @@ -460,18 +456,20 @@ | |
| 460 | ** |
| 461 | ** fossil stash diff ?STASHID? ?DIFF-OPTIONS? |
| 462 | ** fossil stash gdiff ?STASHID? ?DIFF-OPTIONS? |
| 463 | ** |
| 464 | ** Show diffs of the current working directory and what that |
| 465 | ** directory would be if STASHID were applied. |
| 466 | ** |
| 467 | ** SUMMARY: |
| 468 | ** fossil stash |
| 469 | ** fossil stash save ?-m|--comment COMMENT? ?FILES...? |
| 470 | ** fossil stash snapshot ?-m|--comment COMMENT? ?FILES...? |
| 471 | ** fossil stash list|ls ?-v|--verbose? ?-W|--width <num>? |
| 472 | ** fossil stash show|cat ?STASHID? ?DIFF-OPTIONS? |
| 473 | ** fossil stash pop |
| 474 | ** fossil stash apply|goto ?STASHID? |
| 475 | ** fossil stash drop|rm ?STASHID? ?-a|--all? |
| 476 | ** fossil stash diff ?STASHID? ?DIFF-OPTIONS? |
| 477 | ** fossil stash gdiff ?STASHID? ?DIFF-OPTIONS? |
| @@ -654,25 +652,30 @@ | |
| 654 | undo_finish(); |
| 655 | }else |
| 656 | if( memcmp(zCmd, "diff", nCmd)==0 |
| 657 | || memcmp(zCmd, "gdiff", nCmd)==0 |
| 658 | || memcmp(zCmd, "show", nCmd)==0 |
| 659 | || memcmp(zCmd, "cat", nCmd)==0 |
| 660 | ){ |
| 661 | const char *zDiffCmd = 0; |
| 662 | const char *zBinGlob = 0; |
| 663 | int fIncludeBinary = 0; |
| 664 | int fBaseline = zCmd[0]=='s' || zCmd[0]=='c'; |
| 665 | u64 diffFlags; |
| 666 | |
| 667 | if( find_option("tk",0,0)!=0 ){ |
| 668 | db_close(0); |
| 669 | diff_tk(fBaseline ? "stash show" : "stash diff", 3); |
| 670 | return; |
| 671 | } |
| 672 | if( find_option("internal","i",0)==0 ){ |
| 673 | zDiffCmd = diff_command_external(memcmp(zCmd, "gdiff", nCmd)==0); |
| 674 | } |
| 675 | diffFlags = diff_options(); |
| 676 | if( find_option("verbose","v",0)!=0 ) diffFlags |= DIFF_VERBOSE; |
| 677 | if( g.argc>4 ) usage(mprintf("%s ?STASHID? ?DIFF-OPTIONS?", zCmd)); |
| 678 | if( zDiffCmd ){ |
| 679 |
| --- src/stash.c | |
| +++ src/stash.c | |
| @@ -332,52 +332,45 @@ | |
| 332 | isBin2 = fIncludeBinary ? 0 : looks_like_binary(&a); |
| 333 | diff_file_mem(&empty, &a, isBin1, isBin2, zNew, zDiffCmd, |
| 334 | zBinGlob, fIncludeBinary, diffFlags); |
| 335 | }else if( isRemoved ){ |
| 336 | fossil_print("DELETE %s\n", zOrig); |
| 337 | diff_print_index(zNew, diffFlags); |
| 338 | isBin2 = 0; |
| 339 | if( fBaseline ){ |
| 340 | content_get(rid, &a); |
| 341 | isBin1 = fIncludeBinary ? 0 : looks_like_binary(&a); |
| 342 | diff_file_mem(&a, &empty, isBin1, isBin2, zOrig, zDiffCmd, |
| 343 | zBinGlob, fIncludeBinary, diffFlags); |
| 344 | }else{ |
| 345 | } |
| 346 | }else{ |
| 347 | Blob delta; |
| 348 | int isOrigLink = file_wd_islink(zOPath); |
| 349 | db_ephemeral_blob(&q, 6, &delta); |
| 350 | fossil_print("CHANGED %s\n", zNew); |
| 351 | if( !isOrigLink != !isLink ){ |
| 352 | diff_print_index(zNew, diffFlags); |
| 353 | diff_print_filenames(zOrig, zNew, diffFlags); |
| 354 | printf(DIFF_CANNOT_COMPUTE_SYMLINK); |
| 355 | }else{ |
| 356 | content_get(rid, &a); |
| 357 | blob_delta_apply(&a, &delta, &b); |
| 358 | isBin1 = fIncludeBinary ? 0 : looks_like_binary(&a); |
| 359 | isBin2 = fIncludeBinary ? 0 : looks_like_binary(&b); |
| 360 | if( fBaseline ){ |
| 361 | diff_file_mem(&a, &b, isBin1, isBin2, zNew, |
| 362 | zDiffCmd, zBinGlob, fIncludeBinary, diffFlags); |
| 363 | }else{ |
| 364 | /*Diff with file on disk using fSwapDiff=1 to show the diff in the |
| 365 | same direction as if fBaseline=1.*/ |
| 366 | diff_file(&b, isBin2, zOPath, zNew, zDiffCmd, |
| 367 | zBinGlob, fIncludeBinary, diffFlags, 1); |
| 368 | } |
| 369 | blob_reset(&a); |
| 370 | blob_reset(&b); |
| 371 | } |
| 372 | blob_reset(&delta); |
| 373 | } |
| 374 | } |
| 375 | db_finalize(&q); |
| 376 | } |
| @@ -433,12 +426,15 @@ | |
| 426 | ** |
| 427 | ** List all changes sets currently stashed. Show information about |
| 428 | ** individual files in each changeset if -v or --verbose is used. |
| 429 | ** |
| 430 | ** fossil stash show|cat ?STASHID? ?DIFF-OPTIONS? |
| 431 | ** fossil stash gshow|gcat ?STASHID? ?DIFF-OPTIONS? |
| 432 | ** |
| 433 | ** Show the contents of a stash as a diff against it's baseline. |
| 434 | ** With gshow and gcat, gdiff-command is used instead of internal |
| 435 | ** diff logic. |
| 436 | ** |
| 437 | ** fossil stash pop |
| 438 | ** fossil stash apply ?STASHID? |
| 439 | ** |
| 440 | ** Apply STASHID or the most recently create stash to the current |
| @@ -460,18 +456,20 @@ | |
| 456 | ** |
| 457 | ** fossil stash diff ?STASHID? ?DIFF-OPTIONS? |
| 458 | ** fossil stash gdiff ?STASHID? ?DIFF-OPTIONS? |
| 459 | ** |
| 460 | ** Show diffs of the current working directory and what that |
| 461 | ** directory would be if STASHID were applied. With gdiff, |
| 462 | ** gdiff-command is used instead of internal diff logic. |
| 463 | ** |
| 464 | ** SUMMARY: |
| 465 | ** fossil stash |
| 466 | ** fossil stash save ?-m|--comment COMMENT? ?FILES...? |
| 467 | ** fossil stash snapshot ?-m|--comment COMMENT? ?FILES...? |
| 468 | ** fossil stash list|ls ?-v|--verbose? ?-W|--width <num>? |
| 469 | ** fossil stash show|cat ?STASHID? ?DIFF-OPTIONS? |
| 470 | ** fossil stash gshow|gcat ?STASHID? ?DIFF-OPTIONS? |
| 471 | ** fossil stash pop |
| 472 | ** fossil stash apply|goto ?STASHID? |
| 473 | ** fossil stash drop|rm ?STASHID? ?-a|--all? |
| 474 | ** fossil stash diff ?STASHID? ?DIFF-OPTIONS? |
| 475 | ** fossil stash gdiff ?STASHID? ?DIFF-OPTIONS? |
| @@ -654,25 +652,30 @@ | |
| 652 | undo_finish(); |
| 653 | }else |
| 654 | if( memcmp(zCmd, "diff", nCmd)==0 |
| 655 | || memcmp(zCmd, "gdiff", nCmd)==0 |
| 656 | || memcmp(zCmd, "show", nCmd)==0 |
| 657 | || memcmp(zCmd, "gshow", nCmd)==0 |
| 658 | || memcmp(zCmd, "cat", nCmd)==0 |
| 659 | || memcmp(zCmd, "gcat", nCmd)==0 |
| 660 | ){ |
| 661 | const char *zDiffCmd = 0; |
| 662 | const char *zBinGlob = 0; |
| 663 | int fIncludeBinary = 0; |
| 664 | int fBaseline = 0; |
| 665 | u64 diffFlags; |
| 666 | |
| 667 | if( strstr(zCmd,"show")!=0 || strstr(zCmd,"cat")!=0 ){ |
| 668 | fBaseline = 1; |
| 669 | } |
| 670 | if( find_option("tk",0,0)!=0 ){ |
| 671 | db_close(0); |
| 672 | diff_tk(fBaseline ? "stash show" : "stash diff", 3); |
| 673 | return; |
| 674 | } |
| 675 | if( find_option("internal","i",0)==0 ){ |
| 676 | zDiffCmd = diff_command_external(zCmd[0]=='g'); |
| 677 | } |
| 678 | diffFlags = diff_options(); |
| 679 | if( find_option("verbose","v",0)!=0 ) diffFlags |= DIFF_VERBOSE; |
| 680 | if( g.argc>4 ) usage(mprintf("%s ?STASHID? ?DIFF-OPTIONS?", zCmd)); |
| 681 | if( zDiffCmd ){ |
| 682 |
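
The stash.c change selects the branch with a prefix test (memcmp over nCmd bytes) but then derives fBaseline with strstr() on the word as typed. The sketch below uses hypothetical helper names and assumes nCmd is strlen(zCmd), which this excerpt does not show. It illustrates that the two tests can disagree for an abbreviated subcommand such as "sh", which is a prefix of "show" but does not contain the substring "show". Whether abbreviations are meant to be supported here is not stated in the diff, so treat the prefix-based alternative as one possible option rather than a fix.

```c
#include <assert.h>
#include <string.h>

/* Prefix test in the style of memcmp(zCmd, "show", nCmd)==0, with an
** added length check so that no bytes beyond zFull are compared. */
int sketch_is_prefix(const char *zCmd, const char *zFull){
  size_t n;
  assert( zCmd!=0 && zFull!=0 );
  n = strlen(zCmd);
  return n>0 && n<=strlen(zFull) && memcmp(zCmd, zFull, n)==0;
}

/* Mirrors the new fBaseline test: substring search on the typed word. */
int sketch_baseline_by_strstr(const char *zCmd){
  return strstr(zCmd,"show")!=0 || strstr(zCmd,"cat")!=0;
}

/* Alternative sketch: skip an optional leading 'g', then prefix-match
** against "show" and "cat" so abbreviations are classified the same way
** as the full words. */
int sketch_baseline_by_prefix(const char *zCmd){
  const char *z = zCmd[0]=='g' ? zCmd+1 : zCmd;
  return sketch_is_prefix(z,"show") || sketch_is_prefix(z,"cat");
}
```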
+6
-1
| --- src/stat.c | ||
| +++ src/stat.c | ||
| @@ -183,11 +183,16 @@ | ||
| 183 | 183 | @ (%h(RELEASE_VERSION)) <a href='version?verbose=1'>(details)</a> |
| 184 | 184 | @ </td></tr> |
| 185 | 185 | @ <tr><th>SQLite Version:</th><td>%.19s(sqlite3_sourceid()) |
| 186 | 186 | @ [%.10s(&sqlite3_sourceid()[20])] (%s(sqlite3_libversion())) |
| 187 | 187 | @ <a href='version?verbose=2'>(details)</a></td></tr> |
| 188 | - @ <tr><th>Schema Version:</th><td>%h(g.zAuxSchema)</td></tr> | |
| 188 | + if( g.eHashPolicy!=HPOLICY_AUTO ){ | |
| 189 | + @ <tr><th>Schema Version:</th><td>%h(g.zAuxSchema), | |
| 190 | + @ %s(hpolicy_name())</td></tr> | |
| 191 | + }else{ | |
| 192 | + @ <tr><th>Schema Version:</th><td>%h(g.zAuxSchema)</td></tr> | |
| 193 | + } | |
| 189 | 194 | @ <tr><th>Repository Rebuilt:</th><td> |
| 190 | 195 | @ %h(db_get_mtime("rebuilt","%Y-%m-%d %H:%M:%S","Never")) |
| 191 | 196 | @ By Fossil %h(db_get("rebuilt","Unknown"))</td></tr> |
| 192 | 197 | @ <tr><th>Database Stats:</th><td> |
| 193 | 198 | @ %d(db_int(0, "PRAGMA repository.page_count")) pages, |
| 194 | 199 |
| --- src/stat.c | |
| +++ src/stat.c | |
| @@ -183,11 +183,16 @@ | |
| 183 | @ (%h(RELEASE_VERSION)) <a href='version?verbose=1'>(details)</a> |
| 184 | @ </td></tr> |
| 185 | @ <tr><th>SQLite Version:</th><td>%.19s(sqlite3_sourceid()) |
| 186 | @ [%.10s(&sqlite3_sourceid()[20])] (%s(sqlite3_libversion())) |
| 187 | @ <a href='version?verbose=2'>(details)</a></td></tr> |
| 188 | @ <tr><th>Schema Version:</th><td>%h(g.zAuxSchema)</td></tr> |
| 189 | @ <tr><th>Repository Rebuilt:</th><td> |
| 190 | @ %h(db_get_mtime("rebuilt","%Y-%m-%d %H:%M:%S","Never")) |
| 191 | @ By Fossil %h(db_get("rebuilt","Unknown"))</td></tr> |
| 192 | @ <tr><th>Database Stats:</th><td> |
| 193 | @ %d(db_int(0, "PRAGMA repository.page_count")) pages, |
| 194 |
| --- src/stat.c | |
| +++ src/stat.c | |
| @@ -183,11 +183,16 @@ | |
| 183 | @ (%h(RELEASE_VERSION)) <a href='version?verbose=1'>(details)</a> |
| 184 | @ </td></tr> |
| 185 | @ <tr><th>SQLite Version:</th><td>%.19s(sqlite3_sourceid()) |
| 186 | @ [%.10s(&sqlite3_sourceid()[20])] (%s(sqlite3_libversion())) |
| 187 | @ <a href='version?verbose=2'>(details)</a></td></tr> |
| 188 | if( g.eHashPolicy!=HPOLICY_AUTO ){ |
| 189 | @ <tr><th>Schema Version:</th><td>%h(g.zAuxSchema), |
| 190 | @ %s(hpolicy_name())</td></tr> |
| 191 | }else{ |
| 192 | @ <tr><th>Schema Version:</th><td>%h(g.zAuxSchema)</td></tr> |
| 193 | } |
| 194 | @ <tr><th>Repository Rebuilt:</th><td> |
| 195 | @ %h(db_get_mtime("rebuilt","%Y-%m-%d %H:%M:%S","Never")) |
| 196 | @ By Fossil %h(db_get("rebuilt","Unknown"))</td></tr> |
| 197 | @ <tr><th>Database Stats:</th><td> |
| 198 | @ %d(db_int(0, "PRAGMA repository.page_count")) pages, |
| 199 |
+58
-1
| --- src/unversioned.c | ||
| +++ src/unversioned.c | ||
| @@ -456,11 +456,11 @@ | ||
| 456 | 456 | ** Query parameters: |
| 457 | 457 | ** |
| 458 | 458 | ** byage=1 Order the initial display be decreasing age |
| 459 | 459 | ** showdel=0 Show deleted files |
| 460 | 460 | */ |
| 461 | -void uvstat_page(void){ | |
| 461 | +void uvlist_page(void){ | |
| 462 | 462 | Stmt q; |
| 463 | 463 | sqlite3_int64 iNow; |
| 464 | 464 | sqlite3_int64 iTotalSz = 0; |
| 465 | 465 | int cnt = 0; |
| 466 | 466 | int n = 0; |
| @@ -554,5 +554,62 @@ | ||
| 554 | 554 | }else{ |
| 555 | 555 | @ No unversioned files on this server. |
| 556 | 556 | } |
| 557 | 557 | style_footer(); |
| 558 | 558 | } |
| 559 | + | |
| 560 | +/* | |
| 561 | +** WEBPAGE: juvlist | |
| 562 | +** | |
| 563 | +** Return a complete list of unversioned files as JSON. The JSON | |
| 564 | +** looks like this: | |
| 565 | +** | |
| 566 | +** [{"name":NAME, | |
| 567 | +** "mtime":MTIME, | |
| 568 | +** "hash":HASH, | |
| 569 | +** "size":SIZE, | |
| 570 | +** "user":USER}] | |
| 571 | +*/ | |
| 572 | +void uvlist_json_page(void){ | |
| 573 | + Stmt q; | |
| 574 | + char *zSep = "["; | |
| 575 | + Blob json; | |
| 576 | + | |
| 577 | + login_check_credentials(); | |
| 578 | + if( !g.perm.Read ){ login_needed(g.anon.Read); return; } | |
| 579 | + cgi_set_content_type("text/json"); | |
| 580 | + if( !db_table_exists("repository","unversioned") ){ | |
| 581 | + blob_init(&json, "[]", -1); | |
| 582 | + cgi_set_content(&json); | |
| 583 | + return; | |
| 584 | + } | |
| 585 | + blob_init(&json, 0, 0); | |
| 586 | + db_prepare(&q, | |
| 587 | + "SELECT" | |
| 588 | + " name," | |
| 589 | + " mtime," | |
| 590 | + " hash," | |
| 591 | + " sz," | |
| 592 | + " (SELECT login FROM rcvfrom, user" | |
| 593 | + " WHERE user.uid=rcvfrom.uid AND rcvfrom.rcvid=unversioned.rcvid)" | |
| 594 | + " FROM unversioned WHERE hash IS NOT NULL" | |
| 595 | + ); | |
| 596 | + while( db_step(&q)==SQLITE_ROW ){ | |
| 597 | + const char *zName = db_column_text(&q, 0); | |
| 598 | + sqlite3_int64 mtime = db_column_int(&q, 1); | |
| 599 | + const char *zHash = db_column_text(&q, 2); | |
| 600 | + int fullSize = db_column_int(&q, 3); | |
| 601 | + const char *zLogin = db_column_text(&q, 4); | |
| 602 | + if( zLogin==0 ) zLogin = ""; | |
| 603 | + blob_appendf(&json, "%s{\"name\":\"", zSep); | |
| 604 | + zSep = ",\n "; | |
| 605 | + blob_append_json_string(&json, zName); | |
| 606 | + blob_appendf(&json, "\",\n \"mtime\":%lld,\n \"hash\":\"", mtime); | |
| 607 | + blob_append_json_string(&json, zHash); | |
| 608 | + blob_appendf(&json, "\",\n \"size\":%d,\n \"user\":\"", fullSize); | |
| 609 | + blob_append_json_string(&json, zLogin); | |
| 610 | + blob_appendf(&json, "\"}"); | |
| 611 | + } | |
| 612 | + db_finalize(&q); | |
| 613 | + blob_appendf(&json,"]\n"); | |
| 614 | + cgi_set_content(&json); | |
| 615 | +} | |
| 559 | 616 |
| --- src/unversioned.c | |
| +++ src/unversioned.c | |
| @@ -456,11 +456,11 @@ | |
| 456 | ** Query parameters: |
| 457 | ** |
| 458 | ** byage=1 Order the initial display be decreasing age |
| 459 | ** showdel=0 Show deleted files |
| 460 | */ |
| 461 | void uvstat_page(void){ |
| 462 | Stmt q; |
| 463 | sqlite3_int64 iNow; |
| 464 | sqlite3_int64 iTotalSz = 0; |
| 465 | int cnt = 0; |
| 466 | int n = 0; |
| @@ -554,5 +554,62 @@ | |
| 554 | }else{ |
| 555 | @ No unversioned files on this server. |
| 556 | } |
| 557 | style_footer(); |
| 558 | } |
| 559 |
| --- src/unversioned.c | |
| +++ src/unversioned.c | |
| @@ -456,11 +456,11 @@ | |
| 456 | ** Query parameters: |
| 457 | ** |
| 458 | ** byage=1 Order the initial display be decreasing age |
| 459 | ** showdel=0 Show deleted files |
| 460 | */ |
| 461 | void uvlist_page(void){ |
| 462 | Stmt q; |
| 463 | sqlite3_int64 iNow; |
| 464 | sqlite3_int64 iTotalSz = 0; |
| 465 | int cnt = 0; |
| 466 | int n = 0; |
| @@ -554,5 +554,62 @@ | |
| 554 | }else{ |
| 555 | @ No unversioned files on this server. |
| 556 | } |
| 557 | style_footer(); |
| 558 | } |
| 559 | |
| 560 | /* |
| 561 | ** WEBPAGE: juvlist |
| 562 | ** |
| 563 | ** Return a complete list of unversioned files as JSON. The JSON |
| 564 | ** looks like this: |
| 565 | ** |
| 566 | ** [{"name":NAME, |
| 567 | ** "mtime":MTIME, |
| 568 | ** "hash":HASH, |
| 569 | ** "size":SIZE, |
| 570 | ** "user":USER}] |
| 571 | */ |
| 572 | void uvlist_json_page(void){ |
| 573 | Stmt q; |
| 574 | char *zSep = "["; |
| 575 | Blob json; |
| 576 | |
| 577 | login_check_credentials(); |
| 578 | if( !g.perm.Read ){ login_needed(g.anon.Read); return; } |
| 579 | cgi_set_content_type("text/json"); |
| 580 | if( !db_table_exists("repository","unversioned") ){ |
| 581 | blob_init(&json, "[]", -1); |
| 582 | cgi_set_content(&json); |
| 583 | return; |
| 584 | } |
| 585 | blob_init(&json, 0, 0); |
| 586 | db_prepare(&q, |
| 587 | "SELECT" |
| 588 | " name," |
| 589 | " mtime," |
| 590 | " hash," |
| 591 | " sz," |
| 592 | " (SELECT login FROM rcvfrom, user" |
| 593 | " WHERE user.uid=rcvfrom.uid AND rcvfrom.rcvid=unversioned.rcvid)" |
| 594 | " FROM unversioned WHERE hash IS NOT NULL" |
| 595 | ); |
| 596 | while( db_step(&q)==SQLITE_ROW ){ |
| 597 | const char *zName = db_column_text(&q, 0); |
| 598 | sqlite3_int64 mtime = db_column_int(&q, 1); |
| 599 | const char *zHash = db_column_text(&q, 2); |
| 600 | int fullSize = db_column_int(&q, 3); |
| 601 | const char *zLogin = db_column_text(&q, 4); |
| 602 | if( zLogin==0 ) zLogin = ""; |
| 603 | blob_appendf(&json, "%s{\"name\":\"", zSep); |
| 604 | zSep = ",\n "; |
| 605 | blob_append_json_string(&json, zName); |
| 606 | blob_appendf(&json, "\",\n \"mtime\":%lld,\n \"hash\":\"", mtime); |
| 607 | blob_append_json_string(&json, zHash); |
| 608 | blob_appendf(&json, "\",\n \"size\":%d,\n \"user\":\"", fullSize); |
| 609 | blob_append_json_string(&json, zLogin); |
| 610 | blob_appendf(&json, "\"}"); |
| 611 | } |
| 612 | db_finalize(&q); |
| 613 | blob_appendf(&json,"]\n"); |
| 614 | cgi_set_content(&json); |
| 615 | } |
| 616 |
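
The juvlist page builds its array with a separator variable: zSep starts as "[" and becomes ",\n " after the first row. As written, the opening bracket is only emitted together with a row, so a table with no rows having a non-NULL hash would appear to produce just "]". The sketch below shows the same output shape with the bracket emitted unconditionally. It is a standalone illustration and not Fossil code: SketchUv, sketch_append and sketch_uvlist_json are hypothetical names, the output is compact rather than pretty-printed, and the escaping covers only quotes and backslashes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record standing in for one row of the unversioned table. */
typedef struct SketchUv {
  const char *zName;   /* file name */
  long long mtime;     /* modification time */
  const char *zHash;   /* content hash */
  int size;            /* file size in bytes */
  const char *zUser;   /* login of the uploader, may be NULL */
} SketchUv;

/* Append z to the NUL-terminated buffer zOut of capacity nOut. If esc is
** nonzero, put a backslash before '"' and '\\'. Output that does not fit
** is silently truncated. A real encoder must also escape control chars. */
static void sketch_append(char *zOut, size_t nOut, const char *z, int esc){
  size_t i = strlen(zOut);
  for(; *z && i+2<nOut; z++){
    if( esc && (*z=='"' || *z=='\\') ) zOut[i++] = '\\';
    zOut[i++] = *z;
  }
  zOut[i] = 0;
}

/* Render n records as a JSON array into zOut. */
void sketch_uvlist_json(const SketchUv *a, int n, char *zOut, size_t nOut){
  char zNum[48];
  int i;
  assert( nOut>0 );
  zOut[0] = 0;
  sketch_append(zOut, nOut, "[", 0);     /* emitted even when n==0 */
  for(i=0; i<n; i++){
    sketch_append(zOut, nOut, i ? ",{\"name\":\"" : "{\"name\":\"", 0);
    sketch_append(zOut, nOut, a[i].zName, 1);
    snprintf(zNum, sizeof(zNum), "\",\"mtime\":%lld,\"hash\":\"", a[i].mtime);
    sketch_append(zOut, nOut, zNum, 0);
    sketch_append(zOut, nOut, a[i].zHash, 1);
    snprintf(zNum, sizeof(zNum), "\",\"size\":%d,\"user\":\"", a[i].size);
    sketch_append(zOut, nOut, zNum, 0);
    sketch_append(zOut, nOut, a[i].zUser ? a[i].zUser : "", 1);
    sketch_append(zOut, nOut, "\"}", 0);
  }
  sketch_append(zOut, nOut, "]", 0);
}
```

Calling sketch_uvlist_json() with n==0 yields "[]", which matches what the page already returns when the unversioned table does not exist.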
+1
-1
| --- src/wiki.c | ||
| +++ src/wiki.c | ||
| @@ -1122,11 +1122,11 @@ | ||
| 1122 | 1122 | */ |
| 1123 | 1123 | int wiki_technote_to_rid(const char *zETime) { |
| 1124 | 1124 | int rid=0; /* Artifact ID of the tech note */ |
| 1125 | 1125 | int nETime = strlen(zETime); |
| 1126 | 1126 | Stmt q; |
| 1127 | - if( nETime>=4 && hname_validate(zETime, nETime) ){ | |
| 1127 | + if( nETime>=4 && nETime<=HNAME_MAX && validate16(zETime, nETime) ){ | |
| 1128 | 1128 | char zUuid[HNAME_MAX+1]; |
| 1129 | 1129 | memcpy(zUuid, zETime, nETime+1); |
| 1130 | 1130 | canonical16(zUuid, nETime); |
| 1131 | 1131 | db_prepare(&q, |
| 1132 | 1132 | "SELECT e.objid" |
| 1133 | 1133 |
| --- src/wiki.c | |
| +++ src/wiki.c | |
| @@ -1122,11 +1122,11 @@ | |
| 1122 | */ |
| 1123 | int wiki_technote_to_rid(const char *zETime) { |
| 1124 | int rid=0; /* Artifact ID of the tech note */ |
| 1125 | int nETime = strlen(zETime); |
| 1126 | Stmt q; |
| 1127 | if( nETime>=4 && hname_validate(zETime, nETime) ){ |
| 1128 | char zUuid[HNAME_MAX+1]; |
| 1129 | memcpy(zUuid, zETime, nETime+1); |
| 1130 | canonical16(zUuid, nETime); |
| 1131 | db_prepare(&q, |
| 1132 | "SELECT e.objid" |
| 1133 |
| --- src/wiki.c | |
| +++ src/wiki.c | |
| @@ -1122,11 +1122,11 @@ | |
| 1122 | */ |
| 1123 | int wiki_technote_to_rid(const char *zETime) { |
| 1124 | int rid=0; /* Artifact ID of the tech note */ |
| 1125 | int nETime = strlen(zETime); |
| 1126 | Stmt q; |
| 1127 | if( nETime>=4 && nETime<=HNAME_MAX && validate16(zETime, nETime) ){ |
| 1128 | char zUuid[HNAME_MAX+1]; |
| 1129 | memcpy(zUuid, zETime, nETime+1); |
| 1130 | canonical16(zUuid, nETime); |
| 1131 | db_prepare(&q, |
| 1132 | "SELECT e.objid" |
| 1133 |
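
The wiki.c change replaces a full-hash validation with a hex-prefix check plus an explicit upper bound, which matters because the value is copied into a buffer of HNAME_MAX+1 bytes. A minimal sketch of that pattern follows, assuming HNAME_MAX is 64 (not shown in this diff). The names are hypothetical, sketch_is_hex only approximates what validate16() is presumed to do, and the lower-casing stands in for canonical16().

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define SKETCH_HNAME_MAX 64   /* assumed value of HNAME_MAX */

/* Hypothetical stand-in for validate16(): returns 1 if the first n
** bytes of z are all hexadecimal digits. */
int sketch_is_hex(const char *z, size_t n){
  size_t i;
  for(i=0; i<n; i++){
    if( !isxdigit((unsigned char)z[i]) ) return 0;
  }
  return 1;
}

/* Copy a hash prefix of 4 to SKETCH_HNAME_MAX hex digits into zOut,
** which must hold SKETCH_HNAME_MAX+1 bytes, converting to lower case.
** Returns 1 on success, or 0 (leaving zOut untouched) if the input is
** too short, too long, or not hexadecimal. */
int sketch_copy_hash_prefix(const char *zIn, char *zOut){
  size_t n, i;
  assert( zIn!=0 && zOut!=0 );
  n = strlen(zIn);
  /* The upper bound is checked before any byte is written to zOut. */
  if( n<4 || n>SKETCH_HNAME_MAX || !sketch_is_hex(zIn, n) ) return 0;
  for(i=0; i<n; i++) zOut[i] = (char)tolower((unsigned char)zIn[i]);
  zOut[n] = 0;
  return 1;
}
```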
+1
| --- src/xfer.c | ||
| +++ src/xfer.c | ||
| @@ -1768,10 +1768,11 @@ | ||
| 1768 | 1768 | memset(&xfer, 0, sizeof(xfer)); |
| 1769 | 1769 | xfer.pIn = &recv; |
| 1770 | 1770 | xfer.pOut = &send; |
| 1771 | 1771 | xfer.mxSend = db_get_int("max-upload", 250000); |
| 1772 | 1772 | xfer.maxTime = -1; |
| 1773 | + xfer.clientVersion = RELEASE_VERSION_NUMBER; | |
| 1773 | 1774 | if( syncFlags & SYNC_PRIVATE ){ |
| 1774 | 1775 | g.perm.Private = 1; |
| 1775 | 1776 | xfer.syncPrivate = 1; |
| 1776 | 1777 | } |
| 1777 | 1778 | |
| 1778 | 1779 |
| --- src/xfer.c | |
| +++ src/xfer.c | |
| @@ -1768,10 +1768,11 @@ | |
| 1768 | memset(&xfer, 0, sizeof(xfer)); |
| 1769 | xfer.pIn = &recv; |
| 1770 | xfer.pOut = &send; |
| 1771 | xfer.mxSend = db_get_int("max-upload", 250000); |
| 1772 | xfer.maxTime = -1; |
| 1773 | if( syncFlags & SYNC_PRIVATE ){ |
| 1774 | g.perm.Private = 1; |
| 1775 | xfer.syncPrivate = 1; |
| 1776 | } |
| 1777 | |
| 1778 |
| --- src/xfer.c | |
| +++ src/xfer.c | |
| @@ -1768,10 +1768,11 @@ | |
| 1768 | memset(&xfer, 0, sizeof(xfer)); |
| 1769 | xfer.pIn = &recv; |
| 1770 | xfer.pOut = &send; |
| 1771 | xfer.mxSend = db_get_int("max-upload", 250000); |
| 1772 | xfer.maxTime = -1; |
| 1773 | xfer.clientVersion = RELEASE_VERSION_NUMBER; |
| 1774 | if( syncFlags & SYNC_PRIVATE ){ |
| 1775 | g.perm.Private = 1; |
| 1776 | xfer.syncPrivate = 1; |
| 1777 | } |
| 1778 | |
| 1779 |
| --- win/Makefile.mingw.mistachkin | ||
| +++ win/Makefile.mingw.mistachkin | ||
| @@ -461,10 +461,11 @@ | ||
| 461 | 461 | $(SRCDIR)/fshell.c \ |
| 462 | 462 | $(SRCDIR)/fusefs.c \ |
| 463 | 463 | $(SRCDIR)/glob.c \ |
| 464 | 464 | $(SRCDIR)/graph.c \ |
| 465 | 465 | $(SRCDIR)/gzip.c \ |
| 466 | + $(SRCDIR)/hname.c \ | |
| 466 | 467 | $(SRCDIR)/http.c \ |
| 467 | 468 | $(SRCDIR)/http_socket.c \ |
| 468 | 469 | $(SRCDIR)/http_ssl.c \ |
| 469 | 470 | $(SRCDIR)/http_transport.c \ |
| 470 | 471 | $(SRCDIR)/import.c \ |
| @@ -511,10 +512,12 @@ | ||
| 511 | 512 | $(SRCDIR)/rss.c \ |
| 512 | 513 | $(SRCDIR)/schema.c \ |
| 513 | 514 | $(SRCDIR)/search.c \ |
| 514 | 515 | $(SRCDIR)/setup.c \ |
| 515 | 516 | $(SRCDIR)/sha1.c \ |
| 517 | + $(SRCDIR)/sha1hard.c \ | |
| 518 | + $(SRCDIR)/sha3.c \ | |
| 516 | 519 | $(SRCDIR)/shun.c \ |
| 517 | 520 | $(SRCDIR)/sitemap.c \ |
| 518 | 521 | $(SRCDIR)/skins.c \ |
| 519 | 522 | $(SRCDIR)/sqlcmd.c \ |
| 520 | 523 | $(SRCDIR)/stash.c \ |
| @@ -636,10 +639,11 @@ | ||
| 636 | 639 | $(OBJDIR)/fshell_.c \ |
| 637 | 640 | $(OBJDIR)/fusefs_.c \ |
| 638 | 641 | $(OBJDIR)/glob_.c \ |
| 639 | 642 | $(OBJDIR)/graph_.c \ |
| 640 | 643 | $(OBJDIR)/gzip_.c \ |
| 644 | + $(OBJDIR)/hname_.c \ | |
| 641 | 645 | $(OBJDIR)/http_.c \ |
| 642 | 646 | $(OBJDIR)/http_socket_.c \ |
| 643 | 647 | $(OBJDIR)/http_ssl_.c \ |
| 644 | 648 | $(OBJDIR)/http_transport_.c \ |
| 645 | 649 | $(OBJDIR)/import_.c \ |
| @@ -686,10 +690,12 @@ | ||
| 686 | 690 | $(OBJDIR)/rss_.c \ |
| 687 | 691 | $(OBJDIR)/schema_.c \ |
| 688 | 692 | $(OBJDIR)/search_.c \ |
| 689 | 693 | $(OBJDIR)/setup_.c \ |
| 690 | 694 | $(OBJDIR)/sha1_.c \ |
| 695 | + $(OBJDIR)/sha1hard_.c \ | |
| 696 | + $(OBJDIR)/sha3_.c \ | |
| 691 | 697 | $(OBJDIR)/shun_.c \ |
| 692 | 698 | $(OBJDIR)/sitemap_.c \ |
| 693 | 699 | $(OBJDIR)/skins_.c \ |
| 694 | 700 | $(OBJDIR)/sqlcmd_.c \ |
| 695 | 701 | $(OBJDIR)/stash_.c \ |
| @@ -760,10 +766,11 @@ | ||
| 760 | 766 | $(OBJDIR)/fshell.o \ |
| 761 | 767 | $(OBJDIR)/fusefs.o \ |
| 762 | 768 | $(OBJDIR)/glob.o \ |
| 763 | 769 | $(OBJDIR)/graph.o \ |
| 764 | 770 | $(OBJDIR)/gzip.o \ |
| 771 | + $(OBJDIR)/hname.o \ | |
| 765 | 772 | $(OBJDIR)/http.o \ |
| 766 | 773 | $(OBJDIR)/http_socket.o \ |
| 767 | 774 | $(OBJDIR)/http_ssl.o \ |
| 768 | 775 | $(OBJDIR)/http_transport.o \ |
| 769 | 776 | $(OBJDIR)/import.o \ |
| @@ -810,10 +817,12 @@ | ||
| 810 | 817 | $(OBJDIR)/rss.o \ |
| 811 | 818 | $(OBJDIR)/schema.o \ |
| 812 | 819 | $(OBJDIR)/search.o \ |
| 813 | 820 | $(OBJDIR)/setup.o \ |
| 814 | 821 | $(OBJDIR)/sha1.o \ |
| 822 | + $(OBJDIR)/sha1hard.o \ | |
| 823 | + $(OBJDIR)/sha3.o \ | |
| 815 | 824 | $(OBJDIR)/shun.o \ |
| 816 | 825 | $(OBJDIR)/sitemap.o \ |
| 817 | 826 | $(OBJDIR)/skins.o \ |
| 818 | 827 | $(OBJDIR)/sqlcmd.o \ |
| 819 | 828 | $(OBJDIR)/stash.o \ |
| @@ -1095,10 +1104,11 @@ | ||
| 1095 | 1104 | $(OBJDIR)/fshell_.c:$(OBJDIR)/fshell.h \ |
| 1096 | 1105 | $(OBJDIR)/fusefs_.c:$(OBJDIR)/fusefs.h \ |
| 1097 | 1106 | $(OBJDIR)/glob_.c:$(OBJDIR)/glob.h \ |
| 1098 | 1107 | $(OBJDIR)/graph_.c:$(OBJDIR)/graph.h \ |
| 1099 | 1108 | $(OBJDIR)/gzip_.c:$(OBJDIR)/gzip.h \ |
| 1109 | + $(OBJDIR)/hname_.c:$(OBJDIR)/hname.h \ | |
| 1100 | 1110 | $(OBJDIR)/http_.c:$(OBJDIR)/http.h \ |
| 1101 | 1111 | $(OBJDIR)/http_socket_.c:$(OBJDIR)/http_socket.h \ |
| 1102 | 1112 | $(OBJDIR)/http_ssl_.c:$(OBJDIR)/http_ssl.h \ |
| 1103 | 1113 | $(OBJDIR)/http_transport_.c:$(OBJDIR)/http_transport.h \ |
| 1104 | 1114 | $(OBJDIR)/import_.c:$(OBJDIR)/import.h \ |
| @@ -1145,10 +1155,12 @@ | ||
| 1145 | 1155 | $(OBJDIR)/rss_.c:$(OBJDIR)/rss.h \ |
| 1146 | 1156 | $(OBJDIR)/schema_.c:$(OBJDIR)/schema.h \ |
| 1147 | 1157 | $(OBJDIR)/search_.c:$(OBJDIR)/search.h \ |
| 1148 | 1158 | $(OBJDIR)/setup_.c:$(OBJDIR)/setup.h \ |
| 1149 | 1159 | $(OBJDIR)/sha1_.c:$(OBJDIR)/sha1.h \ |
| 1160 | + $(OBJDIR)/sha1hard_.c:$(OBJDIR)/sha1hard.h \ | |
| 1161 | + $(OBJDIR)/sha3_.c:$(OBJDIR)/sha3.h \ | |
| 1150 | 1162 | $(OBJDIR)/shun_.c:$(OBJDIR)/shun.h \ |
| 1151 | 1163 | $(OBJDIR)/sitemap_.c:$(OBJDIR)/sitemap.h \ |
| 1152 | 1164 | $(OBJDIR)/skins_.c:$(OBJDIR)/skins.h \ |
| 1153 | 1165 | $(OBJDIR)/sqlcmd_.c:$(OBJDIR)/sqlcmd.h \ |
| 1154 | 1166 | $(OBJDIR)/stash_.c:$(OBJDIR)/stash.h \ |
| @@ -1498,10 +1510,18 @@ | ||
| 1498 | 1510 | |
| 1499 | 1511 | $(OBJDIR)/gzip.o: $(OBJDIR)/gzip_.c $(OBJDIR)/gzip.h $(SRCDIR)/config.h |
| 1500 | 1512 | $(XTCC) -o $(OBJDIR)/gzip.o -c $(OBJDIR)/gzip_.c |
| 1501 | 1513 | |
| 1502 | 1514 | $(OBJDIR)/gzip.h: $(OBJDIR)/headers |
| 1515 | + | |
| 1516 | +$(OBJDIR)/hname_.c: $(SRCDIR)/hname.c $(TRANSLATE) | |
| 1517 | + $(TRANSLATE) $(SRCDIR)/hname.c >$@ | |
| 1518 | + | |
| 1519 | +$(OBJDIR)/hname.o: $(OBJDIR)/hname_.c $(OBJDIR)/hname.h $(SRCDIR)/config.h | |
| 1520 | + $(XTCC) -o $(OBJDIR)/hname.o -c $(OBJDIR)/hname_.c | |
| 1521 | + | |
| 1522 | +$(OBJDIR)/hname.h: $(OBJDIR)/headers | |
| 1503 | 1523 | |
| 1504 | 1524 | $(OBJDIR)/http_.c: $(SRCDIR)/http.c $(TRANSLATE) |
| 1505 | 1525 | $(TRANSLATE) $(SRCDIR)/http.c >$@ |
| 1506 | 1526 | |
| 1507 | 1527 | $(OBJDIR)/http.o: $(OBJDIR)/http_.c $(OBJDIR)/http.h $(SRCDIR)/config.h |
| @@ -1898,10 +1918,26 @@ | ||
| 1898 | 1918 | |
| 1899 | 1919 | $(OBJDIR)/sha1.o: $(OBJDIR)/sha1_.c $(OBJDIR)/sha1.h $(SRCDIR)/config.h |
| 1900 | 1920 | $(XTCC) -o $(OBJDIR)/sha1.o -c $(OBJDIR)/sha1_.c |
| 1901 | 1921 | |
| 1902 | 1922 | $(OBJDIR)/sha1.h: $(OBJDIR)/headers |
| 1923 | + | |
| 1924 | +$(OBJDIR)/sha1hard_.c: $(SRCDIR)/sha1hard.c $(TRANSLATE) | |
| 1925 | + $(TRANSLATE) $(SRCDIR)/sha1hard.c >$@ | |
| 1926 | + | |
| 1927 | +$(OBJDIR)/sha1hard.o: $(OBJDIR)/sha1hard_.c $(OBJDIR)/sha1hard.h $(SRCDIR)/config.h | |
| 1928 | + $(XTCC) -o $(OBJDIR)/sha1hard.o -c $(OBJDIR)/sha1hard_.c | |
| 1929 | + | |
| 1930 | +$(OBJDIR)/sha1hard.h: $(OBJDIR)/headers | |
| 1931 | + | |
| 1932 | +$(OBJDIR)/sha3_.c: $(SRCDIR)/sha3.c $(TRANSLATE) | |
| 1933 | + $(TRANSLATE) $(SRCDIR)/sha3.c >$@ | |
| 1934 | + | |
| 1935 | +$(OBJDIR)/sha3.o: $(OBJDIR)/sha3_.c $(OBJDIR)/sha3.h $(SRCDIR)/config.h | |
| 1936 | + $(XTCC) -o $(OBJDIR)/sha3.o -c $(OBJDIR)/sha3_.c | |
| 1937 | + | |
| 1938 | +$(OBJDIR)/sha3.h: $(OBJDIR)/headers | |
| 1903 | 1939 | |
| 1904 | 1940 | $(OBJDIR)/shun_.c: $(SRCDIR)/shun.c $(TRANSLATE) |
| 1905 | 1941 | $(TRANSLATE) $(SRCDIR)/shun.c >$@ |
| 1906 | 1942 | |
| 1907 | 1943 | $(OBJDIR)/shun.o: $(OBJDIR)/shun_.c $(OBJDIR)/shun.h $(SRCDIR)/config.h |
| 1908 | 1944 |
+11
| --- www/changes.wiki | ||
| +++ www/changes.wiki | ||
| @@ -1,6 +1,17 @@ | ||
| 1 | 1 | <title>Change Log</title> |
| 2 | + | |
| 3 | +<a name='v2_1'></a> | |
| 4 | +<h2>Changes for Version 2.1 (2017-03-??)</h2> | |
| 5 | + | |
| 6 | + * Add support for [./hashpolicy.wiki|hash policies] that control which | |
| 7 | + of the Hardened-SHA1 or SHA3-256 algorithms is used to name new | |
| 8 | + artifacts. | |
| 9 | + * Add the "gshow" and "gcat" subcommands to [/help?cmd=stash|fossil stash]. | |
| 10 | + * Add the [/help?cmd=/juvlist|/juvlist] web page and use it to construct | |
| 11 | + the [/uv/download.html|Download Page] of the Fossil self-hosting website | |
| 12 | + using Ajax. | |
| 2 | 13 | |
| 3 | 14 | <a name='v2_0'></a> |
| 4 | 15 | <h2>Changes for Version 2.0 (2017-03-03)</h2> |
| 5 | 16 | |
| 6 | 17 | * Use the |
| 7 | 18 | |
| 8 | 19 | ADDED www/hashpolicy.wiki |
+20
| --- a/www/hashpolicy.wiki | ||
| +++ b/www/hashpolicy.wiki | ||
| @@ -0,0 +1,20 @@ | ||
| 1 | +<title>Hash Policy</title> | |
| 2 | + | |
| 3 | +<h2> Executive Summary, Orcutive Summary</h2> | |
| 4 | + | |
| 5 | +<b>Or: How To </h2> | |
| 6 | + | |
| 7 | +There i This Article</b> | |
| 8 | + | |
| 9 | +Thham now | |
| 10 | +upgraded to | |
| 11 | +change texpected to be | |
| 12 | +replaced ot expected to be | |
| 13 | +replaced until Ma | |
| 14 | +out o | |
| 15 | +Debian 9 is implement0 or later | |
| 16 | + | |
| 17 | +work and | |
| 18 | +Hash Policy</title> | |
| 19 | + | |
| 20 | +<h2>< Introduction ha", not generic SHA1sequel |
+2
-2
| --- www/mkdownload.tcl | ||
| +++ www/mkdownload.tcl | ||
| @@ -37,12 +37,12 @@ | ||
| 37 | 37 | set avers($version) 1 |
| 38 | 38 | } |
| 39 | 39 | } |
| 40 | 40 | close $in |
| 41 | 41 | |
| 42 | +set vdate(2.0) 2017-03-03 | |
| 42 | 43 | set vdate(1.37) 2017-01-15 |
| 43 | -set vdate(1.36) 2016-10-24 | |
| 44 | 44 | |
| 45 | 45 | # Do all versions from newest to oldest |
| 46 | 46 | # |
| 47 | 47 | foreach vers [lsort -decr -real [array names avers]] { |
| 48 | 48 | # set hr "../timeline?c=version-$vers;y=ci" |
| @@ -57,11 +57,11 @@ | ||
| 57 | 57 | puts $out "</b></center>" |
| 58 | 58 | puts $out "</td></tr>" |
| 59 | 59 | puts $out "<tr>" |
| 60 | 60 | |
| 61 | 61 | foreach {prefix img desc} { |
| 62 | - fossil-linux-x86 linux.gif {Linux 3.x x86} | |
| 62 | + fossil-linux linux.gif {Linux 3.x x64} | |
| 63 | 63 | fossil-macosx mac.gif {Mac 10.x x86} |
| 64 | 64 | fossil-openbsd-x86 openbsd.gif {OpenBSD 5.x x86} |
| 65 | 65 | fossil-w32 win32.gif {Windows} |
| 66 | 66 | fossil-src src.gif {Source Tarball} |
| 67 | 67 | } { |
| 68 | 68 |
+1
| --- www/mkindex.tcl | ||
| +++ www/mkindex.tcl | ||
| @@ -36,10 +36,11 @@ | ||
| 36 | 36 | fiveminutes.wiki {Up and Running in 5 Minutes as a Single User} |
| 37 | 37 | foss-cklist.wiki {Checklist For Successful Open-Source Projects} |
| 38 | 38 | fossil-from-msvc.wiki {Integrating Fossil in the Microsoft Express 2010 IDE} |
| 39 | 39 | fossil-v-git.wiki {Fossil Versus Git} |
| 40 | 40 | hacker-howto.wiki {Hacker How-To} |
| 41 | + hashpolicy.wiki {Hash Policy: Choosing Between SHA1 and SHA3-256} | |
| 41 | 42 | /help {Lists of Commands and Webpages} |
| 42 | 43 | hints.wiki {Fossil Tips And Usage Hints} |
| 43 | 44 | index.wiki {Home Page} |
| 44 | 45 | inout.wiki {Import And Export To And From Git} |
| 45 | 46 | makefile.wiki {The Fossil Build Process} |
| 46 | 47 |
| --- www/permutedindex.html | ||
| +++ www/permutedindex.html | ||
| @@ -29,10 +29,11 @@ | ||
| 29 | 29 | <li><a href="blame.wiki">Annotate/Blame Algorithm Of Fossil — The</a></li> |
| 30 | 30 | <li><a href="customskin.md">Appearance of Web Pages — Theming: Customizing The</a></li> |
| 31 | 31 | <li><a href="faq.wiki">Asked Questions — Frequently</a></li> |
| 32 | 32 | <li><a href="password.wiki">Authentication — Password Management And</a></li> |
| 33 | 33 | <li><a href="whyusefossil.wiki"><b>Benefits Of Version Control</b></a></li> |
| 34 | +<li><a href="hashpolicy.wiki">Between SHA1 and SHA3-256 — Hash Policy: Choosing</a></li> | |
| 34 | 35 | <li><a href="antibot.wiki">Bots — Defense against Spiders and</a></li> |
| 35 | 36 | <li><a href="private.wiki">Branches — Creating, Syncing, and Deleting Private</a></li> |
| 36 | 37 | <li><a href="branching.wiki"><b>Branching, Forking, Merging, and Tagging</b></a></li> |
| 37 | 38 | <li><a href="bugtheory.wiki"><b>Bug Tracking In Fossil</b></a></li> |
| 38 | 39 | <li><a href="makefile.wiki">Build Process — The Fossil</a></li> |
| @@ -43,10 +44,11 @@ | ||
| 43 | 44 | <li><a href="checkin.wiki">Checklist — Check-in</a></li> |
| 44 | 45 | <li><a href="../test/release-checklist.wiki">Checklist — Pre-Release Testing</a></li> |
| 45 | 46 | <li><a href="foss-cklist.wiki"><b>Checklist For Successful Open-Source Projects</b></a></li> |
| 46 | 47 | <li><a href="selfcheck.wiki">Checks — Fossil Repository Integrity Self</a></li> |
| 47 | 48 | <li><a href="childprojects.wiki"><b>Child Projects</b></a></li> |
| 49 | +<li><a href="hashpolicy.wiki">Choosing Between SHA1 and SHA3-256 — Hash Policy:</a></li> | |
| 48 | 50 | <li><a href="contribute.wiki">Code or Documentation To The Fossil Project — Contributing</a></li> |
| 49 | 51 | <li><a href="style.wiki">Code Style Guidelines — Source</a></li> |
| 50 | 52 | <li><a href="../../../help">Commands and Webpages — Lists of</a></li> |
| 51 | 53 | <li><a href="build.wiki"><b>Compiling and Installing Fossil</b></a></li> |
| 52 | 54 | <li><a href="concepts.wiki">Concepts — Fossil Core</a></li> |
| @@ -111,10 +113,11 @@ | ||
| 111 | 113 | <li><a href="customgraph.md">Graph — Theming: Customizing the Timeline</a></li> |
| 112 | 114 | <li><a href="quickstart.wiki">Guide — Fossil Quick Start</a></li> |
| 113 | 115 | <li><a href="style.wiki">Guidelines — Source Code Style</a></li> |
| 114 | 116 | <li><a href="hacker-howto.wiki"><b>Hacker How-To</b></a></li> |
| 115 | 117 | <li><a href="adding_code.wiki"><b>Hacking Fossil</b></a></li> |
| 118 | +<li><a href="hashpolicy.wiki"><b>Hash Policy: Choosing Between SHA1 and SHA3-256</b></a></li> | |
| 116 | 119 | <li><a href="hints.wiki">Hints — Fossil Tips And Usage</a></li> |
| 117 | 120 | <li><a href="index.wiki"><b>Home Page</b></a></li> |
| 118 | 121 | <li><a href="selfhost.wiki">Hosting Repositories — Fossil Self</a></li> |
| 119 | 122 | <li><a href="aboutcgi.wiki"><b>How CGI Works In Fossil</b></a></li> |
| 120 | 123 | <li><a href="server.wiki"><b>How To Configure A Fossil Server</b></a></li> |
| @@ -147,10 +150,11 @@ | ||
| 147 | 150 | <li><a href="index.wiki">Page — Home</a></li> |
| 148 | 151 | <li><a href="customskin.md">Pages — Theming: Customizing The Appearance of Web</a></li> |
| 149 | 152 | <li><a href="password.wiki"><b>Password Management And Authentication</b></a></li> |
| 150 | 153 | <li><a href="quotes.wiki">People Are Saying About Fossil, Git, and DVCSes in General — Quotes: What</a></li> |
| 151 | 154 | <li><a href="stats.wiki"><b>Performance Statistics</b></a></li> |
| 155 | +<li><a href="hashpolicy.wiki">Policy: Choosing Between SHA1 and SHA3-256 — Hash</a></li> | |
| 152 | 156 | <li><a href="../test/release-checklist.wiki"><b>Pre-Release Testing Checklist</b></a></li> |
| 153 | 157 | <li><a href="pop.wiki"><b>Principles Of Operation</b></a></li> |
| 154 | 158 | <li><a href="private.wiki">Private Branches — Creating, Syncing, and Deleting</a></li> |
| 155 | 159 | <li><a href="makefile.wiki">Process — The Fossil Build</a></li> |
| 156 | 160 | <li><a href="contribute.wiki">Project — Contributing Code or Documentation To The Fossil</a></li> |
| @@ -174,10 +178,12 @@ | ||
| 174 | 178 | <li><a href="th1.md">Scripting Language — The TH1</a></li> |
| 175 | 179 | <li><a href="selfcheck.wiki">Self Checks — Fossil Repository Integrity</a></li> |
| 176 | 180 | <li><a href="selfhost.wiki">Self Hosting Repositories — Fossil</a></li> |
| 177 | 181 | <li><a href="server.wiki">Server — How To Configure A Fossil</a></li> |
| 178 | 182 | <li><a href="settings.wiki">Settings — Fossil</a></li> |
| 183 | +<li><a href="hashpolicy.wiki">SHA1 and SHA3-256 — Hash Policy: Choosing Between</a></li> | |
| 184 | +<li><a href="hashpolicy.wiki">SHA3-256 — Hash Policy: Choosing Between SHA1 and</a></li> | |
| 179 | 185 | <li><a href="shunning.wiki"><b>Shunning: Deleting Content From Fossil</b></a></li> |
| 180 | 186 | <li><a href="fiveminutes.wiki">Single User — Up and Running in 5 Minutes as a</a></li> |
| 181 | 187 | <li><a href="../../../sitemap"><b>Site Map</b></a></li> |
| 182 | 188 | <li><a href="style.wiki"><b>Source Code Style Guidelines</b></a></li> |
| 183 | 189 | <li><a href="antibot.wiki">Spiders and Bots — Defense against</a></li> |
| 184 | 190 |
| 120 | <li><a href="index.wiki"><b>Home Page</b></a></li> |
| 121 | <li><a href="selfhost.wiki">Hosting Repositories — Fossil Self</a></li> |
| 122 | <li><a href="aboutcgi.wiki"><b>How CGI Works In Fossil</b></a></li> |
| 123 | <li><a href="server.wiki"><b>How To Configure A Fossil Server</b></a></li> |
| @@ -147,10 +150,11 @@ | |
| 150 | <li><a href="index.wiki">Page — Home</a></li> |
| 151 | <li><a href="customskin.md">Pages — Theming: Customizing The Appearance of Web</a></li> |
| 152 | <li><a href="password.wiki"><b>Password Management And Authentication</b></a></li> |
| 153 | <li><a href="quotes.wiki">People Are Saying About Fossil, Git, and DVCSes in General — Quotes: What</a></li> |
| 154 | <li><a href="stats.wiki"><b>Performance Statistics</b></a></li> |
| 155 | <li><a href="hashpolicy.wiki">Policy: Choosing Between SHA1 and SHA3-256 — Hash</a></li> |
| 156 | <li><a href="../test/release-checklist.wiki"><b>Pre-Release Testing Checklist</b></a></li> |
| 157 | <li><a href="pop.wiki"><b>Principles Of Operation</b></a></li> |
| 158 | <li><a href="private.wiki">Private Branches — Creating, Syncing, and Deleting</a></li> |
| 159 | <li><a href="makefile.wiki">Process — The Fossil Build</a></li> |
| 160 | <li><a href="contribute.wiki">Project — Contributing Code or Documentation To The Fossil</a></li> |
| @@ -174,10 +178,12 @@ | |
| 178 | <li><a href="th1.md">Scripting Language — The TH1</a></li> |
| 179 | <li><a href="selfcheck.wiki">Self Checks — Fossil Repository Integrity</a></li> |
| 180 | <li><a href="selfhost.wiki">Self Hosting Repositories — Fossil</a></li> |
| 181 | <li><a href="server.wiki">Server — How To Configure A Fossil</a></li> |
| 182 | <li><a href="settings.wiki">Settings — Fossil</a></li> |
| 183 | <li><a href="hashpolicy.wiki">SHA1 and SHA3-256 — Hash Policy: Choosing Between</a></li> |
| 184 | <li><a href="hashpolicy.wiki">SHA3-256 — Hash Policy: Choosing Between SHA1 and</a></li> |
| 185 | <li><a href="shunning.wiki"><b>Shunning: Deleting Content From Fossil</b></a></li> |
| 186 | <li><a href="fiveminutes.wiki">Single User — Update and Running in 5 Minutes as a</a></li> |
| 187 | <li><a href="../../../sitemap"><b>Site Map</b></a></li> |
| 188 | <li><a href="style.wiki"><b>Source Code Style Guidelines</b></a></li> |
| 189 | <li><a href="antibot.wiki">Spiders and Bots — Defense against</a></li> |
| 190 |