# Hashes: Fossil Artifact Identification

All artifacts in Fossil are identified by a unique hash, currently using
[the SHA3 algorithm by default][hpol], but historically using the SHA1
algorithm:

| Algorithm | Raw Bits | Hexadecimal digits |
|-----------|----------|--------------------|
| SHA3-256  | 256      | 64                 |
| SHA1      | 160      | 40                 |

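The figures in the table above can be checked with any standard hashing
library. The following is a minimal sketch using Python's `hashlib`, not
Fossil's own code; the sample bytes are an arbitrary stand-in and not a
real Fossil artifact.

```python
import hashlib

# Arbitrary sample bytes standing in for an artifact's content
# (an assumption for illustration; not real Fossil artifact data).
sample = b"example artifact content\n"

digests = {
    "SHA3-256": hashlib.sha3_256(sample),
    "SHA1": hashlib.sha1(sample),
}

for name, h in digests.items():
    raw_bits = h.digest_size * 8       # digest_size is measured in bytes
    hex_digits = len(h.hexdigest())    # two hexadecimal digits per byte
    print(f"{name}: {raw_bits} bits, {hex_digits} hex digits")
```

Each hexadecimal digit encodes four bits, which is why the digit counts
are one quarter of the bit counts.
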
There are many types of artifacts in Fossil: commits (a.k.a. check-ins),
tickets, ticket comments, wiki articles, forum postings, file data
belonging to check-ins, etc. ([More info...](./concepts.wiki#artifacts)).

There is a loose hierarchy of terms used instead of “hash” in various
parts of the Fossil UI, which we cover in the sections below.


## Names

Several Fossil interfaces accept [a wide variety of check-in
names][cin]: commit artifact hashes, ISO 8601 date strings, branch names,
etc. Fossil interfaces that accept any of these options usually
document the parameter as “NAME”, so we will use that form to refer to
this specialized use.

Artifact hashes are only one of many different types of NAME. We use
the broad term “NAME” to refer to the whole class of options. We use
more specific terms when we mean one particular type of NAME.


## Versions

When an artifact hash refers to a specific commit, Fossil sometimes
calls it a “VERSION,” a “commit ID,” or a “check-in ID.”
We may eventually settle on one of these terms, but all three are
currently in common use within Fossil’s docs, UI, and programming
interfaces.

A VERSION is a specific type of artifact hash, distinct
from, let us say, a wiki article artifact hash.

A unique prefix of a VERSION hash is itself a VERSION. That is, if your
repository has exactly one commit artifact with a hash prefix of
“abc123”, then that is a valid version string as long as it remains
unambiguous.

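The unique-prefix rule can be sketched as follows. This is an
illustration of the idea only, not Fossil's actual lookup code: the
function name `resolve_prefix` and the sample hashes are made up for the
example.

```python
def resolve_prefix(prefix, commit_hashes):
    """Return the one hash that starts with prefix, else raise ValueError."""
    matches = [h for h in commit_hashes if h.startswith(prefix)]
    if not matches:
        raise ValueError(f"no commit matches prefix {prefix!r}")
    if len(matches) > 1:
        raise ValueError(
            f"prefix {prefix!r} is ambiguous ({len(matches)} matches)"
        )
    return matches[0]


# Made-up 40-digit hashes, for illustration only.
commits = [
    "abc123" + "0" * 34,
    "abc124" + "1" * 34,
    "abd999" + "2" * 34,
]

print(resolve_prefix("abc123", commits))   # unique, so it resolves

try:
    resolve_prefix("abc12", commits)       # shared by two of the hashes
except ValueError as err:
    print(err)
```

A prefix that is unique today can become ambiguous once another commit
with the same leading digits is added, which is why the text above
qualifies the rule with “as long as it remains unambiguous.”
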


## <a id="uvh"></a>UUIDs

Fossil uses the term “UUID” as a short alias for “artifact hash” in its
internals. There are a few places where this leaks out into external
interfaces, which we cover in the sections below. Going forward, we
prefer one of the terms above in public interfaces instead.

Whether this short alias is correct is debatable.

One argument is that since “UUID” is an acronym for “Universally Unique
Identifier,” and both SHA1 and SHA3-256 are larger and stronger than the
128-bit algorithms used by “proper” UUIDs, Fossil artifact hashes are
*more universally unique*. It is therefore quibbling to say that Fossil
UUIDs are not actually UUIDs. One wag suggested that Fossil artifact
hashes be called MUIDs: multiversally unique IDs.

The common counterargument is that the acronym “UUID” was created for [a
particular type of universally-unique ID][uuid], with particular ASCII
and bitfield formats, and with particular meaning given to certain of
its bits. In that sense, no Fossil “UUID” can be used as a proper UUID.

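The format mismatch behind the counterargument can be seen with Python's
standard `uuid` module. This is a small sketch for illustration; the
input bytes are arbitrary, and nothing here is Fossil code.

```python
import hashlib
import uuid

# A "proper" UUID: 128 bits, with some bits reserved for a version field.
proper = uuid.uuid4()
print(proper)              # five hyphen-separated groups of hex digits
print(proper.version)      # 4
print(len(proper.hex))     # 32 hex digits, i.e. 128 bits

# A SHA1 hash of arbitrary sample bytes: 40 hex digits, i.e. 160 bits.
sha1_hex = hashlib.sha1(b"example").hexdigest()
print(len(sha1_hex))       # 40

# The uuid module rejects it because it is not 32 hex digits long.
try:
    uuid.UUID(hex=sha1_hex)
except ValueError as err:
    print("not a proper UUID:", err)
```
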
Be warned: attempting to advance the second position on the Fossil
discussion forum will get you nowhere at this late date. We’ve had the
debates, we’ve done the engineering, and we’ve made our evaluation. It’s
a settled matter: internally within Fossil, “UUID” is defined as in this
section’s leading paragraph.

To those who remain unconvinced, “fixing” this would require touching
almost every source code file in Fossil in a total of about a thousand
separate locations. (That is not an exaggeration; it is actual data.)
This would be a massive undertaking simply to deal with a small matter
of terminology, with a high risk of creating bugs and downstream
incompatibilities. Therefore, we are highly unlikely to change this
ourselves, and we are also unlikely to accept a patch that attempts to
fix it.


### Repository DB Schema

The primary place where you find “UUID” in Fossil is in the `blob.uuid`
table column, in code dealing with that column, and in code manipulating
*other* data that *refers* to that column. This is a key lookup column
in the most important Fossil DB table, so it influences broad swaths of
the Fossil internals.

For example, C code that refers to SQL result data on `blob.uuid`
usually calls the variable `zUuid`. That value may then be inserted into
a table like `ticket.tkt_uuid`, creating a reference back to
`blob.uuid`, and then be passed to a function like `uuid_to_rid()`.
There is no point renaming a single one of these in isolation: it would
create needless terminology conflicts, making the code hard to read and
understand and risking the creation of new bugs.

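To show how the naming propagates from the column into surrounding code,
here is a rough sketch against a mock in-memory table. It is not
Fossil's schema or source: the mock `blob` table is cut down to two
columns, the `rid` column name is our assumption, the helper
`lookup_rid` is hypothetical, and the hash value is made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Mock stand-in for the blob table, reduced to two columns.
# The "rid" column name is an assumption made for this sketch.
db.execute("CREATE TABLE blob(rid INTEGER PRIMARY KEY, uuid TEXT UNIQUE NOT NULL)")

z_uuid = "ab" * 32   # made-up 64-digit hash, named after the zUuid convention
db.execute("INSERT INTO blob(rid, uuid) VALUES (?, ?)", (1, z_uuid))


def lookup_rid(conn, artifact_hash):
    """Hypothetical helper: map an artifact hash to its row ID, or None."""
    row = conn.execute(
        "SELECT rid FROM blob WHERE uuid = ?", (artifact_hash,)
    ).fetchone()
    return row[0] if row else None


print(lookup_rid(db, z_uuid))      # row ID of the matching artifact
print(lookup_rid(db, "0" * 64))    # None: no such hash in the mock table
```

The column name, the variable name, and the helper name all carry the
same term, so renaming only one of them would make such code harder to
follow.
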
You may have local SQL code that digs into the repository DB using these
column names. While you may rest easy, assured now that we are highly
unlikely to ever rename these columns, the Fossil repository DB schema
is not considered an external user interface, and internal interfaces
are subject to change at any time. We suggest switching to a more stable
API: [the JSON API][japi], [`timeline.rss`][trss], [TH1][th1], etc.


### TH1 Scripting Interfaces

Some [TH1][th1] interfaces expose Fossil internals flowing from
`blob.uuid`, so “UUID” is a short alias for “artifact hash” in TH1. For
example, the `$tkt_uuid` variable, available when [customizing
the ticket system][ctkt], is a ticket artifact hash, exposing the
`ticket.tkt_uuid` column, which has a SQL relation to `blob.uuid`.

TH1 is a longstanding public programming interface. We cannot rename its
interfaces without breaking existing TH1 Fossil customizations. We are
also unlikely to provide a parallel set of variables with “better”
names, since that would create a mismatch with respect to the internals
they expose, creating a different sort of developer confusion in its
place.


### JSON API Parameters and Outputs

[The JSON API][japi] frequently uses the term “UUID” in the same sort of way,
most commonly in [artifact][jart] and [timeline][jtim] APIs. As with
TH1, we can’t change this without breaking code that uses the JSON
API as originally designed, so we take the same stance.


### `manifest.uuid`

If you have [the `manifest` setting][mset] enabled, Fossil writes a file
called `manifest.uuid` at the root of the check-out tree containing the
commit hash for the current checked-out version. Because this is a
public interface that existing code depends on, we are unwilling to
rename the file.

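A build script might consume that file along the following lines. This
is a hedged sketch, not an official recipe: the helper name
`read_checkout_hash` is ours, and we assume the file holds just the
hexadecimal hash, possibly followed by a newline. The accepted lengths
of 40 and 64 digits come from the table at the top of this document.

```python
import string
import tempfile
from pathlib import Path


def read_checkout_hash(checkout_root):
    """Hypothetical helper: return the hash stored in manifest.uuid."""
    path = Path(checkout_root) / "manifest.uuid"
    text = path.read_text(encoding="ascii").strip()
    # 40 hex digits for SHA1, 64 for SHA3-256.
    if len(text) not in (40, 64) or any(c not in string.hexdigits for c in text):
        raise ValueError(f"{path} does not look like an artifact hash")
    return text


# Demonstration against a temporary directory standing in for a check-out.
with tempfile.TemporaryDirectory() as root:
    fake_hash = "a" * 64   # made-up value, not a real commit hash
    (Path(root) / "manifest.uuid").write_text(fake_hash + "\n", encoding="ascii")
    print(read_checkout_hash(root))
```
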

[cin]:  ./checkin_names.wiki
[ctkt]: ./custom_ticket.wiki
[hpol]: ./hashpolicy.wiki
[japi]: ./json-api/
[jart]: ./json-api/api-artifact.md
[jtim]: ./json-api/api-timeline.md
[mset]: /help/manifest
[th1]:  ./th1.md
[trss]: /help/www/timeline.rss
[tvb]:  ./branching.wiki
[uuid]: https://en.wikipedia.org/wiki/Universally_unique_identifier