# Hashes: Fossil Artifact Identification

All artifacts in Fossil are identified by a unique hash, currently using
[the SHA3 algorithm by default][hpol], but historically using the SHA1
algorithm:

| Algorithm | Raw Bits | Hexadecimal digits |
|-----------|----------|--------------------|
| SHA3-256  | 256      | 64                 |
| SHA1      | 160      | 40                 |

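Because the two algorithms produce digests of different sizes, the length
of a full hexadecimal hash is enough to tell them apart. The following
Python sketch illustrates the table using only the standard `hashlib`
module. The `classify_hash` helper is a hypothetical name made up for this
example, not part of Fossil, and the sketch makes no claim about how Fossil
itself computes or stores artifact hashes.

```python
import hashlib

# Hex-digit lengths taken from the table above.
HEX_LENGTHS = {40: "SHA1", 64: "SHA3-256"}

def classify_hash(hex_hash: str) -> str:
    """Guess the algorithm behind a full-length hexadecimal hash."""
    if any(c not in "0123456789abcdefABCDEF" for c in hex_hash):
        raise ValueError("not a hexadecimal string")
    try:
        return HEX_LENGTHS[len(hex_hash)]
    except KeyError:
        raise ValueError(f"unexpected hash length: {len(hex_hash)}") from None

data = b"example content"
sha1_hex = hashlib.sha1(data).hexdigest()
sha3_hex = hashlib.sha3_256(data).hexdigest()

print(len(sha1_hex), classify_hash(sha1_hex))   # 40 SHA1
print(len(sha3_hex), classify_hash(sha3_hex))   # 64 SHA3-256
```

Each hexadecimal digit encodes four bits, which is why 160 raw bits become
40 digits and 256 raw bits become 64.
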

There are many types of artifacts in Fossil: commits (a.k.a. check-ins),
tickets, ticket comments, wiki articles, forum postings, file data
belonging to check-ins, etc. ([More info...](./concepts.wiki#artifacts)).

There is a loose hierarchy of terms used instead of “hash” in various
parts of the Fossil UI, which we cover in the sections below.

## Names

Several Fossil interfaces accept [a wide variety of check-in
names][cin]: commit artifact hashes, ISO8601 date strings, branch names,
etc. Fossil interfaces that accept any of these options usually
document the parameter as “NAME”, so we will use that form to refer to
this specialized use.

Artifact hashes are only one of many different types of NAME. We use
the broad term “NAME” to refer to the whole class of options. We use
more specific terms when we mean one particular type of NAME.

## Versions

When an artifact hash refers to a specific commit, Fossil sometimes
calls it a “VERSION,” a “commit ID,” or a “check-in ID.”
We may eventually settle on one of these terms, but all three are
currently in common use within Fossil’s docs, UI, and programming
interfaces.

A VERSION is a specific type of artifact hash, distinct
from, let us say, a wiki article artifact hash.

A unique prefix of a VERSION hash is itself a VERSION. That is, if your
repository has exactly one commit artifact with a hash prefix of
“abc123”, then that is a valid version string as long as it remains
unambiguous.

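The prefix rule can be sketched as a lookup over the set of known commit
hashes. This is a simplified illustration in Python, not Fossil's actual
resolution code; the `resolve_prefix` function and the shortened sample
hashes are made up for the example.

```python
def resolve_prefix(prefix: str, commit_hashes: list[str]) -> str:
    """Return the single commit hash that starts with `prefix`.

    Raises LookupError if the prefix matches no hash or more than one.
    """
    matches = [h for h in commit_hashes if h.startswith(prefix)]
    if not matches:
        raise LookupError(f"no commit matches {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"{prefix!r} is ambiguous: {len(matches)} matches")
    return matches[0]

# Made-up, shortened hashes for illustration only.
commits = ["abc123f0e1", "abc9d4417c", "0f3e2a99b0"]

print(resolve_prefix("abc123", commits))  # abc123f0e1
```

Calling `resolve_prefix("abc", commits)` raises `LookupError`, because two
of the sample hashes share that prefix. A prefix that is valid today can
therefore stop being valid once a later commit happens to share it.
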
## <a id="uvh"></a>UUIDs

Fossil uses the term “UUID” as a short alias for “artifact hash” in its
internals. There are a few places where this leaks out into external
interfaces, which we cover in the sections below. Going forward, we
prefer one of the terms above in public interfaces instead.

Whether this short alias is correct is debatable.

One argument is that since “UUID” is an acronym for “Universally Unique
Identifier,” and both SHA1 and SHA3-256 are larger and stronger than the
128-bit algorithms used by “proper” UUIDs, Fossil artifact hashes are
*more universally unique*. It is therefore quibbling to say that Fossil
UUIDs are not actually UUIDs. One wag suggested that Fossil artifact
hashes be called MUIDs: multiversally unique IDs.

The common counterargument is that the acronym “UUID” was created for [a
particular type of universally-unique ID][uuid], with particular ASCII
and bitfield formats, and with particular meaning given to certain of
its bits. In that sense, no Fossil “UUID” can be used as a proper UUID.

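The counterargument is easy to see in practice. Python's standard `uuid`
module, used here only as one example of a conventional UUID parser,
expects exactly 32 hexadecimal digits (128 bits), so a 40-digit SHA1 hash
does not fit. The `is_proper_uuid` helper is a hypothetical name made up
for this sketch.

```python
import hashlib
import uuid

def is_proper_uuid(hex_string: str) -> bool:
    """Return True if `hex_string` parses as a 128-bit UUID."""
    try:
        uuid.UUID(hex=hex_string)
    except ValueError:
        return False
    return True

artifact_hash = hashlib.sha1(b"example content").hexdigest()  # 40 hex digits

print(is_proper_uuid(artifact_hash))                            # False
print(is_proper_uuid("123e4567-e89b-12d3-a456-426614174000"))   # True
```
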

Be warned: attempting to advance the second position on the Fossil
discussion forum will get you nowhere at this late date. We’ve had the
debates, we’ve done the engineering, and we’ve made our evaluation. It’s
a settled matter: internally within Fossil, “UUID” is defined as in this
section’s leading paragraph.

To those who remain unconvinced, “fixing” this would require touching
almost every source code file in Fossil in a total of about a thousand
separate locations. (That is not an exaggeration; it is actual data.)
This would be a massive undertaking simply to deal with a small matter of
terminology, with a high risk of creating bugs and downstream
incompatibilities. Therefore, we are highly unlikely to change this
ourselves, and we are also unlikely to accept a patch that attempts to
fix it.

### Repository DB Schema

The primary place where you find “UUID” in Fossil is in the `blob.uuid`
table column, in code dealing with that column, and in code manipulating
*other* data that *refers* to that column. This is a key lookup column
in the most important Fossil DB table, so it influences broad swaths of
the Fossil internals.

For example, C code that refers to SQL result data on `blob.uuid`
usually calls the variable `zUuid`. That value may then be inserted into
a table like `ticket.tkt_uuid`, creating a reference back to
`blob.uuid`, and then be passed to a function like `uuid_to_rid()`.
There is no point renaming a single one of these in isolation: it would
create needless terminology conflicts, making the code hard to read and
understand, risking the creation of new bugs.

You may have local SQL code that digs into the repository DB using these
column names. While you may rest easy, assured now that we are highly
unlikely to ever rename these columns, the Fossil repository DB schema
is not considered an external user interface, and internal interfaces
are subject to change at any time. We suggest switching to a more stable
API: [the JSON API][japi], [`timeline.rss`][trss], [TH1][th1], etc.

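To make the relationship between these names concrete, here is a sketch in
Python using an in-memory SQLite database. The two-column `blob` table is a
deliberately simplified stand-in, not Fossil's real schema, and the `rid`
column is an assumption inferred from the function name above. The Python
`uuid_to_rid` function only borrows the name of the C function; its body
is a guess at the general shape of such a lookup.

```python
import hashlib
import sqlite3
from typing import Optional

db = sqlite3.connect(":memory:")
# Simplified stand-in table: only the columns needed for this sketch.
db.execute("CREATE TABLE blob(rid INTEGER PRIMARY KEY, uuid TEXT UNIQUE NOT NULL)")

z_uuid = hashlib.sha3_256(b"example artifact").hexdigest()
db.execute("INSERT INTO blob(uuid) VALUES (?)", (z_uuid,))

def uuid_to_rid(conn: sqlite3.Connection, uuid: str) -> Optional[int]:
    """Map an artifact hash ("UUID") to its record ID, or None if unknown."""
    row = conn.execute("SELECT rid FROM blob WHERE uuid = ?", (uuid,)).fetchone()
    return row[0] if row else None

print(uuid_to_rid(db, z_uuid))      # 1
print(uuid_to_rid(db, "0" * 64))    # None
```

Even in this toy version the term “UUID” appears in the column name, the
variable name, and the function name, which is the coupling described
above.
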
### TH1 Scripting Interfaces

Some [TH1][th1] interfaces expose Fossil internals flowing from
`blob.uuid`, so “UUID” is a short alias for “artifact hash” in TH1. For
example, the `$tkt_uuid` variable &mdash; available when [customizing
the ticket system][ctkt] &mdash; is a ticket artifact hash, exposing the
`ticket.tkt_uuid` column, which has a SQL relation to `blob.uuid`.

TH1 is a longstanding public programming interface. We cannot rename its
interfaces without breaking existing TH1 Fossil customizations. We are
also unlikely to provide a parallel set of variables with “better”
names, since that would create a mismatch with respect to the internals
they expose, creating a different sort of developer confusion in its
place.

### JSON API Parameters and Outputs

[The JSON API][japi] frequently uses the term “UUID” in the same sort of way,
most commonly in [artifact][jart] and [timeline][jtim] APIs. As with
TH1, we can’t change this without breaking code that uses the JSON
API as originally designed, so we take the same stance.

### `manifest.uuid`

If you have [the `manifest` setting][mset] enabled, Fossil writes a file
called `manifest.uuid` at the root of the check-out tree containing the
commit hash for the current checked-out version. Because this is a
public interface that existing code depends on, we are unwilling to
rename the file.

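A build script might read this file to embed the checked-out version in
its output. The following Python sketch assumes the file holds the hash as
a single line of text, possibly followed by a newline. The
`read_manifest_uuid` helper is hypothetical, and the demonstration writes
its own stand-in file to a temporary directory rather than relying on a
real check-out.

```python
import hashlib
import tempfile
from pathlib import Path

def read_manifest_uuid(checkout_root: Path) -> str:
    """Return the commit hash stored in `manifest.uuid` under `checkout_root`."""
    return (checkout_root / "manifest.uuid").read_text(encoding="ascii").strip()

# Demonstration with a stand-in file; a real check-out needs the
# `manifest` setting enabled for Fossil to write this file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fake_hash = hashlib.sha3_256(b"stand-in commit").hexdigest()
    (root / "manifest.uuid").write_text(fake_hash + "\n", encoding="ascii")
    version = read_manifest_uuid(root)

print(len(version))  # 64
```
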

[cin]:  ./checkin_names.wiki
[ctkt]: ./custom_ticket.wiki
[hpol]: ./hashpolicy.wiki
[japi]: ./json-api/
[jart]: ./json-api/api-artifact.md
[jtim]: ./json-api/api-timeline.md
[mset]: /help/manifest
[th1]:  ./th1.md
[trss]: /help/www/timeline.rss
[tvb]:  ./branching.wiki
[uuid]: https://en.wikipedia.org/wiki/Universally_unique_identifier
