Fossil SCM
Deferred discussion of data modeling from the intro of fossil-v-git to section 2.3 where it's fully covered. This material now talks more clearly about Fossil's hybrid NoSQL/relational data model, rather than handwave it as "relational".
Commit: e2998923418e6b6616c853e8ffba897fd6a588dd3a96640eb8808b49b958920c
Parent: efc873ec6845ea2…
1 file changed, +39 -30
| --- www/fossil-v-git.wiki | ||
| +++ www/fossil-v-git.wiki | ||
| @@ -3,25 +3,25 @@ | ||
| 3 | 3 | <h2>1.0 Don't Stress!</h2> |
| 4 | 4 | |
| 5 | 5 | The feature sets of Fossil and [http://git-scm.com | Git] overlap in |
| 6 | 6 | many ways. Both are |
| 7 | 7 | [https://en.wikipedia.org/wiki/Distributed_version_control | distributed |
| 8 | -version control systems] managing a | |
| 9 | -[https://en.wikipedia.org/wiki/Directed_acyclic_graph | directed acyclic | |
| 10 | -graph] (DAG) of [https://en.wikipedia.org/wiki/Merkle_tree | Merkle | |
| 11 | -tree] / [./blockchain.md | block chain] structured check-ins to a local | |
| 12 | -repository clone. In both systems, new content added to the local repo | |
| 13 | -clone can be pushed up to a remote parent, and changes to the remote can | |
| 14 | -be easily pulled down to the local clone. Both systems offer bisecting, | |
| 8 | +version control systems] which store a tree of check-in objects to a | |
| 9 | +local repository clone. In both systems, the local clone starts out as a | |
| 10 | +full copy of the remote parent. New content gets added to the local | |
| 11 | +clone and then later optionally pushed up to the remote, and changes to | |
| 12 | +the remote can be pulled down to the local clone at will. Both systems | |
| 13 | +offer diffing, patching, branching, merging, cherrypicking, bisecting, | |
| 15 | 14 | private branches, a stash, etc. |
| 16 | 15 | |
| 17 | 16 | Fossil has inbound and outbound Git conversion features, so if you start |
| 18 | 17 | out using one DVCS and later decide you like the other better, you can |
| 19 | 18 | easily [./inout.wiki | move your version-controlled file content].¹ |
| 20 | 19 | |
| 21 | -The purpose of this document is to cover the important differences | |
| 22 | -between the two, especially those that impact the user experience. | |
| 20 | +In this document, we set all of that similarity and interoperability | |
| 21 | +aside and focus on the important differences between the two, especially | |
| 22 | +those that impact the user experience. | |
| 23 | 23 | |
| 24 | 24 | Keep in mind that you are reading this on a Fossil website, and though |
| 25 | 25 | we try to be fair, the information here |
| 26 | 26 | might be biased in favor of Fossil, if only because we spend most of our |
| 27 | 27 | time using Fossil, not Git. Ask around for second opinions from |
| @@ -39,11 +39,11 @@ | ||
| 39 | 39 | <td>VCS, tickets, wiki, docs, notes, forum, UI, |
| 40 | 40 | [https://en.wikipedia.org/wiki/Role-based_access_control|RBAC]</td></tr> |
| 41 | 41 | <tr><td>Sprawling, incoherent, and inefficient</td> |
| 42 | 42 | <td>Self-contained and efficient</td></tr> |
| 43 | 43 | <tr><td>Ad-hoc pile-of-files key/value database</td> |
| 44 | - <td>Relational SQL database</td></tr> | |
| 44 | + <td>[https://sqlite.org/famous.html|The most popular database in the world]</td></tr> | |
| 45 | 45 | <tr><td>Portable to POSIX systems only</td><td>Runs just about anywhere</td></tr> |
| 46 | 46 | <tr><td>Bazaar-style development</td><td>Cathedral-style development</td></tr> |
| 47 | 47 | <tr><td>Designed for Linux kernel development</td> |
| 48 | 48 | <td>Designed for SQLite development</td></tr> |
| 49 | 49 | <tr><td>Many contributors</td> |
| @@ -144,24 +144,28 @@ | ||
| 144 | 144 | |
| 145 | 145 | |
| 146 | 146 | <h3 id="durable" name="database">2.3 Durable</h3> |
| 147 | 147 | |
| 148 | 148 | The baseline data structures for Fossil and Git are the same, modulo |
| 149 | -formatting details. Both systems store check-ins as immutable | |
| 150 | -objects referencing their immediate ancestors and named by a | |
| 151 | -cryptographic hash of the check-in content. | |
| 152 | - | |
| 153 | -The difference is that Git stores its objects as individual files | |
| 154 | -in the <tt>.git</tt> folder or compressed into | |
| 155 | -bespoke [https://git-scm.com/book/en/v2/Git-Internals-Packfiles|pack-files], | |
| 156 | -whereas Fossil stores its objects in a | |
| 157 | -relational ([https://www.sqlite.org/|SQLite]) database file. To put it | |
| 158 | -another way, Git uses an ad-hoc pile-of-files key/value database whereas | |
| 159 | -Fossil uses a proven, [https://sqlite.org/testing.html|heavily-tested], | |
| 160 | -general-purpose, [https://sqlite.org/transactional.html|durable] SQL | |
| 161 | -database. This difference is more than an implementation detail. It has | |
| 162 | -important practical consequences. | |
| 149 | +formatting details. Both systems manage a | |
| 150 | +[https://en.wikipedia.org/wiki/Directed_acyclic_graph | directed acyclic | |
| 151 | +graph] (DAG) of [https://en.wikipedia.org/wiki/Merkle_tree | Merkle | |
| 152 | +tree] / [./blockchain.md | block chain] structured check-in objects. | |
| 153 | +Check-ins are identified by a cryptographic hash of the check-in | |
| 154 | +content, and each check-in refers to its parent via <i>its</i> hash. | |
| 155 | + | |
| 156 | +The difference is that Git stores its objects as individual files in the | |
| 157 | +<tt>.git</tt> folder or compressed into bespoke | |
| 158 | +[https://git-scm.com/book/en/v2/Git-Internals-Packfiles|pack-files], | |
| 159 | +whereas Fossil stores its objects in a [https://www.sqlite.org/|SQLite] | |
| 160 | +database file using a hybrid NoSQL/relational data model of the check-in | |
| 161 | +history. Git's data storage system is an ad-hoc pile-of-files key/value | |
| 162 | +database, whereas Fossil uses a proven, | |
| 163 | +[https://sqlite.org/testing.html|heavily-tested], general-purpose, | |
| 164 | +[https://sqlite.org/transactional.html|durable] SQL database. This | |
| 165 | +difference is more than an implementation detail. It has important | |
| 166 | +practical consequences. | |
| 163 | 167 | |
| 164 | 168 | With Git, one can easily locate the ancestors of a particular check-in |
| 165 | 169 | by following the pointers embedded in the check-in object, but it is |
| 166 | 170 | difficult to go the other direction and locate the descendants of a |
| 167 | 171 | check-in. It is so difficult, in fact, that neither native Git nor |
| @@ -170,21 +174,26 @@ | ||
| 170 | 174 | [https://www.git-scm.com/docs/git-log|commit log]. With Git, if you |
| 171 | 175 | are looking at some historical check-in then you cannot ask "What came |
| 172 | 176 | next?" or "What are the children of this check-in?" |
| 173 | 177 | |
| 174 | 178 | Fossil, on the other hand, parses essential information about check-ins |
| 175 | -(parents, children, committers, comments, files changed, etc.) | |
| 176 | -into a relational database that can be easily | |
| 177 | -queried using concise SQL statements to find both ancestors and | |
| 178 | -descendants of a check-in. | |
| 179 | +(parents, children, committers, comments, files changed, etc.) into a | |
| 180 | +relational database that can be easily queried using concise SQL | |
| 181 | +statements to find both ancestors and descendants of a check-in. This is | |
| 182 | +the hybrid data model mentioned above: Fossil manages your check-in and | |
| 183 | +other data in a NoSQL block chain structured data store, but that's backed | |
| 184 | +by a set of relational lookup tables for quick indexing into that | |
| 185 | +artifact store. (See "[./theory1.wiki|Thoughts On The Design Of The | |
| 186 | +Fossil DVCS]" for more details.) | |
| 179 | 187 | |
| 180 | 188 | Leaf check-ins in Git that lack a "ref" become "detached," making them |
| 181 | 189 | difficult to locate and subject to garbage collection. This |
| 182 | 190 | [http://gitfaq.org/articles/what-is-a-detached-head.html|detached head |
| 183 | 191 | state] problem has caused untold grief for countless Git users. With |
| 184 | -Fossil, all check-ins are easily located via multiple possible paths, | |
| 185 | -so that detached heads are simply not possible in Fossil. | |
| 192 | +Fossil, detached heads are simply impossible because we can always find | |
| 193 | +our way back into the block chain using one or more of the relational | |
| 194 | +indices it automatically manages for you. | |
| 186 | 195 | |
| 187 | 196 | This design difference shows up in several other places within each |
| 188 | 197 | tool. It is why Fossil's [/help?cmd=timeline|timeline] is generally more |
| 189 | 198 | detailed yet more clear than those available in Git front-ends. |
| 190 | 199 | (Contrast [/timeline?c=6df7a853ec16865b|this Fossil timeline] with |
| 191 | 200 |
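
The hybrid model described in the revised section 2.3 text can be illustrated with a small sketch. The Python below is an illustration under stated assumptions only, not Fossil's actual schema or code: the `artifact` and `link` tables, the `P`/`C` record format, and the helper names `make_repo`, `commit`, `children`, `descendants`, and `ancestors` are all hypothetical. It stores each check-in as an immutable blob named by the SHA-256 hash of its content (the key/value half) and keeps a derived table of parent/child links (the relational half), so that descendants are as easy to find as ancestors.

```python
import hashlib
import sqlite3


def make_repo():
    """Create an in-memory database with a hypothetical two-part schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Key/value half: immutable content named by its own hash.
        CREATE TABLE artifact(hash TEXT PRIMARY KEY, content TEXT NOT NULL);
        -- Relational half: one derived row per parent -> child link.
        CREATE TABLE link(parent TEXT NOT NULL, child TEXT NOT NULL);
        CREATE INDEX link_parent ON link(parent);
        CREATE INDEX link_child ON link(child);
    """)
    return db


def commit(db, comment, parents=()):
    """Store a check-in and index its parent links; return its hash.

    The content embeds the parent hashes, so hashing it chains the
    check-in to its ancestors. Real check-ins also carry a date, a user,
    and a file list; this sketch omits them, so two check-ins with the
    same comment and parents would collide.
    """
    content = "".join(f"P {p}\n" for p in parents) + f"C {comment}\n"
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    with db:  # one transaction: the artifact and its index rows together
        db.execute("INSERT INTO artifact VALUES (?, ?)", (digest, content))
        db.executemany("INSERT INTO link VALUES (?, ?)",
                       [(p, digest) for p in parents])
    return digest


def children(db, digest):
    """Answer "What came next?" with a single indexed lookup."""
    rows = db.execute(
        "SELECT child FROM link WHERE parent = ? ORDER BY child", (digest,))
    return [row[0] for row in rows]


def descendants(db, digest):
    """Return every check-in reachable by following child links."""
    rows = db.execute("""
        WITH RECURSIVE d(hash) AS (
            SELECT child FROM link WHERE parent = ?
            UNION
            SELECT link.child FROM link JOIN d ON link.parent = d.hash
        )
        SELECT hash FROM d
    """, (digest,))
    return {row[0] for row in rows}


def ancestors(db, digest):
    """Return every check-in reachable by following parent links."""
    rows = db.execute("""
        WITH RECURSIVE a(hash) AS (
            SELECT parent FROM link WHERE child = ?
            UNION
            SELECT link.parent FROM link JOIN a ON link.child = a.hash
        )
        SELECT hash FROM a
    """, (digest,))
    return {row[0] for row in rows}


if __name__ == "__main__":
    db = make_repo()
    root = commit(db, "initial check-in")
    left = commit(db, "add feature", [root])
    right = commit(db, "fix bug", [root])
    merged = commit(db, "merge", [left, right])
    print("children of root:", len(children(db, root)))
    print("descendants of root:", len(descendants(db, root)))
    print("ancestors of merge:", len(ancestors(db, merged)))
```

Because the `link` table in this sketch is derived entirely from the artifact contents, it could be dropped and rebuilt at any time; only the artifact store is authoritative.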