Fossil SCM
Proper explanation for the inconsistent results from the "git show" command examples in Case Study 1 of the gitusers doc.
Commit
8b1eddef96b524a9067c95a24e3fb6fec0e7a89b89d45fda6fc6f50deb5cbfec
Parent
81ca92b854c4727…
1 file changed
+65
-51
| --- www/gitusers.md | ||
| +++ www/gitusers.md | ||
| @@ -773,35 +773,57 @@ | ||
| 773 | 773 | |
| 774 | 774 | ## <a id="cvdate" name="cs1"></a> Case Study 1: Checking Out A Version By Date |
| 775 | 775 | |
| 776 | 776 | Let’s get into something a bit more complicated: a case study showing |
| 777 | 777 | how the concepts lined out above cause Fossil to materially differ in |
| 778 | -day-to-day operation from Git. The goal: you need to check out | |
| 779 | -a version of a project by date. Perhaps your customer gave you a | |
| 780 | -vague bug report referencing only a date rather than a version, or perhaps you’re | |
| 781 | -poking semi-randomly through history to find a “good” version to anchor | |
| 782 | -the start point of a [`bisect`][bis] operation. | |
| 783 | - | |
| 784 | -My search engine’s first result for “git checkout by date” gives [a | |
| 785 | -highly-upvoted accepted answer on Stack Overflow][gcod]. It gives two | |
| 786 | -alternative commands, the first of which is based on Git’s [`rev-parse` | |
| 787 | -feature][grp]: | |
| 778 | +day-to-day operation from Git. | |
| 779 | + | |
| 780 | +Why would you want to check out a version of a project by date? Perhaps | |
| 781 | +because your customer gave you a vague bug report referencing only a | |
| 782 | +date rather than a version. Or, you may be poking semi-randomly through | |
| 783 | +history to find a “good” version to anchor the start point of a | |
| 784 | +[`bisect`][bis] operation. | |
| 785 | + | |
| 786 | +My search engine’s first result for “git checkout by date” is [this | |
| 787 | +highly-upvoted accepted Stack Overflow answer][gcod]. The first command | |
| 788 | +it gives is based on Git’s [`rev-parse` feature][grp]: | |
| 788 | 789 | |
| 789 | 790 | git checkout master@{2020-03-17} |
| 790 | 791 | |
| 791 | -It’s a bit cryptic, but that’s not its major flaw: it only works if the | |
| 792 | -target commit is in Git’s [reflog], which Git [automatically | |
| 793 | -prunes][gle] to 90 days of history, by default. Worse, the command won’t | |
| 794 | -fail outright if the reflog can’t resolve the given date, it will | |
| 795 | -instead give an *incorrect* result on `stdout`, being the closest it can | |
| 796 | -come to your requested date, even if that’s months or years out from | |
| 797 | -your target! I’ve even managed to get it to give an incorrect result | |
| 798 | -without the warning by running it on stale Git clones. | |
| 799 | - | |
| 800 | -In other words, Git tries its best, and it may or may not warn if it | |
| 801 | -fails, but it absolutely should never be trusted, because it’s working | |
| 802 | -from a purgeable and possibly-stale local cache. | |
| 792 | +There are a number of weaknesses in this command. From least to most | |
| 793 | +critical: | |
| 794 | + | |
| 795 | +1. It’s a bit cryptic. Leave off the refname or punctuation, and it | |
| 796 | + means something else. You cannot simplify the cryptic incantation in | |
| 797 | + the typical use case. | |
| 798 | + | |
| 799 | +2. A date string in Git without a time will be interpreted as | |
| 800 | + “[at localtime on that date][gapxd],” so the command means something | |
| 801 | + different from one second to the next! If there are multiple commits | |
| 802 | + on that date, that command can give different results depending on | |
| 803 | + the time of day you run it. | |
| 804 | + | |
| 805 | +3. It gives misleading output if there is no close match for the date | |
| 806 | + in target commit in the local [reflog]. On a fresh clone, the reflog | |
| 807 | + is empty, and even on a well-established clone, Git [automatically | |
| 808 | + prunes][gle] the reflog to 90 days of history by default. This means | |
| 809 | + the command above can give different results from one machine to the | |
| 810 | + next, or even from one day to the next on the same clone. | |
| 811 | + | |
| 812 | + The command won’t fail outright if the reflog can’t resolve the | |
| 813 | + given date: it simply gives the closest commit it can come up with, | |
| 814 | + even if it’s months or years out from your target! Sometimes it | |
| 815 | + gives a warning about the reflog not going back far enough to give a | |
| 816 | + useful result, and sometimes it doesn’t. If you’re on a fresh clone, | |
| 817 | + you are likely to get the “tip” commit’s revision ID no matter what | |
| 818 | + date value you give. | |
| 819 | + | |
| 820 | + Git tries its best, but because it’s working from a purgeable and | |
| 821 | + possibly-stale local cache, you cannot trust its results. | |
| 822 | + | |
| 823 | +We cannot recommend this command at all. It’s unreliable even in the | |
| 824 | +best case. | |
| 803 | 825 | |
| 804 | 826 | That same Stack Overflow answer therefore goes on to recommend an |
| 805 | 827 | entirely different command: |
| 806 | 828 | |
| 807 | 829 | git checkout $(git rev-list -n 1 --first-parent --before="2020-03-17" master) |
| @@ -811,15 +833,10 @@ | ||
| 811 | 833 | part because of its “small tools loosely joined” design philosophy. This |
| 812 | 834 | sort of command is therefore composed piece by piece: |
| 813 | 835 | |
| 814 | 836 | <center>◆ ◆ ◆</center> |
| 815 | 837 | |
| 816 | -**Given:** Git lacks a reliable lookup-by-date index into its log. | |
| 817 | - | |
| 818 | -**Goal:** Find the commit ID nearest a given date, you poor unfortunate | |
| 819 | -user. | |
| 820 | - | |
| 821 | 838 | “Oh, I know, I’ll search the rev-list, which outputs commit IDs by |
| 822 | 839 | parsing the log backwards from `HEAD`! Easy!” |
| 823 | 840 | |
| 824 | 841 | git rev-list --before=2020-03-17 |
| 825 | 842 | |
| @@ -859,41 +876,38 @@ | ||
| 859 | 876 | |
| 860 | 877 | And too bad if you’re a Windows user who doesn’t want to use [Git |
| 861 | 878 | Bash][gbash], since neither of the stock OS command shells have a |
| 862 | 879 | command interpolation feature needed to run that horrid command. |
| 863 | 880 | |
| 864 | -All of the command examples above were done on [Git’s own | |
| 865 | -repository][gitgh]. Your results with the first command — the one based | |
| 866 | -on [Git’s `rev-parse` feature][grp] — will vary depending on the state | |
| 867 | -of your local reflog. | |
| 868 | - | |
| 869 | -The date we’re using is simply our attempt to produce an example that | |
| 870 | -always points at the same merge commit. As I write this, it’s pointing | |
| 871 | -at [this one][gmc], but this is my third attempt: prior examples | |
| 872 | -(2020-04-12 and 2020-03-12) broke for no obvious reason, suggesting that | |
| 873 | -a given date in the above command isn’t always guaranteed to give the | |
| 874 | -same commit. These example dates are far enough back in history that I | |
| 875 | -doubt this is due to history rewriting. My pet hypothesis is that Git | |
| 876 | -isn’t always traversing the log strictly in date order, and the order of | |
| 877 | -entries in the log can shift about from one clone to the next, so the | |
| 878 | -commit “before” a given date might differ from one to the next. If | |
| 879 | -that’s true, then even the second command isn’t wholly reliable. | |
| 881 | +This alternative command still has weakness #2 above: if you run the | |
| 882 | +second `git show` command above on [Git’s own repository][gitgh], your | |
| 883 | +results may vary because there were four non-merge commits to Git on the | |
| 884 | +17th of March, 2020. | |
| 880 | 885 | |
| 881 | 886 | You may be asking with an exasperated huff, “What is your *point*, man?” |
| 882 | 887 | The point is that the equivalent in Fossil is simply: |
| 883 | 888 | |
| 884 | 889 | fossil up 2020-03-17 |
| 885 | 890 | |
| 886 | -…which will *always* give the commit closest to the 17th of March, 2020, | |
| 887 | -no matter whether you do it on a fresh clone or a stale one because of | |
| 888 | -Fossil’s autosync feature. Because this uses a SQLite indexed | |
| 889 | -“`ORDER BY`” query, the answer won’t shift about from one clone to the | |
| 890 | -next. | |
| 891 | - | |
| 892 | -In Git terms, Fossil’s “reflog” is always complete and up-to-date. | |
| 893 | - | |
| 891 | +…which will *always* give the commit closest to midnight UTC on the 17th | |
| 892 | +of March, 2020, no matter whether you do it on a fresh clone or a stale | |
| 893 | +one. The answer won’t shift about from one clone to the next or from | |
| 894 | +one local time of day to the next. We owe this reliability and stability | |
| 895 | +to three Fossil design choices: | |
| 896 | + | |
| 897 | +* Parse timestamps from all commits on clone into a local commit index, | |
| 898 | + then maintain that index through subsequent commits and syncs. | |
| 899 | + | |
| 900 | +* Use an indexed SQL `ORDER BY` query to match timestamps to commit | |
| 901 | + IDs for a fast and consistent result. | |
| 902 | + | |
| 903 | +* Round timestamp strings up using [rules][frud] consistent across | |
| 904 | + computers and local time of day. | |
| 905 | + | |
| 906 | +[frud]: https://fossil-scm.org/home/file/src/name.c?ci=d2a59b03727bc3&ln=122-141 | |
| 894 | 907 | [gbash]: https://appuals.com/what-is-git-bash/ |
| 908 | +[gapxd]: https://github.com/git/git/blob/7f7ebe054a/date.c#L1298-L1300 | |
| 895 | 909 | [gcod]: https://stackoverflow.com/a/6990682/142454 |
| 896 | 910 | [gdh]: https://www.git-tower.com/learn/git/faq/detached-head-when-checkout-commit/ |
| 897 | 911 | [gitgh]: https://github.com/git/git/ |
| 898 | 912 | [gle]: https://git-scm.com/docs/git-reflog#_options_for_expire |
| 899 | 913 | [gmc]: https://github.com/git/git/commit/67b0a24910fbb23c8f5e7a2c61c339818bc68296 |
| 900 | 914 |
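
The time-of-day dependence described in weakness #2 can be illustrated with a small Python sketch. It is not Git's or Fossil's actual code: the commit list, the function names, and the choice of midnight UTC as the fixed completion rule are all hypothetical, and UTC is used throughout in place of local time to keep the example deterministic. It only shows why completing a bare date with the current time of day makes the selected commit drift, while a fixed completion rule does not.

```python
from datetime import date, datetime, time, timezone

# Hypothetical commit log: (commit id, commit timestamp in UTC).
COMMITS = [
    ("aaa111", datetime(2020, 3, 16, 22, 0, tzinfo=timezone.utc)),
    ("bbb222", datetime(2020, 3, 17, 9, 30, tzinfo=timezone.utc)),
    ("ccc333", datetime(2020, 3, 17, 15, 45, tzinfo=timezone.utc)),
    ("ddd444", datetime(2020, 3, 18, 8, 0, tzinfo=timezone.utc)),
]


def latest_at_or_before(commits, cutoff):
    """Return the id of the newest commit at or before cutoff, or None."""
    eligible = [(ts, cid) for cid, ts in commits if ts <= cutoff]
    return max(eligible)[1] if eligible else None


def floating_cutoff(day, now):
    """Complete a bare date with the time of day taken from `now`."""
    return datetime.combine(day, now.timetz())


def fixed_cutoff(day, at=time(0, 0, tzinfo=timezone.utc)):
    """Complete a bare date with a fixed time of day (assumed rule)."""
    return datetime.combine(day, at)


if __name__ == "__main__":
    day = date(2020, 3, 17)
    morning = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
    evening = datetime(2024, 1, 1, 20, 0, tzinfo=timezone.utc)

    # Same date string, different moments of invocation.
    print(latest_at_or_before(COMMITS, floating_cutoff(day, morning)))
    print(latest_at_or_before(COMMITS, floating_cutoff(day, evening)))

    # The fixed rule never consults the clock.
    print(latest_at_or_before(COMMITS, fixed_cutoff(day)))
```

With the floating rule, the morning and evening runs pick different commits because two of the sample commits fall on the requested date. The fixed rule returns the same commit on every run.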