Fossil SCM

Proper explanation for the inconsistent results from the "git show" command examples in Case Study 1 of the gitusers doc.

wyoung 2020-11-06 23:25 trunk
Commit 8b1eddef96b524a9067c95a24e3fb6fec0e7a89b89d45fda6fc6f50deb5cbfec
1 file changed +65 -51
--- www/gitusers.md
+++ www/gitusers.md
@@ -773,35 +773,57 @@
 
 ## <a id="cvdate" name="cs1"></a> Case Study 1: Checking Out A Version By Date
 
 Let’s get into something a bit more complicated: a case study showing
 how the concepts lined out above cause Fossil to materially differ in
-day-to-day operation from Git. The goal: you need to check out
-a version of a project by date. Perhaps your customer gave you a
-vague bug report referencing only a date rather than a version, or perhaps you’re
-poking semi-randomly through history to find a “good” version to anchor
-the start point of a [`bisect`][bis] operation.
-
-My search engine’s first result for “git checkout by date” gives [a
-highly-upvoted accepted answer on Stack Overflow][gcod]. It gives two
-alternative commands, the first of which is based on Git’s [`rev-parse`
-feature][grp]:
+day-to-day operation from Git.
+
+Why would you want to check out a version of a project by date? Perhaps
+because your customer gave you a vague bug report referencing only a
+date rather than a version. Or, you may be poking semi-randomly through
+history to find a “good” version to anchor the start point of a
+[`bisect`][bis] operation.
+
+My search engine’s first result for “git checkout by date” is [this
+highly-upvoted accepted Stack Overflow answer][gcod]. The first command
+it gives is based on Git’s [`rev-parse` feature][grp]:
 
 git checkout master@{2020-03-17}
 
-It’s a bit cryptic, but that’s not its major flaw: it only works if the
-target commit is in Git’s [reflog], which Git [automatically
-prunes][gle] to 90 days of history, by default. Worse, the command won’t
-fail outright if the reflog can’t resolve the given date, it will
-instead give an *incorrect* result on `stdout`, being the closest it can
-come to your requested date, even if that’s months or years out from
-your target! I’ve even managed to get it to give an incorrect result
-without the warning by running it on stale Git clones.
-
-In other words, Git tries its best, and it may or may not warn if it
-fails, but it absolutely should never be trusted, because it’s working
-from a purgeable and possibly-stale local cache.
+There are a number of weaknesses in this command. From least to most
+critical:
+
+1. It’s a bit cryptic. Leave off the refname or punctuation, and it
+ means something else. You cannot simplify the cryptic incantation in
+ the typical use case.
+
+2. A date string in Git without a time will be interpreted as
+ “[at localtime on that date][gapxd],” so the command means something
+ different from one second to the next! If there are multiple commits
+ on that date, that command can give different results depending on
+ the time of day you run it.
+
+3. It gives misleading output if there is no close match for the date
+ in target commit in the local [reflog]. On a fresh clone, the reflog
+ is empty, and even on a well-established clone, Git [automatically
+ prunes][gle] the reflog to 90 days of history by default. This means
+ the command above can give different results from one machine to the
+ next, or even from one day to the next on the same clone.
+
+ The command won’t fail outright if the reflog can’t resolve the
+ given date: it simply gives the closest commit it can come up with,
+ even if it’s months or years out from your target! Sometimes it
+ gives a warning about the reflog not going back far enough to give a
+ useful result, and sometimes it doesn’t. If you’re on a fresh clone,
+ you are likely to get the “tip” commit’s revision ID no matter what
+ date value you give.
+
+ Git tries its best, but because it’s working from a purgeable and
+ possibly-stale local cache, you cannot trust its results.
+
+We cannot recommend this command at all. It’s unreliable even in the
+best case.
 
 That same Stack Overflow answer therefore goes on to recommend an
 entirely different command:
 
 git checkout $(git rev-list -n 1 --first-parent --before="2020-03-17" master)
@@ -811,15 +833,10 @@
 part because of its “small tools loosely joined” design philosophy. This
 sort of command is therefore composed piece by piece:
 
 <center>◆  ◆  ◆</center>
 
-**Given:** Git lacks a reliable lookup-by-date index into its log.
-
-**Goal:** Find the commit ID nearest a given date, you poor unfortunate
-user.
-
 “Oh, I know, I’ll search the rev-list, which outputs commit IDs by
 parsing the log backwards from `HEAD`! Easy!”
 
 git rev-list --before=2020-03-17
 
@@ -859,41 +876,38 @@
 
 And too bad if you’re a Windows user who doesn’t want to use [Git
 Bash][gbash], since neither of the stock OS command shells have a
 command interpolation feature needed to run that horrid command.
 
-All of the command examples above were done on [Git’s own
-repository][gitgh]. Your results with the first command — the one based
-on [Git’s `rev-parse` feature][grp] — will vary depending on the state
-of your local reflog.
-
-The date we’re using is simply our attempt to produce an example that
-always points at the same merge commit. As I write this, it’s pointing
-at [this one][gmc], but this is my third attempt: prior examples
-(2020-04-12 and 2020-03-12) broke for no obvious reason, suggesting that
-a given date in the above command isn’t always guaranteed to give the
-same commit. These example dates are far enough back in history that I
-doubt this is due to history rewriting. My pet hypothesis is that Git
-isn’t always traversing the log strictly in date order, and the order of
-entries in the log can shift about from one clone to the next, so the
-commit “before” a given date might differ from one to the next. If
-that’s true, then even the second command isn’t wholly reliable.
+This alternative command still has weakness #2 above: if you run the
+second `git show` command above on [Git’s own repository][gitgh], your
+results may vary because there were four non-merge commits to Git on the
+17th of March, 2020.
 
 You may be asking with an exasperated huff, “What is your *point*, man?”
 The point is that the equivalent in Fossil is simply:
 
 fossil up 2020-03-17
 
-…which will *always* give the commit closest to the 17th of March, 2020,
-no matter whether you do it on a fresh clone or a stale one because of
-Fossil’s autosync feature. Because this uses a SQLite indexed
-“`ORDER BY`” query, the answer won’t shift about from one clone to the
-next.
-
-In Git terms, Fossil’s “reflog” is always complete and up-to-date.
-
+…which will *always* give the commit closest to midnight UTC on the 17th
+of March, 2020, no matter whether you do it on a fresh clone or a stale
+one. The answer won’t shift about from one clone to the next or from
+one local time of day to the next. We owe this reliability and stability
+to three Fossil design choices:
+
+* Parse timestamps from all commits on clone into a local commit index,
+ then maintain that index through subsequent commits and syncs.
+
+* Use an indexed SQL `ORDER BY` query to match timestamps to commit
+ IDs for a fast and consistent result.
+
+* Round timestamp strings up using [rules][frud] consistent across
+ computers and local time of day.
+
+[frud]: https://fossil-scm.org/home/file/src/name.c?ci=d2a59b03727bc3&ln=122-141
 [gbash]: https://appuals.com/what-is-git-bash/
+[gapxd]: https://github.com/git/git/blob/7f7ebe054a/date.c#L1298-L1300
 [gcod]: https://stackoverflow.com/a/6990682/142454
 [gdh]: https://www.git-tower.com/learn/git/faq/detached-head-when-checkout-commit/
 [gitgh]: https://github.com/git/git/
 [gle]: https://git-scm.com/docs/git-reflog#_options_for_expire
 [gmc]: https://github.com/git/git/commit/67b0a24910fbb23c8f5e7a2c61c339818bc68296
 
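
The three design choices named in the new text (a local commit index, an indexed SQL `ORDER BY` lookup, and deterministic rounding of date strings) can be illustrated with a minimal Python sketch. This is not Fossil's implementation: the `commits` table, its column names, the sample IDs, and the `round_up` helper are all hypothetical, and treating a date-only string as the last second of that UTC day is an assumption about what "round up" means here.

```python
import sqlite3
from datetime import datetime, timezone


def ts(text):
    """Parse 'YYYY-MM-DD HH:MM' as a UTC timestamp (seconds since epoch)."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def round_up(date_str):
    """Assumed rounding rule: a date-only string means 23:59:59 UTC that day.

    The result depends only on the string, never on the local clock or zone.
    """
    day = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return day.timestamp() + 86399


def build_index(commits):
    """Load (id, mtime) pairs into a hypothetical local commit index."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE commits(id TEXT PRIMARY KEY, mtime REAL NOT NULL)")
    db.execute("CREATE INDEX commits_mtime ON commits(mtime)")
    db.executemany("INSERT INTO commits VALUES (?, ?)", commits)
    return db


def lookup_by_date(db, date_str):
    """Return the newest commit ID at or before the rounded-up date."""
    row = db.execute(
        "SELECT id FROM commits WHERE mtime <= ? ORDER BY mtime DESC LIMIT 1",
        (round_up(date_str),),
    ).fetchone()
    return row[0] if row else None


# Made-up sample history; the IDs are placeholders, not real commits.
db = build_index([
    ("aaa111", ts("2020-03-16 09:00")),
    ("bbb222", ts("2020-03-17 08:30")),
    ("ccc333", ts("2020-03-17 21:45")),
    ("ddd444", ts("2020-03-18 00:05")),
])

print(lookup_by_date(db, "2020-03-17"))  # prints ccc333
```

Because the lookup depends only on the stored timestamps and a fixed rounding rule, the same date string resolves to the same ID on any machine and at any time of day, which is the property the commit text attributes to Fossil.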
