Fossil SCM

Updated my work list, added first notes about 'cvs import' functionality.

aku 2007-08-28 03:34 trunk
Commit 103c397e4b36b1ddfb4afe52d120489361cac9c8
+50
--- a/ci_cvs.txt
+++ b/ci_cvs.txt
@@ -0,0 +1,50 @@
+ M {Wed Nov 22 09:28:49 AM PST 2000} ericm 1.7 tcllib/modules/ftpd/ftpd.tcl
+ files: 2
+ delta: 0
+ range: 0 seconds
+ =============================/cmsg
+ M {Wed Nov 29 02:14:33 PM PST 2000} ericm 1.3 tcllib/aclocal.m4
+ files: 1
+ delta:
+ range: 0 seconds
+ =============================/cmsg
+ M {Sun Feb 04 12:28:35 AM PST 2001} ericm 1.9 tcllib/modules/mime/ChangeLog
+ M {Sun Feb 04 12:28:35 AM PST 2001} ericm 1.12 tcllib/modules/mime/mime.tcl
+ files: 2
+ delta: 0
+ range: 0 seconds
+
+All csets modify files which already have several revisions. We have
+no csets from before that in the history, but these csets are in the
+RCS files.
+
+I wonder, is SF maybe removing old entries from the history when it
+grows too large?
+
+This also affects incremental import ... I cannot assume that the
+history always grows. It may shrink ... I cannot keep an offset; I will
+have to record the time of the last entry, or even the full entry
+processed last, to allow me to skip ahead to anything not known yet.
+
+I might have to try to implement the algorithm outlined below,
+matching the revision trees of the individual RCS files to each other
+to form the global tree of revisions. Maybe we can use the history to
+help in the matchup, for the parts where we do have it.
+
+Wait. This might be easier ... Take the delta information from the RCS
+files and generate a fake history ... Actually, this might even allow
+us to create a total history ... No, not quite: the merge entries the
+actual history may contain will be missing. These we can mix in from
+the actual history, as much of it as we have.
+
+Still, let's try that: a fake history, and then run this script on it
+to see if/where there are differences.
+
+===============================================================================
+
+
+Notes about CVS import.
+
+- Problem: CVS does not really track changesets, but only individual
+ revisions of files. To recover changesets it is necessary to look at
+ author, branch, timestamp information, and the commit
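The incremental-import concern above (the history may shrink, so a stored byte offset is useless) suggests remembering the full last-processed entry and scanning forward until it reappears. A minimal sketch in Python; the entry strings and helper name are illustrative assumptions, not anything CVS defines:

```python
def entries_after(history, last_seen):
    """Return the entries that follow last_seen in history.

    history   -- list of history entry strings, oldest first
    last_seen -- the full entry processed in the previous run,
                 or None on the very first run
    If last_seen is no longer present (the history shrank or was
    pruned), fall back to reprocessing everything.
    """
    if last_seen is None:
        return history
    try:
        # Find the LAST occurrence, in case an entry repeats.
        idx = len(history) - 1 - history[::-1].index(last_seen)
    except ValueError:
        # Entry vanished: history was truncated; reprocess all.
        return history
    return history[idx + 1:]

old = ["M 2000 ericm 1.7 ftpd.tcl", "M 2000 ericm 1.3 aclocal.m4"]
new = old + ["M 2001 ericm 1.9 ChangeLog"]
print(entries_after(new, old[-1]))  # → ['M 2001 ericm 1.9 ChangeLog']
```

Storing the full entry rather than a timestamp also disambiguates several commits that share the same second.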
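The changeset-recovery problem described at the end of ci_cvs.txt can be attacked by sorting the per-file revisions by time and grouping runs that share author, branch, and (presumably) commit message, splitting whenever the timestamps drift too far apart. A hedged sketch; the `Rev` record fields and the 300-second window are assumptions, not anything CVS specifies:

```python
from dataclasses import dataclass

@dataclass
class Rev:
    time: int      # commit time, seconds since the epoch
    author: str
    branch: str
    msg: str       # commit message text
    file: str
    rev: str       # per-file revision, e.g. "1.7"

def group_changesets(revs, fuzz=300):
    """Group individual file revisions into candidate changesets.

    Revisions with the same (author, branch, msg) whose timestamp lies
    within `fuzz` seconds of the previous member join that changeset;
    anything else starts a new one.
    """
    csets = []
    for r in sorted(revs, key=lambda r: r.time):
        if (csets
                and csets[-1][0].author == r.author
                and csets[-1][0].branch == r.branch
                and csets[-1][0].msg == r.msg
                and r.time - csets[-1][-1].time <= fuzz):
            csets[-1].append(r)
        else:
            csets.append([r])
    return csets
```

With this, the two mime files committed at the same second above would fall into one changeset, and the ftpd commit into another.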
--- a/ci_fossil.txt
+++ b/ci_fossil.txt
@@ -0,0 +1,49 @@
+
+To perform CVS imports for fossil we need at least the ability to
+parse CVS files, i.e. RCS files, with slight differences.
+
+For the general architecture of the import facility we have two major
+paths to choose between.
+
+One is to use an external tool which processes a CVS repository and
+drives fossil through its CLI to insert the found changesets.
+
+The other is to integrate the whole facility into the fossil binary
+itself.
+
+I dislike the second choice. It may be faster, as the implementation
+can use all internal functionality of fossil to perform the import;
+however, it will also bloat the binary with functionality not needed
+most of the time. This becomes especially obvious if more importers
+are to be written, like for monotone, bazaar, mercurial, bitkeeper,
+git, SVN, Arc, etc. Keeping all this out of the core fossil binary is
+IMHO more beneficial in the long term, also from a maintenance point
+of view. The tools can evolve separately. This is especially important
+for CVS, as it will have to deal with lots of broken repositories, all
+different.
+
+However, nothing speaks against looking for common parts in all
+possible import tools, and having these in the fossil core, as a
+general backend all importer macollection of files, some of which may be manifests, others are data
+files, and if it imports them in a random order it might find that
+file X, which was imported first and therefore has no delta
+compression, is actually somewhere in the middle of a line of
+revisions and should be delta-compressed, and then it has to find out
+the predecessor and do the compression, etc.
+
+So depending on how the internal logic of delta-compression is done,
+reconstruct might need more logic to help the lower level achieve good
+compression.
+
+Like, in a first pass determine which files are manifests, and read
+enough of them to determine their parent/child structure, and in a
+second pass actually import them, in topological order, with all
+relevant non-manifest files for a manifest imported at that time
+too. With that the underlying engine would see the files basically in
+the same order as generated by a regular series of commits.
+
+Problems for reconstruct: Files referenced, but not present, and,
+conversely, files present, but not referenced. This can be done as part
+of the second pass, aborting when a missing file is encountered, with
+(un)marking of used files, and at the end we know the unused
+files. Could also be a separate pass between first and second.
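The two-pass scheme sketched above (classify manifests first, then import in topological order) amounts to a topological sort of the manifest parent/child graph. A rough illustration, assuming the hypothetical parent links have already been extracted from the manifests in the first pass; no cycle detection, since a commit DAG should have none:

```python
def import_order(parents):
    """Topologically order manifests so parents precede children.

    parents -- dict mapping manifest id -> list of parent ids
    Returns ids in an order a delta-compressing backend can consume:
    each manifest appears only after every manifest it derives from.
    """
    order, seen = [], set()

    def visit(m):
        if m in seen:
            return
        seen.add(m)
        for p in parents.get(m, []):
            visit(p)          # emit ancestors first
        order.append(m)

    for m in parents:
        visit(m)
    return order

# Tiny lineage: m1 <- m2 <- m3, with m3 also merging m1 directly.
print(import_order({"m1": [], "m2": ["m1"], "m3": ["m2", "m1"]}))
# → ['m1', 'm2', 'm3']
```

Feeding files to the backend in this order mimics a regular series of commits, which is exactly what the note above asks for.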
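The consistency problems named for reconstruct (files referenced but not present, and files present but not referenced) reduce to two set differences, which fits the suggested extra pass between classification and import. A sketch with assumed inputs:

```python
def check_files(referenced, present):
    """Compare the files the manifests reference against the files
    actually found in the repository.

    Returns (missing, unused): referenced-but-absent files, on which
    an import must abort, and present-but-unreferenced files, which
    only need to be reported at the end.
    """
    referenced, present = set(referenced), set(present)
    return sorted(referenced - present), sorted(present - referenced)

missing, unused = check_files(
    referenced=["mime.tcl", "ChangeLog", "ftpd.tcl"],
    present=["mime.tcl", "ChangeLog", "README"])
print(missing)  # → ['ftpd.tcl']
print(unused)   # → ['README']
```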
+2 -2
--- todo-ak.txt
+++ todo-ak.txt
@@ -11,15 +11,15 @@
 
 * Think about exposure of functionality as libraries, i.e. Tcl
   packages. Foundations like delta, etc. first, work up to
   higher-levels.
 
-* Document delta format, delta encoder.
-
 * Document the merge algorithm.
 
 * Document the xfer protocol.
+
+* CVS import. Testcases: Tcl, Tk, Tcllib.
 
 Questions
 
 * In the timeline seen at http://fossil-scm.hwaci.com/fossil/timeline
   the manifest uuids are links to pages providing additional
