Fossil SCM

Added first code regarding import from cvs, processing a CVSROOT/history file. Looks good, except that the history I have is incomplete, truncated at the beginning. Extended my notes with results from this experiment, thinking about a possible different method.

aku 2007-08-31 04:57 trunk
Commit f166b0a63c5e75950de23d297d46e03dafaf4e51
+84
--- ci_cvs.txt
+++ ci_cvs.txt
@@ -1,5 +1,89 @@
1
+===============================================================================
2
+
3
+First experimental codes ...
4
+
5
+toosl/import-cvs.tcl
6
+tools/lib/rcsparser.tcl
7
+
8
+No actual import, right now only working on getting csets right. The
9
+code uses CVSROOT/history as foundation, and augments that with data
10
+from the individual RCS files (commit messages).
11
+
12
+Statistics of a run ...
13
+ 3516 csets.
14
+
15
+ 1545 breaks on user change
16
+ 558 breaks on file duplicate
17
+ 13 breaks on branch/trunk change
18
+ 1402 breaks on commit message change
19
+
20
+Time statistics ...
21
+ 3297 were processed in <= 1 seconds (93.77%)
22
+ 217 were processed in between 2 seconds and 14 minutes.
23
+ 1 was processed in ~41 minutes
24
+ 1 was processed in ~22 hours
25
+
26
+Time fuzz - Differences between csets range from 0 seconds to 66
27
+days. Needs stats analysis to see if there is an obvious break. Even
28
+so the times within csets and between csets overlap a great deal,
29
+making time a bad criterium for cset separation, IMHO.
30
+
31
+Leaving that topic, back to the current cset separator ...
32
+
33
+It has a problem:
34
+ The history file is not starting at the root!
35
+
36
+Examples:
37
+ The first three changesets are
38
+
39
+ =============================/user
40
+ M {Wed Nov 22 09:28:49 AM PST 2000} ericm 1.4 tcllib/modules/ftpd/ChangeLog
41
+ M {Wed Nov 22 09:28:49 AM PST 2000} ericm 1.7 tcllib/modules/ftpd/ftpd.tcl
42
+ files: 2
43
+ delta: 0
44
+ range: 0 seconds
45
+ =============================/cmsg
46
+ M {Wed Nov 29 02:14:33 PM PST 2000} ericm 1.3 tcllib/aclocal.m4
47
+ files: 1
48
+ delta:
49
+ range: 0 seconds
50
+ =============================/cmsg
51
+ M {Sun Feb 04 12:28:35 AM PST 2001} ericm 1.9 tcllib/modules/mime/ChangeLog
52
+ M {Sun Feb 04 12:28:35 AM PST 2001} ericm 1.12 tcllib/modules/mime/mime.tcl
53
+ files: 2
54
+ delta: 0
55
+ range: 0 seconds
56
+
57
+All csets modify files which already have several revisions. We have
58
+no csets from before that in the history, but these csets are in the
59
+RCS files.
60
+
61
+I wonder, is SF maybe removing old entries from the history when it
62
+grows too large ?
63
+
64
+This also affects incremental import ... I cannot assume that the
65
+history always grows. It may shrink ... I cannot keep an offset, will
66
+have to record the time of the last entry, or even the full entry
67
+processed last, to allow me to skip ahead to anything not known yet.
68
+
69
+I might have to try to implement the algorithm outlined below,
70
+matching the revision trees of the individual RCS files to each other
71
+to form the global tree of revisions. Maybe we can use the history to
72
+help in the matchup, for the parts where we do have it.
73
+
74
+Wait. This might be easier ... Take the delta information from the RCS
75
+files and generate a fake history ... Actually, this might even allow
76
+us to create a total history ... No, not quite, the merge entries the
77
+actual history may contain will be missing. These we can mix in from
78
+the actual history, as much as we have.
79
+
80
+Still, lets try that, a fake history, and then run this script on it
81
+to see if/where are differences.
82
+
83
+===============================================================================
84
+
185
286
Notes about CVS import, regarding CVS.
387
488
- Problem: CVS does not really track changesets, but only individual
589
revisions of files. To recover changesets it is necessary to look at
690
791
ADDED tools/import-cvs.tcl
892
ADDED tools/lib/pkgIndex.tcl
993
ADDED tools/lib/rcsparser.tcl
--- ci_cvs.txt
+++ ci_cvs.txt
@@ -1,5 +1,89 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2 Notes about CVS import, regarding CVS.
3
4 - Problem: CVS does not really track changesets, but only individual
5 revisions of files. To recover changesets it is necessary to look at
6
7 DDED tools/import-cvs.tcl
8 DDED tools/lib/pkgIndex.tcl
9 DDED tools/lib/rcsparser.tcl
--- ci_cvs.txt
+++ ci_cvs.txt
@@ -1,5 +1,89 @@
1 ===============================================================================
2
3 First experimental codes ...
4
5 toosl/import-cvs.tcl
6 tools/lib/rcsparser.tcl
7
8 No actual import, right now only working on getting csets right. The
9 code uses CVSROOT/history as foundation, and augments that with data
10 from the individual RCS files (commit messages).
11
12 Statistics of a run ...
13 3516 csets.
14
15 1545 breaks on user change
16 558 breaks on file duplicate
17 13 breaks on branch/trunk change
18 1402 breaks on commit message change
19
20 Time statistics ...
21 3297 were processed in <= 1 seconds (93.77%)
22 217 were processed in between 2 seconds and 14 minutes.
23 1 was processed in ~41 minutes
24 1 was processed in ~22 hours
25
26 Time fuzz - Differences between csets range from 0 seconds to 66
27 days. Needs stats analysis to see if there is an obvious break. Even
28 so the times within csets and between csets overlap a great deal,
29 making time a bad criterium for cset separation, IMHO.
30
31 Leaving that topic, back to the current cset separator ...
32
33 It has a problem:
34 The history file is not starting at the root!
35
36 Examples:
37 The first three changesets are
38
39 =============================/user
40 M {Wed Nov 22 09:28:49 AM PST 2000} ericm 1.4 tcllib/modules/ftpd/ChangeLog
41 M {Wed Nov 22 09:28:49 AM PST 2000} ericm 1.7 tcllib/modules/ftpd/ftpd.tcl
42 files: 2
43 delta: 0
44 range: 0 seconds
45 =============================/cmsg
46 M {Wed Nov 29 02:14:33 PM PST 2000} ericm 1.3 tcllib/aclocal.m4
47 files: 1
48 delta:
49 range: 0 seconds
50 =============================/cmsg
51 M {Sun Feb 04 12:28:35 AM PST 2001} ericm 1.9 tcllib/modules/mime/ChangeLog
52 M {Sun Feb 04 12:28:35 AM PST 2001} ericm 1.12 tcllib/modules/mime/mime.tcl
53 files: 2
54 delta: 0
55 range: 0 seconds
56
57 All csets modify files which already have several revisions. We have
58 no csets from before that in the history, but these csets are in the
59 RCS files.
60
61 I wonder, is SF maybe removing old entries from the history when it
62 grows too large ?
63
64 This also affects incremental import ... I cannot assume that the
65 history always grows. It may shrink ... I cannot keep an offset, will
66 have to record the time of the last entry, or even the full entry
67 processed last, to allow me to skip ahead to anything not known yet.
68
69 I might have to try to implement the algorithm outlined below,
70 matching the revision trees of the individual RCS files to each other
71 to form the global tree of revisions. Maybe we can use the history to
72 help in the matchup, for the parts where we do have it.
73
74 Wait. This might be easier ... Take the delta information from the RCS
75 files and generate a fake history ... Actually, this might even allow
76 us to create a total history ... No, not quite, the merge entries the
77 actual history may contain will be missing. These we can mix in from
78 the actual history, as much as we have.
79
80 Still, lets try that, a fake history, and then run this script on it
81 to see if/where are differences.
82
83 ===============================================================================
84
85
86 Notes about CVS import, regarding CVS.
87
88 - Problem: CVS does not really track changesets, but only individual
89 revisions of files. To recover changesets it is necessary to look at
90
91 DDED tools/import-cvs.tcl
92 DDED tools/lib/pkgIndex.tcl
93 DDED tools/lib/rcsparser.tcl
--- a/tools/import-cvs.tcl
+++ b/tools/import-cvs.tcl
@@ -0,0 +1,370 @@
1
+#!/bin/sh
2
+# -*- tcl -*- \
3
+exec tclsh "$0" ${1+"$@"}
4
+
5
+# -----------------------------------------------------------------------------
6
+# Make private packages accessible.
7
+
8
+lappend auto_path [file join [file dirname [info script]] lib]
9
+package require rcsparser
10
+package require fileutil
11
+
12
+# -----------------------------------------------------------------------------
13
+# Repository management (CVS)
14
+
15
+namespace eval ::cvs {
16
+ variable base ; set base {} ; # Repository toplevel directory.
17
+ variable npaths ; array set npaths {} ; # path -> actual path mapping.
18
+ variable rpaths ; array set rpaths {} ; # path -> rcs file mapping.
19
+ variable cmsg ; array set cmsg {} ; # Cache of commit messages.
20
+}
21
+
22
+proc ::cvs::hextime {hex} {
23
+ set t 0
24
+ foreach d [string map {
25
+ a 10 b 11 c 12 d 13 e 14 f 15
26
+ A 10 B 11 C 12 D 13 E 14 F 15
27
+ } [split $hex {}]] {
28
+ set t [expr {($t << 4) + $d}];#horner
29
+ }
30
+ return $t
31
+}
32
+
33
+proc ::cvs::at {path} {
34
+ variable base $path
35
+ return
36
+}
37
+
38
+proc ::cvs::cmsg {path rev} {
39
+ variable cmsg
40
+ set key [list $path $rev]
41
+
42
+ if {![info exists cmsg($key)]} {
43
+ set rcs [cvs::rcsfile $path]
44
+
45
+ #puts stderr "scan $path => $rcs"
46
+
47
+ array set p [::rcsparser::process $rcs]
48
+
49
+ foreach {r msg} $p(commit) {
50
+ set cmsg([list $path $r]) $msg
51
+ }
52
+
53
+ if {![info exists cmsg($key)]} {
54
+ return -code error "Bogus revision $rev of file $path"
55
+ }
56
+ }
57
+
58
+ return $cmsg($key)
59
+}
60
+
61
+proc ::cvs::norm {path} {
62
+ variable npaths
63
+ if {![info exists npaths($path)]} {
64
+ set npaths($path) [NormFile $path]
65
+ }
66
+ return $npaths($path)
67
+}
68
+
69
+proc ::cvs::NormFile {path} {
70
+ variable base
71
+
72
+ set f $base/$path,v
73
+ if {[::file exists $f]} {return $path}
74
+
75
+ set hd [::file dirname $path]
76
+ set tl [::file tail $path]
77
+
78
+ set f $base/$hd/Attic/$tl,v
79
+ if {[::file exists $f]} {return $path}
80
+
81
+ # Bad. The dir can be truncated, i.e. it may not be an exact
82
+ # subdirectory of base, but deeper inside, with parents between it
83
+ # and base left out. Example (from the tcllib history file):
84
+ #
85
+ # M3f1d1245|afaupell|<remote>|ipentry|1.2|ChangeLog
86
+ # The correct path is 'tklib/modules/ipentry'.
87
+ # This we have to resolve too.
88
+
89
+ # normalize dance - old fileutil, modern fileutil (cvs head) doesn't do that.
90
+ set bx [file normalize $base]
91
+ foreach c [fileutil::findByPattern $bx -glob $hd] {
92
+ set cx [fileutil::stripPath $bx $c]
93
+ set c $base/$cx
94
+
95
+ set f $c/$tl,v
96
+ if {[::file exists $f]} {return $cx/$tl}
97
+ set f $c/Attic/$tl,v
98
+ if {[::file exists $f]} {return $cx/$tl}
99
+ }
100
+
101
+ puts stderr <$f>
102
+ return -code error "Unable to locate actual file for $path"
103
+}
104
+
105
+proc ::cvs::rcsfile {path} {
106
+ variable rpaths
107
+ if {![info exists rpaths($path)]} {
108
+ set rpaths($path) [RcsFile $path]
109
+ }
110
+ return $rpaths($path)
111
+}
112
+
113
+proc ::cvs::RcsFile {path} {
114
+ variable base
115
+
116
+ set f $base/$path,v
117
+ if {[::file exists $f]} {return $f}
118
+
119
+ set hd [::file dirname $path]
120
+ set tl [::file tail $path]
121
+
122
+ set f $base/$hd/Attic/$tl,v
123
+ if {[::file exists $f]} {return $f}
124
+
125
+ # We do not have truncated directories here, assuming that only
126
+ # norm paths are fed into this command.
127
+
128
+ if 0 {
129
+ # Bad. The dir can be truncated, i.e. it may not be an exact
130
+ # subdirectory of base, but deeper inside, with parents
131
+ # between it and base left out. Example (from the tcllib
132
+ # history file):
133
+ #
134
+ # M3f1d1245|afaupell|<remote>|ipentry|1.2|ChangeLog The
135
+ # correct path is 'tklib/modules/ipentry'. This we have to
136
+ # resolve too.
137
+
138
+ # normalize dance - old fileutil, modern fileutil (cvs head)
139
+ # doesn't do that.
140
+ set bx [file normalize $base]
141
+ foreach c [fileutil::findByPattern $bx -glob $hd] {
142
+ set c $base/[fileutil::stripPath $bx $c]
143
+
144
+ set f $c/$tl,v
145
+ if {[::file exists $f]} {return $f}
146
+ set f $c/Attic/$tl,v
147
+ if {[::file exists $f]} {return $f}
148
+ }
149
+ }
150
+
151
+ puts stderr <$f>
152
+ return -code error "Unable to locate rcs file for $path"
153
+}
154
+
155
+proc ::cvs::history {} {
156
+ variable base
157
+ return $base/CVSROOT/history
158
+}
159
+
160
+# -----------------------------------------------------------------------------
161
+
162
+# -----------------------------------------------------------------------------
163
+
164
+cvs::at [lindex $argv 0]
165
+
166
+#puts [::cvs::norm ipentry/ChangeLog]
167
+#exit
168
+
169
+#changeset state
170
+global cs csf
171
+array set cs {
172
+ start {} end {} cm {}
173
+ usr {} dt {}
174
+}
175
+array set csf {}
176
+
177
+proc rh {} {
178
+ global argv cs csf repo
179
+
180
+ set f [open [cvs::history] r]
181
+
182
+ while {[gets $f line] >= 0} {
183
+ # Decode line
184
+ foreach {op usr _ dir rev file} [split [string trim $line] |] break
185
+ set ts [cvs::hextime [string range $op 1 end]]
186
+ set op [string index $op 0]
187
+
188
+ # Filter out irrelevant parts
189
+ if {$op eq "O"} continue ; # checkout
190
+ if {$op eq "F"} continue ; # release
191
+ if {$op eq "T"} continue ; # rtag
192
+ if {$op eq "W"} continue ; # delete on update
193
+ if {$op eq "U"} continue ; # update
194
+ if {$op eq "P"} continue ; # update by patch
195
+ #if {$op eq "G"} continue ; # merge on update - FUTURE - identifies mergepoints.
196
+ if {$op eq "C"} continue ; # conflict on update - s.a.
197
+ if {$op eq "E"} continue ; # export
198
+ # left types
199
+ # M: commit
200
+ # A: addition
201
+ # R: removal
202
+
203
+ set df $dir/$file
204
+ if {[newcs $op $usr $ts $rev df cause]} {
205
+
206
+ # NOTE 1: ChangeSets containing CVSROOT => remove such files.
207
+ # NOTE 2: Empty changesets, ignore.
208
+
209
+ #commit
210
+ csstats
211
+
212
+ if {$cause eq "cmsg"} {
213
+set msg
214
+ } else {
215
+set msg ""
216
+ }
217
+
218
+ if {$cs(end) ne ""} {
219
+ puts =============================/$cause\ delta\ [expr {$ts - $cs(end)}]
220
+ } else {
221
+ puts =============================/$cause
222
+ }
223
+ csclear
224
+ }
225
+
226
+ # Note: newcs normalizes df, in case the log information is
227
+ # bogus. So the df here may be different from before newcs
228
+ csadd $op $usr $ts $rev $df
229
+ # apply modification to workspace
230
+ }
231
+}
232
+
233
+proc newcs {op usr ts rev dfv rv} {
234
+ global cs csf
235
+ upvar 1 $rv reason $dfv df
236
+
237
+ # Logic to detect when a new change set begins. A new change sets
238
+ # has started with the current entry when the entry
239
+ #
240
+ # 1. is for a different user than the last.
241
+ # 2. tries to add a file to the changeset which is already part of it.
242
+ # 3.is on the trunk, and the last on a branch, or vice versa.
243
+ # 4. the current entry has a different commit message than the last.
244
+
245
+ set df [cvs::norm $df]
246
+
247
+ # User changed
248
+ if {$usr ne $cs(usr)} {
249
+ set reason user
250
+ return 1
251
+ }
252
+
253
+ # File is already in the changeset
254
+ if {[info exists csf($df)]} {
255
+ set reason file
256
+ return 1
257
+ }
258
+
259
+ # last/current are different regarding trunk/branch
260
+ set depth [llength [split $rev .]]
261
+ if {($cs(lastd) == 2) != ($depth == 2)} {
262
+ set reason branch
263
+ return 1
264
+ }
265
+
266
+ # Commit message changed
267
+ if {[cvs::cmsg $cs(lastf) $cs(lastr)] ne [cvs::cmsg $df $rev]} {
268
+ set reason cmsg
269
+ return 1
270
+ }
271
+
272
+ # Same changeset
273
+ return 0
274
+}
275
+
276
+proc csclear {} {
277
+ global cs csf
278
+ array set cs {start {} usr {} end {} dt {}}
279
+ array unset csf *
280
+ return
281
+}
282
+
283
+proc csadd {op usr ts rev df} {
284
+ global cs csf
285
+
286
+ if {$cs(usr) eq ""} {set cs(usr) $usr}
287
+ if {$cs(start) eq ""} {
288
+ set cs(start) $ts
289
+ } else {
290
+ lappend cs(dt) [expr {$ts - $cs(end)}]
291
+ }
292
+ set cs(end) $ts
293
+
294
+ set csf($df) [list $op $rev]
295
+ set cs(lastf) $df
296
+ set cs(lastr) $rev
297
+ set cs(lastd) [llength [split $rev .]]
298
+
299
+ puts [list $op [clock format $ts] $usr $rev $df]
300
+ return
301
+}
302
+
303
+proc csstats {} {
304
+ global cs csf
305
+
306
+ if {$cs(start) eq ""} return
307
+
308
+ puts "files: [array size csf]"
309
+ puts "delta: $cs(dt)"
310
+ puts "range: [expr {$cs(end) - $cs(start)}] seconds"
311
+ return
312
+}
313
+
314
+rh
315
+
316
+exit
317
+
318
+=========================================
319
+new fossil
320
+new fossil workspace
321
+
322
+open history
323
+
324
+foreach line {
325
+ ignore unwanted lines
326
+
327
+ accumulate changesets data
328
+ new change-set => commit and continue
329
+
330
+ current branch and branch of new change different ?
331
+ => move fossil workspace to proper revision.
332
+
333
+ apply change to workspace
334
+ uncommitted
335
+}
336
+
337
+if uncommitted => commit
338
+delete workspace
339
+copy fossil repo to destination
340
+=========================================
341
+
342
+Not dealt with in outline: branches, tags, merging
343
+
344
+=========================================
345
+
346
+complexities
347
+- apply to workspace
348
+ - remove simple, direct translation
349
+ - add => requires extraction of indicated revision from ,v
350
+ - modify => see above (without add following)
351
+
352
+- ,v file => Can be the either dir/file,v, or dir/Attic/file,v
353
+ Both ? Priority ?
354
+
355
+- How to detect changes on branches ?
356
+
357
+- Have to keep knowledge of which branches went there.
358
+ => save change-sets information, + uuid in fossil
359
+ => need only the leaves of each branch, and of branch points.
360
+ => better keep all until complete.
361
+ => uuid can be gotten from 'manifest.uuid' in workspace.
362
+- keep tag information ? (symbolics)
363
+
364
+=========================================
365
+
366
+CVSROOT=ORIGIN
367
+
368
+cvs -d ORIGIN checkout -r REV FILE
369
+Extract specific revision of a specific file.
370
+-q, -Q quietness
--- a/tools/import-cvs.tcl
+++ b/tools/import-cvs.tcl
@@ -0,0 +1,370 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
--- a/tools/import-cvs.tcl
+++ b/tools/import-cvs.tcl
@@ -0,0 +1,370 @@
1 #!/bin/sh
2 # -*- tcl -*- \
3 exec tclsh "$0" ${1+"$@"}
4
5 # -----------------------------------------------------------------------------
6 # Make private packages accessible.
7
8 lappend auto_path [file join [file dirname [info script]] lib]
9 package require rcsparser
10 package require fileutil
11
12 # -----------------------------------------------------------------------------
13 # Repository management (CVS)
14
15 namespace eval ::cvs {
16 variable base ; set base {} ; # Repository toplevel directory.
17 variable npaths ; array set npaths {} ; # path -> actual path mapping.
18 variable rpaths ; array set rpaths {} ; # path -> rcs file mapping.
19 variable cmsg ; array set cmsg {} ; # Cache of commit messages.
20 }
21
22 proc ::cvs::hextime {hex} {
23 set t 0
24 foreach d [string map {
25 a 10 b 11 c 12 d 13 e 14 f 15
26 A 10 B 11 C 12 D 13 E 14 F 15
27 } [split $hex {}]] {
28 set t [expr {($t << 4) + $d}];#horner
29 }
30 return $t
31 }
32
33 proc ::cvs::at {path} {
34 variable base $path
35 return
36 }
37
38 proc ::cvs::cmsg {path rev} {
39 variable cmsg
40 set key [list $path $rev]
41
42 if {![info exists cmsg($key)]} {
43 set rcs [cvs::rcsfile $path]
44
45 #puts stderr "scan $path => $rcs"
46
47 array set p [::rcsparser::process $rcs]
48
49 foreach {r msg} $p(commit) {
50 set cmsg([list $path $r]) $msg
51 }
52
53 if {![info exists cmsg($key)]} {
54 return -code error "Bogus revision $rev of file $path"
55 }
56 }
57
58 return $cmsg($key)
59 }
60
61 proc ::cvs::norm {path} {
62 variable npaths
63 if {![info exists npaths($path)]} {
64 set npaths($path) [NormFile $path]
65 }
66 return $npaths($path)
67 }
68
69 proc ::cvs::NormFile {path} {
70 variable base
71
72 set f $base/$path,v
73 if {[::file exists $f]} {return $path}
74
75 set hd [::file dirname $path]
76 set tl [::file tail $path]
77
78 set f $base/$hd/Attic/$tl,v
79 if {[::file exists $f]} {return $path}
80
81 # Bad. The dir can be truncated, i.e. it may not be an exact
82 # subdirectory of base, but deeper inside, with parents between it
83 # and base left out. Example (from the tcllib history file):
84 #
85 # M3f1d1245|afaupell|<remote>|ipentry|1.2|ChangeLog
86 # The correct path is 'tklib/modules/ipentry'.
87 # This we have to resolve too.
88
89 # normalize dance - old fileutil, modern fileutil (cvs head) doesn't do that.
90 set bx [file normalize $base]
91 foreach c [fileutil::findByPattern $bx -glob $hd] {
92 set cx [fileutil::stripPath $bx $c]
93 set c $base/$cx
94
95 set f $c/$tl,v
96 if {[::file exists $f]} {return $cx/$tl}
97 set f $c/Attic/$tl,v
98 if {[::file exists $f]} {return $cx/$tl}
99 }
100
101 puts stderr <$f>
102 return -code error "Unable to locate actual file for $path"
103 }
104
105 proc ::cvs::rcsfile {path} {
106 variable rpaths
107 if {![info exists rpaths($path)]} {
108 set rpaths($path) [RcsFile $path]
109 }
110 return $rpaths($path)
111 }
112
113 proc ::cvs::RcsFile {path} {
114 variable base
115
116 set f $base/$path,v
117 if {[::file exists $f]} {return $f}
118
119 set hd [::file dirname $path]
120 set tl [::file tail $path]
121
122 set f $base/$hd/Attic/$tl,v
123 if {[::file exists $f]} {return $f}
124
125 # We do not have truncated directories here, assuming that only
126 # norm paths are fed into this command.
127
128 if 0 {
129 # Bad. The dir can be truncated, i.e. it may not be an exact
130 # subdirectory of base, but deeper inside, with parents
131 # between it and base left out. Example (from the tcllib
132 # history file):
133 #
134 # M3f1d1245|afaupell|<remote>|ipentry|1.2|ChangeLog The
135 # correct path is 'tklib/modules/ipentry'. This we have to
136 # resolve too.
137
138 # normalize dance - old fileutil, modern fileutil (cvs head)
139 # doesn't do that.
140 set bx [file normalize $base]
141 foreach c [fileutil::findByPattern $bx -glob $hd] {
142 set c $base/[fileutil::stripPath $bx $c]
143
144 set f $c/$tl,v
145 if {[::file exists $f]} {return $f}
146 set f $c/Attic/$tl,v
147 if {[::file exists $f]} {return $f}
148 }
149 }
150
151 puts stderr <$f>
152 return -code error "Unable to locate rcs file for $path"
153 }
154
155 proc ::cvs::history {} {
156 variable base
157 return $base/CVSROOT/history
158 }
159
160 # -----------------------------------------------------------------------------
161
162 # -----------------------------------------------------------------------------
163
164 cvs::at [lindex $argv 0]
165
166 #puts [::cvs::norm ipentry/ChangeLog]
167 #exit
168
169 #changeset state
170 global cs csf
171 array set cs {
172 start {} end {} cm {}
173 usr {} dt {}
174 }
175 array set csf {}
176
177 proc rh {} {
178 global argv cs csf repo
179
180 set f [open [cvs::history] r]
181
182 while {[gets $f line] >= 0} {
183 # Decode line
184 foreach {op usr _ dir rev file} [split [string trim $line] |] break
185 set ts [cvs::hextime [string range $op 1 end]]
186 set op [string index $op 0]
187
188 # Filter out irrelevant parts
189 if {$op eq "O"} continue ; # checkout
190 if {$op eq "F"} continue ; # release
191 if {$op eq "T"} continue ; # rtag
192 if {$op eq "W"} continue ; # delete on update
193 if {$op eq "U"} continue ; # update
194 if {$op eq "P"} continue ; # update by patch
195 #if {$op eq "G"} continue ; # merge on update - FUTURE - identifies mergepoints.
196 if {$op eq "C"} continue ; # conflict on update - s.a.
197 if {$op eq "E"} continue ; # export
198 # left types
199 # M: commit
200 # A: addition
201 # R: removal
202
203 set df $dir/$file
204 if {[newcs $op $usr $ts $rev df cause]} {
205
206 # NOTE 1: ChangeSets containing CVSROOT => remove such files.
207 # NOTE 2: Empty changesets, ignore.
208
209 #commit
210 csstats
211
212 if {$cause eq "cmsg"} {
213 set msg
214 } else {
215 set msg ""
216 }
217
218 if {$cs(end) ne ""} {
219 puts =============================/$cause\ delta\ [expr {$ts - $cs(end)}]
220 } else {
221 puts =============================/$cause
222 }
223 csclear
224 }
225
226 # Note: newcs normalizes df, in case the log information is
227 # bogus. So the df here may be different from before newcs
228 csadd $op $usr $ts $rev $df
229 # apply modification to workspace
230 }
231 }
232
233 proc newcs {op usr ts rev dfv rv} {
234 global cs csf
235 upvar 1 $rv reason $dfv df
236
237 # Logic to detect when a new change set begins. A new change sets
238 # has started with the current entry when the entry
239 #
240 # 1. is for a different user than the last.
241 # 2. tries to add a file to the changeset which is already part of it.
242 # 3.is on the trunk, and the last on a branch, or vice versa.
243 # 4. the current entry has a different commit message than the last.
244
245 set df [cvs::norm $df]
246
247 # User changed
248 if {$usr ne $cs(usr)} {
249 set reason user
250 return 1
251 }
252
253 # File is already in the changeset
254 if {[info exists csf($df)]} {
255 set reason file
256 return 1
257 }
258
259 # last/current are different regarding trunk/branch
260 set depth [llength [split $rev .]]
261 if {($cs(lastd) == 2) != ($depth == 2)} {
262 set reason branch
263 return 1
264 }
265
266 # Commit message changed
267 if {[cvs::cmsg $cs(lastf) $cs(lastr)] ne [cvs::cmsg $df $rev]} {
268 set reason cmsg
269 return 1
270 }
271
272 # Same changeset
273 return 0
274 }
275
276 proc csclear {} {
277 global cs csf
278 array set cs {start {} usr {} end {} dt {}}
279 array unset csf *
280 return
281 }
282
283 proc csadd {op usr ts rev df} {
284 global cs csf
285
286 if {$cs(usr) eq ""} {set cs(usr) $usr}
287 if {$cs(start) eq ""} {
288 set cs(start) $ts
289 } else {
290 lappend cs(dt) [expr {$ts - $cs(end)}]
291 }
292 set cs(end) $ts
293
294 set csf($df) [list $op $rev]
295 set cs(lastf) $df
296 set cs(lastr) $rev
297 set cs(lastd) [llength [split $rev .]]
298
299 puts [list $op [clock format $ts] $usr $rev $df]
300 return
301 }
302
303 proc csstats {} {
304 global cs csf
305
306 if {$cs(start) eq ""} return
307
308 puts "files: [array size csf]"
309 puts "delta: $cs(dt)"
310 puts "range: [expr {$cs(end) - $cs(start)}] seconds"
311 return
312 }
313
314 rh
315
316 exit
317
318 =========================================
319 new fossil
320 new fossil workspace
321
322 open history
323
324 foreach line {
325 ignore unwanted lines
326
327 accumulate changesets data
328 new change-set => commit and continue
329
330 current branch and branch of new change different ?
331 => move fossil workspace to proper revision.
332
333 apply change to workspace
334 uncommitted
335 }
336
337 if uncommitted => commit
338 delete workspace
339 copy fossil repo to destination
340 =========================================
341
342 Not dealt with in outline: branches, tags, merging
343
344 =========================================
345
346 complexities
347 - apply to workspace
348 - remove simple, direct translation
349 - add => requires extraction of indicated revision from ,v
350 - modify => see above (without add following)
351
352 - ,v file => Can be the either dir/file,v, or dir/Attic/file,v
353 Both ? Priority ?
354
355 - How to detect changes on branches ?
356
357 - Have to keep knowledge of which branches went there.
358 => save change-sets information, + uuid in fossil
359 => need only the leaves of each branch, and of branch points.
360 => better keep all until complete.
361 => uuid can be gotten from 'manifest.uuid' in workspace.
362 - keep tag information ? (symbolics)
363
364 =========================================
365
366 CVSROOT=ORIGIN
367
368 cvs -d ORIGIN checkout -r REV FILE
369 Extract specific revision of a specific file.
370 -q, -Q quietness
--- a/tools/lib/pkgIndex.tcl
+++ b/tools/lib/pkgIndex.tcl
@@ -0,0 +1,3 @@
1
+if {![package vsatisfies [package require Tcl] 8.4]} return
2
+package ifneed1.0rcsparsereded vc::tfossil_cmd.tcl]]
3
+packageeed1.0 [list sol
--- a/tools/lib/pkgIndex.tcl
+++ b/tools/lib/pkgIndex.tcl
@@ -0,0 +1,3 @@
 
 
 
--- a/tools/lib/pkgIndex.tcl
+++ b/tools/lib/pkgIndex.tcl
@@ -0,0 +1,3 @@
1 if {![package vsatisfies [package require Tcl] 8.4]} return
2 package ifneed1.0rcsparsereded vc::tfossil_cmd.tcl]]
3 packageeed1.0 [list sol
--- a/tools/lib/rcsparser.tcl
+++ b/tools/lib/rcsparser.tcl
@@ -0,0 +1,182 @@
1
+Parse RCS files (,v) - ignore the deltas - we need only the commit messages; # Tcllib (cat)parser::process file
2
+#
3
+# ParseAPI--------------------feedback {logcmat)rcs:log::system rcs
4
+ namesp::* return
5
+}
6
+
7
+proc ::vc::rcprogressAPI
8
+
9
+# vc::rcs::parser::procrcs returns a dictionary containing the meta
10
+# data. The following keys are used
11
+#
12
+# Key Meaninrcsion
13
+# 'branch' ?
14
+# 'symbol' dict (symbol -> revision)
15
+# 'lock' dict (symbol -> revision)
16
+# 'crcs'date' (revision -> state)
17
+# 'parent' dict (revision -> parent revircsommit message)
18
+#
19
+# The state 'd----
20
+# Tool packages.# only the meta-data about when revisions were added, the tree
21
+# (branching) structure, commit messages.
22
+#
23
+# The parser is based on Recursive Descent.
24
+
25
+# -----------------------------------------------------------------------------
26
+# Requirements
27
+
28
+package require Tcl 8.4
29
+package require fileutil ; # Tcllib (cat)
30
+package require vc:ace eval ::vc::rcs::parser {
31
+ vc::tools::log::system rcs
32
+ namesp::* return
33
+}
34
+
35
+proc ::vc::rcs::parser::Log {} {
36
+ upvar 1 data data res res
37
+ LiteraAPI
38
+
39
+# vc::rcs::parser::process file
40
+#
41
+# Parses the rcs file and returns a dictionary containing the meta
42
+# following keys are used
43
+#
44
+# Key Meaning
45
+# --- -------
46
+# 'head' head revision
47
+# 'branch' ?
48
+# 'symbol' dict (symbol -> revision)
49
+# 'lock' dict (symbol -> revision)
50
+# 'comment' file comment
51
+# 'expand' ?
52
+# 'date' dict (revision -> date)
53
+# 'author' dict (revision -> author)
54
+# 'state' dict (revision -> state)
55
+# 'parent' dict (revision -> parent revision)
56
+# 'commit' dict (revision -> commit message)
57
+#
58
+# The state 'dead' has special meaning, the user should know that.
59
+
60
+# -----------------------------------------------------------------------------
61
+# API Implementation
62
+
63
+proc ::vc::rcs::parser::configure {key value} {
64
+ variable cache
65
+ switch -exact -- $key {
66
+ -cache {
67
+ set cache $value
68
+ }
69
+ default {
70
+ return -code error "Unknown switch $key, expected one of -cache"
71
+ }
72
+ }
73
+ return
74
+}
75
+
76
+proc ::vc::rcs::parser::process {path} {
77
+ set cachfile exists $cache] &&
78
+ ([file mtime $cach } {
79
+ # Use preparsed data if not invalidated by changes to the
80
+ # archive they are derived from.
81
+ write 4 rcs {Load preparsed data block}
82
+ return [fileutil::cat -encoding binary $cachefile]
83
+ }
84
+
85
+ set res [P]
86
+ }
87
+
88
+ set res [Process $path]
89
+
90
+ # Save parse result for quick pickup by future runs.
91
+ $res
92
+
93
+ return $res
94
+}
95
+
96
+# --ure, commit messages.
97
+#
98
+# The paAuthornly the meta- namesp::* return
99
+}
100
+
101
+proc ::vrn
102
+}
103
+
104
+proc ::vc::rcs::parser::Log {} {
105
+ upvar 1 data data res res
106
+ LiteraAPI
107
+
108
+a
109
+# data. The following keys are used
110
+#
111
+# Key Meaning
112
+# --- -------
113
+# 'head' head revision
114
+# 'branch' ?
115
+# 'symbol' dict (symbol -> revision)
116
+# 'lock' dict (symbol -> revision)
117
+# 'comment' file c
118
+# 'date' dict (IsIdent ;Parse RCS files (,v) - ignore the deltas - we need only the commit messages; # Tcllib (cat)parser::process file
119
+#
120
+# ParseAPI--------------------feedbaces (,v) - ignore the deltas - we need only the commit messages; # Tcllib (cat)parser::process file
121
+#
122
+# ParseAPI--------------------feedback {logcmat)rcs:log::system rcs
123
+ namesp::* return
124
+}
125
+
126
+proc ::vc::rcprogressAPI
127
+
128
+# vc::rcs::parser::procrcs returns a dictionary containing the meta
129
+# data. The following keys are used
130
+#
131
+# Key Meaninrcsion
132
+# 'branch' ?
133
+# 'symbol' dict (symbol -> revision)
134
+# 'lock' dict (symbol -> revision)
135
+# 'crcs'date' dict (revision -> date)
136
+# 'author' dict (revision -> author)
137
+# 'state' dict (revision -> state)
138
+# 'parent' dict (revision -> parent revircsommit message)
139
+#
140
+# The state 'dead' has special meaning, the user should knorcs-----------------------------------------------------------
141
+# Tool packages. Parsing RCS files.
142
+#
143
+# Some of the information in RCS files is skipped ov# ---------------------------------------------
144
+# Tool packages. Parsing RCS files.
145
+#
146
+# Some of the information in RCS files is skipped over, most
147
+# importantly the actual delta texts. The users of this parser need
148
+# only the meta-data about when revisions were added, the tree
149
+# (branching) structure, commit messages.
150
+#
151
+# The parser is based on Recursive Descent--------------------------------------------------------------
152
+# Tool pac[^;]*\s*} $data match> revision)
153
+# 'crcs'Def {keyrsive Desceunset res(idIsIden 'date' -> dict (symbol -> revision)
154
+# 'crcs'date' dict (revision -> date)
155
+# 'author' dict (revisionllib (cat)parser::pParse RCS files (,v) - ignore the deltas - we need only the commit messages; # Tcllib (cat)parser::process file
156
+#
157
+# ParseAPI--------------------feedback {logcmat)rcs:log::system rcs
158
+ namesp::* return
159
+}
160
+
161
+proc ::vc::rcprogressAPI
162
+
163
+# vc::rcs::parser::procrcs returns a dictionary containing the meta
164
+# data. The following keys are used
165
+#
166
+# Key Meaninrcsion
167
+# 'branch' ?
168
+# 'symbol' dict (symbol -> revision)
169
+# 'lock' dict (symbol -> revision)
170
+# 'crcs'date' dict (revision -> date)
171
+# 'author' dict (revision -> author)
172
+# 'state' dict (revision -> state)
173
+# 'parent' dict (revision -> parent revircsommit message)
174
+#
175
+# The state 'dead' has special meaning, the user should knorcs-----------------------------------------------------------
176
+# Tool packages. Parsing RCS files.
177
+#
178
+# Some of the information in RCS files is skipped ov# ---------------------------------------------
179
+# Tool packages. Parsing RCS files.
180
+#
181
+# Some of the information in RCS files is skipped over, most
182
+# importantly t
--- a/tools/lib/rcsparser.tcl
+++ b/tools/lib/rcsparser.tcl
@@ -0,0 +1,182 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
--- a/tools/lib/rcsparser.tcl
+++ b/tools/lib/rcsparser.tcl
@@ -0,0 +1,182 @@
1 Parse RCS files (,v) - ignore the deltas - we need only the commit messages; # Tcllib (cat)parser::process file
2 #
3 # ParseAPI--------------------feedback {logcmat)rcs:log::system rcs
4 namesp::* return
5 }
6
7 proc ::vc::rcprogressAPI
8
9 # vc::rcs::parser::procrcs returns a dictionary containing the meta
10 # data. The following keys are used
11 #
12 # Key Meaninrcsion
13 # 'branch' ?
14 # 'symbol' dict (symbol -> revision)
15 # 'lock' dict (symbol -> revision)
16 # 'crcs'date' (revision -> state)
17 # 'parent' dict (revision -> parent revircsommit message)
18 #
19 # The state 'd----
20 # Tool packages.# only the meta-data about when revisions were added, the tree
21 # (branching) structure, commit messages.
22 #
23 # The parser is based on Recursive Descent.
24
25 # -----------------------------------------------------------------------------
26 # Requirements
27
28 package require Tcl 8.4
29 package require fileutil ; # Tcllib (cat)
30 package require vc:ace eval ::vc::rcs::parser {
31 vc::tools::log::system rcs
32 namesp::* return
33 }
34
35 proc ::vc::rcs::parser::Log {} {
36 upvar 1 data data res res
37 LiteraAPI
38
39 # vc::rcs::parser::process file
40 #
41 # Parses the rcs file and returns a dictionary containing the meta
42 # following keys are used
43 #
44 # Key Meaning
45 # --- -------
46 # 'head' head revision
47 # 'branch' ?
48 # 'symbol' dict (symbol -> revision)
49 # 'lock' dict (symbol -> revision)
50 # 'comment' file comment
51 # 'expand' ?
52 # 'date' dict (revision -> date)
53 # 'author' dict (revision -> author)
54 # 'state' dict (revision -> state)
55 # 'parent' dict (revision -> parent revision)
56 # 'commit' dict (revision -> commit message)
57 #
58 # The state 'dead' has special meaning, the user should know that.
59
60 # -----------------------------------------------------------------------------
61 # API Implementation
62
63 proc ::vc::rcs::parser::configure {key value} {
64 variable cache
65 switch -exact -- $key {
66 -cache {
67 set cache $value
68 }
69 default {
70 return -code error "Unknown switch $key, expected one of -cache"
71 }
72 }
73 return
74 }
75
76 proc ::vc::rcs::parser::process {path} {
77 set cachfile exists $cache] &&
78 ([file mtime $cach } {
79 # Use preparsed data if not invalidated by changes to the
80 # archive they are derived from.
81 write 4 rcs {Load preparsed data block}
82 return [fileutil::cat -encoding binary $cachefile]
83 }
84
85 set res [P]
86 }
87
88 set res [Process $path]
89
90 # Save parse result for quick pickup by future runs.
91 $res
92
93 return $res
94 }
95
96 # --ure, commit messages.
97 #
98 # The paAuthornly the meta- namesp::* return
99 }
100
101 proc ::vrn
102 }
103
104 proc ::vc::rcs::parser::Log {} {
105 upvar 1 data data res res
106 LiteraAPI
107
108 a
109 # data. The following keys are used
110 #
111 # Key Meaning
112 # --- -------
113 # 'head' head revision
114 # 'branch' ?
115 # 'symbol' dict (symbol -> revision)
116 # 'lock' dict (symbol -> revision)
117 # 'comment' file c
118 # 'date' dict (IsIdent ;Parse RCS files (,v) - ignore the deltas - we need only the commit messages; # Tcllib (cat)parser::process file
119 #
120 # ParseAPI--------------------feedbaces (,v) - ignore the deltas - we need only the commit messages; # Tcllib (cat)parser::process file
121 #
122 # ParseAPI--------------------feedback {logcmat)rcs:log::system rcs
123 namesp::* return
124 }
125
126 proc ::vc::rcprogressAPI
127
128 # vc::rcs::parser::procrcs returns a dictionary containing the meta
129 # data. The following keys are used
130 #
131 # Key Meaninrcsion
132 # 'branch' ?
133 # 'symbol' dict (symbol -> revision)
134 # 'lock' dict (symbol -> revision)
135 # 'crcs'date' dict (revision -> date)
136 # 'author' dict (revision -> author)
137 # 'state' dict (revision -> state)
138 # 'parent' dict (revision -> parent revircsommit message)
139 #
140 # The state 'dead' has special meaning, the user should knorcs-----------------------------------------------------------
141 # Tool packages. Parsing RCS files.
142 #
143 # Some of the information in RCS files is skipped ov# ---------------------------------------------
144 # Tool packages. Parsing RCS files.
145 #
146 # Some of the information in RCS files is skipped over, most
147 # importantly the actual delta texts. The users of this parser need
148 # only the meta-data about when revisions were added, the tree
149 # (branching) structure, commit messages.
150 #
151 # The parser is based on Recursive Descent--------------------------------------------------------------
152 # Tool pac[^;]*\s*} $data match> revision)
153 # 'crcs'Def {keyrsive Desceunset res(idIsIden 'date' -> dict (symbol -> revision)
154 # 'crcs'date' dict (revision -> date)
155 # 'author' dict (revisionllib (cat)parser::pParse RCS files (,v) - ignore the deltas - we need only the commit messages; # Tcllib (cat)parser::process file
156 #
157 # ParseAPI--------------------feedback {logcmat)rcs:log::system rcs
158 namesp::* return
159 }
160
161 proc ::vc::rcprogressAPI
162
163 # vc::rcs::parser::procrcs returns a dictionary containing the meta
164 # data. The following keys are used
165 #
166 # Key Meaninrcsion
167 # 'branch' ?
168 # 'symbol' dict (symbol -> revision)
169 # 'lock' dict (symbol -> revision)
170 # 'crcs'date' dict (revision -> date)
171 # 'author' dict (revision -> author)
172 # 'state' dict (revision -> state)
173 # 'parent' dict (revision -> parent revircsommit message)
174 #
175 # The state 'dead' has special meaning, the user should knorcs-----------------------------------------------------------
176 # Tool packages. Parsing RCS files.
177 #
178 # Some of the information in RCS files is skipped ov# ---------------------------------------------
179 # Tool packages. Parsing RCS files.
180 #
181 # Some of the information in RCS files is skipped over, most
182 # importantly t

Keyboard Shortcuts

Open search /
Next entry (timeline) j
Previous entry (timeline) k
Open focused entry Enter
Show this help ?
Toggle theme Top nav button