To perform CVS imports for fossil we need at least the ability to
parse CVS files, i.e. RCS files, with slight differences.

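To illustrate what parsing involves, here is a minimal sketch, not
fossil's actual parser: the function name and the heavily simplified
grammar are assumptions, and @-quoted strings are ignored. It picks a
few fields out of the admin section of an RCS `,v` file:

```python
import re

def parse_rcs_admin(text):
    """Extract a few admin-section fields (head revision, symbolic
    tags) from the start of an RCS ,v file.  Simplified sketch:
    assumes well-formed input and ignores @-quoted strings."""
    fields = {}
    m = re.search(r"^head\s+([0-9.]+);", text, re.M)
    if m:
        fields["head"] = m.group(1)
    m = re.search(r"^symbols\s*(.*?);", text, re.S | re.M)
    if m:
        fields["symbols"] = dict(pair.split(":")
                                 for pair in m.group(1).split())
    return fields

sample = "head\t1.2;\naccess;\nsymbols\n\trelease_1:1.1;\nlocks; strict;\n"
fields = parse_rcs_admin(sample)
# -> {'head': '1.2', 'symbols': {'release_1': '1.1'}}
```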
For the general architecture of the import facility we have two major
paths to choose between.

One is to use an external tool which processes a CVS repository and
drives fossil through its CLI to insert the found changesets.

The other is to integrate the whole facility into the fossil binary
itself.

I dislike the second choice. It may be faster, as the implementation
can use all internal functionality of fossil to perform the import;
however, it will also bloat the binary with functionality not needed
most of the time. This becomes especially obvious if more importers
are to be written, e.g. for monotone, bazaar, mercurial, bitkeeper,
git, SVN, Arc, etc. Keeping all of this out of the core fossil binary
is IMHO more beneficial in the long term, also from a maintenance
point of view: the tools can evolve separately. That is especially
important for CVS, as it will have to deal with lots of broken
repositories, all different.

However, nothing speaks against looking for common parts in all
possible import tools, and having these in the fossil core, as a
general backend all importers may use.

One complication on that side: reconstruct gets a collection of
files, some of which may be manifests while others are data files.
If it imports them in a random order it might find that file X,
which was imported first and therefore has no delta compression, is
actually somewhere in the middle of a line of revisions and should be
delta-compressed; then it has to find the predecessor, do the
compression, etc.

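To make that ordering problem concrete, here is a toy sketch. The
class, its names, and the use of a unified diff as the "delta" are
illustrative stand-ins, not fossil's actual delta format: the first
revision seen is stored whole, later ones as deltas against a known
predecessor.

```python
import difflib

def delta(old, new):
    """Stand-in for fossil's delta encoding: just a unified diff."""
    return "".join(difflib.unified_diff(old.splitlines(True),
                                        new.splitlines(True)))

class Store:
    """Toy content store: a revision whose predecessor is unknown at
    insert time is kept whole; later revisions are stored as deltas
    against their predecessor."""
    def __init__(self):
        self.full = {}    # rev -> full text, stored uncompressed
        self.deltas = {}  # rev -> (predecessor, delta text)

    def insert(self, rev, text, predecessor=None, pred_text=None):
        if predecessor is None:
            self.full[rev] = text
        else:
            self.deltas[rev] = (predecessor, delta(pred_text, text))

store = Store()
# Revision 1.2 arrives first, so it is stored whole, even though it
# actually sits in the middle of the chain 1.1 -> 1.2 -> 1.3.
store.insert("1.2", "a\nb\nc\n")
store.insert("1.3", "a\nb\nc\nd\n", predecessor="1.2",
             pred_text="a\nb\nc\n")
# When 1.1 shows up later, the importer has to notice that 1.2 should
# now be re-encoded as a delta against 1.1 -- exactly the extra
# bookkeeping described above.
```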
So, depending on how the internal logic of delta-compression works,
reconstruct might need more logic to help the lower level achieve
good compression.

For example: in a first pass determine which files are manifests, and
read enough of them to determine their parent/child structure; then
in a second pass actually import them, in topological order, with all
relevant non-manifest files for a manifest imported at that time
too. With that the underlying engine would see the files in basically
the same order as generated by a regular series of commits.

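The ordering step in that two-pass scheme is a plain topological sort
over the parent/child structure found in pass one. A minimal sketch,
with Kahn's algorithm and a hypothetical manifest-to-parents mapping:

```python
from collections import deque

def topo_order(parents):
    """Kahn's algorithm: given manifest -> list of parent manifests,
    return the manifests oldest-first, parents before children."""
    children = {m: [] for m in parents}
    indeg = {m: 0 for m in parents}
    for m, ps in parents.items():
        for p in ps:
            children[p].append(m)
            indeg[m] += 1
    queue = deque(m for m, d in indeg.items() if d == 0)
    order = []
    while queue:
        m = queue.popleft()
        order.append(m)
        for c in children[m]:
            indeg[c] -= 1
            if indeg[c] == 0:
                queue.append(c)
    return order

# pass-one result (hypothetical): M1 is the root, M4 a merge of M2, M3
parents = {"M1": [], "M2": ["M1"], "M3": ["M1"], "M4": ["M2", "M3"]}
order = topo_order(parents)
# pass two would now import each manifest in `order`, together with
# the data files it references
```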
Problems for reconstruct: files that are referenced but not present,
and, conversely, files that are present but not referenced. This
check can be done as part of the second pass, aborting when a missing
file is encountered, and (un)marking used files so that at the end we
know the unused files. It could also be a separate pass between the
first and second.

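Both consistency problems reduce to set differences; a minimal sketch
(the function and variable names are made up for illustration):

```python
def check_files(referenced, present):
    """Cross-check the file sets from the passes described above:
    referenced-but-missing files are fatal, while present-but-
    unreferenced files are merely reported as unused at the end."""
    missing = referenced - present   # abort the import on these
    unused = present - referenced    # report these when done
    return missing, unused

# hypothetical example sets
referenced = {"f1", "f2", "f3"}
present = {"f2", "f3", "f4"}
missing, unused = check_files(referenced, present)
```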