[[title
                            Tla 2.0 Plans
]]

  /Updated: `24 Nov 2004'/

  Soon enough, `tla-1.3' will be finalized and work will begin on
  `tla-2.0'.  What is the direction planned for the `tla-2.0' series?

[[cartouche
  [[styled-lines

   /menu:/

   {Archive Format Changes}
   {Project Tree and Changeset Format Changes}
   {Hardlinkless Revlibs}
   {Annotate/Blame/File-history Support}
   {Selective Commit}
   {Librification}
   {Higher-Level Commands}
   {Smarter Caching}
   {Windows Support}
   {But How to Do It?}
   {Rewriting Arch}
  ]]
]]



* {*Archive Format Changes}

  2.0 will be a good time to change the archive format number.  In
  other words, all 2.0 clients will be able to read archives from
  earlier versions of `tla', but earlier versions won't be able to
  read 2.0 archives.


** Shallower Paths

  I agree with the cries for a (dumb filesystem) archive format with directory
  layouts like this:

[[tty

	./category
	 ./category/category--branch1--1.0
	 ./category/category--branch1--1.1
	 ./category/category--branch2--1.1
	   ./category/category--branch2--1.1/X-category--branch2--1.1
	   ./category/category--branch2--1.1/base-0
	   ./category/category--branch2--1.1/patch-1
	   ./category/category--branch2--1.1/...
	 ./category/category--branch3--1.1

        [etc.]
]]

  In other words: shorter, shallower paths.

  It would be easiest to just say that, in 2.0, archives are free not
  to bother creating empty categories, branches, and versions.
  2.0 archives might not support, for example, `tla make-branch'
  (or, for that matter, ever require users to run it).
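
  As a rough illustration of the layout above, this Python sketch maps
  a version name and a patch level to a shallow path.  The helper name
  `shallow_revision_path' and the rule that the category is everything
  before the first `--' are assumptions made for illustration; they are
  not part of any settled 2.0 format.

```python
import posixpath


def shallow_revision_path(version, patch_level):
    """Hypothetical helper: build a shallow archive path.

    Assumes the category is the text before the first "--" in the
    version name, as in the example listing.
    """
    category = version.split("--", 1)[0]
    # Two directory levels only: ./category/version/patch-level
    return posixpath.join(".", category, version, patch_level)


if __name__ == "__main__":
    print(shallow_revision_path("category--branch2--1.1", "patch-1"))
```

  Because the version directory sits directly under the category, no
  empty branch or version directories need to exist ahead of time.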

** Summary Deltas, Summary Log Contents

  I need to drill down through the archived discussions about how
  we might do this to get the details right but, in general,
  the 2.0 archive format has to include these.


** Sub-version Branches

  No, not `"svn branches"'.

  People sometimes ask whether a branch-name divides a category or
  a version.   In real life, people want both kinds of branch name,
  sometimes both in combination.

  2.0 should permit sub-version branching:

[[tty

	gcc--apple-core--8.4--power-pc

]]
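
  One way to read such a name, sketched in Python.  The field names
  and the rule that an optional fourth `--'-separated component is the
  sub-version branch are assumptions drawn from the single example
  above, not a settled naming grammar.

```python
from typing import NamedTuple, Optional


class VersionName(NamedTuple):
    category: str
    branch: str
    version: str
    sub_branch: Optional[str]  # hypothetical field for the 2.0 extension


def parse_version_name(name):
    """Split a version name on "--", allowing an optional sub-version branch."""
    parts = name.split("--")
    if len(parts) == 3:
        return VersionName(parts[0], parts[1], parts[2], None)
    if len(parts) == 4:
        return VersionName(*parts)
    raise ValueError("expected 3 or 4 '--'-separated components: %r" % name)


if __name__ == "__main__":
    print(parse_version_name("gcc--apple-core--8.4--power-pc"))
```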



** Archive Cached Configurations

  To get arch, currently, you do something like:

[[tty
	% tla get $ARCH-TOP-LEVEL-DIR-REVISION  arch
        % cd arch
        % tla buildcfg "./tla.cfg"
]]

  I think 2.0 should explore the idea of *archive cached
  configurations*.

  So, I can say:

[[tty

	% tla archive-cfg-cache $ARCH-TOP-LEVEL-DIR-REV ./tla.cfg
]]

  Later, the entire `buildcfg' tree can be created by reading
  it from the tar file now cached in the archive:
  
[[tty

	% tla get-cfg $ARCH-TOP-LEVEL-DIR-REV ./tla.cfg arch
]]


  But for the question of how they are named, archive cached
  configurations make a perfectly reasonable format and mechanism
  for source /releases/.


** Case Insensitive Category and Branch Names

  I agree that, within new-format archives, those names can (at least
  optionally) be case insensitive.


** Unicode Category and Branch Names

  (See below, about Unicode support in general.)



* {*Project Tree and Changeset Format Changes}

  2.0 will be a good time to change the project tree format number.

** Shorter Paths

  I agree with the need for shallower, shorter paths to log files.


** The Name `{arch}' 

  I agree that, optionally, `{arch}' should be renameable to `.arch'.
  I wonder, actually, if we can't make this a per-user option with
  a per-tree default?   That is, actually renaming `{arch}' to `.arch'
  or vice versa does not count as a change to the tree.   Changing 
  some file /within/ `{arch}' (or `.arch') changes the default
  name for the tree.   Users can set a persistent option to always
  name the directory the way they prefer, regardless of the default
  for the tree.

** Patch Logs are Sets not Trees

  The in-tree patch log should be "pure" -- any files that aren't
  part of the record of logs should be treated as `unrecognized'
  by inventory.   `mkpatch' and `dopatch' should treat the patch
  log specially: recording patches to individual log files and
  /set operations on the collection of logs/ rather than /tree delta 
  operations/.   

  In other words, the changeset format should be modified to treat
  logs specially.   The format of in-tree patch log directories should
  /not/ be hard-coded in changesets.
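
  To make the distinction concrete, here is a minimal Python sketch of
  a set-style delta over patch logs.  The function name `log_set_delta'
  and the revision names in the example are hypothetical; the point is
  only that the changeset would record which logs were added or removed
  and say nothing about where the tree stores them.

```python
def log_set_delta(old_logs, new_logs):
    """Compare two collections of patch-log names as sets.

    Returns (added, removed), each sorted for stable output.  No
    directory layout is involved, only membership.
    """
    old, new = set(old_logs), set(new_logs)
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    before = ["proj--main--1.0--base-0", "proj--main--1.0--patch-1"]
    after = ["proj--main--1.0--base-0", "proj--main--1.0--patch-1",
             "proj--main--1.0--patch-2"]
    print(log_set_delta(before, after))
```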


** Re-do `inventory'

  The way `tagline' tags are searched for is broken in ways that can't
  be fully fixed until the tree format is rev'ed.

  The syntax of `=tagging-method' is sub-optimal.

  The actual implementation of `inventory' is a mess.

  It needs to be easier to convert between explicit and tagline tags.



* {*Hardlinkless Revlibs}

  AFS and some Windows filesystems lack any useful support for
  hardlinks.

  Arch needs a revision-library-like feature giving the best
  approximation possible of revlib functionality.

  One low-tech approach is to implement revision library revision 
  locking and use only "sliding" revision libraries on systems without
  hardlinks.


* {*Annotate/Blame/File-history Support}

  Arch needs a fast way to show the annotated history of an individual
  file.


* {*Selective Commit}

  `commit' in arch needs to feel a lot more like `commit' in CVS.


* {*Librification}

  People want that.


* {*Higher-Level Commands}

  E.g., the stuff being prototyped in `gtla'.


* {*Smarter Caching}

  
* {*Windows Support}

  Among the changes above are:

  /shortening various paths/

  /getting some semblance of revlib support without relying on
  hardlinks/

  With those changes in place, a native port of the resulting arch
  to Windows should be trivially simple (e.g. simply linking against
  a POSIX compatibility library will do most of the job).


* {*But How to Do It?}

** Lots of Small Transformations or a Complete Rewrite?

 I estimate that completing the above list of tasks using the
 technique of making correctness-preserving transformations on the
 current code base would be equivalent, roughly, to carefully
 reviewing every line of code in the core at least 5 separate times,
 rewriting about 30% of the lines on each pass.

 (I mostly pulled the numbers out of my ass.   My envelope
  identifies a bit more than 5 tasks there, each of which
  requires a nearly complete review and each of which I'm guessing
  will impact about 30% of the code.)

 I estimate that completely rewriting arch, from scratch, getting
 to at least the functionality and reliability of the current code,
 accomplishing many of the tasks in the 2.0 list --- it's hard to say
 but it's certainly in the ballpark of making 5 passes over the
 current code, rewriting 30% each time.

 Transformations and a rewrite look, for all the certainty we can 
 guesstimate them at, about equally hard.


** Which is More Fun?

  Sometimes a program is in a state where it is fun to hack on by
  making transformations, and other times the program is not in that
  state.  Right now /Awiki/ is in that fun state, for example.  It's
  simple code, not yet too *intertwingled*.  You can add a lot of
  functionality quickly by making correctness-preserving
  transformations.

  /Arch/, in the form of `tla', is not so clearly in that pleasant
  state.  For example, the assumption that all strings are ascii
  pervades the code and teasing that apart will be hours and hours of
  assured tedium mixed with opportunity for serious, subtle error.
  Is that /really/ the approach to take for Unicode support, for
  example?  Or case-insensitive filename support?


** Which Produces Better Results?

  I think the decision is clinched (at least in my envelope-analysis
  world) by an "opportunities for error" estimate.

  Doing our long task list by correctness-preserving transformations 
  appears to be the same amount of work (as far as we'd care to guess) 
  as doing a complete rewrite of tla.

  Same amount of work -- but work that is different in nature.

  Transformations have a lot of mechanical steps and steps in which a
  programmer has to "brain shift".  For example, the programmer may
  have to skim through every function in 5 files, looking for certain
  coding idioms, and rewriting them when found.  There are many
  opportunities for error: he might skip a file; he might miss an
  instance of the idiom; he might mistake something else for an
  instance of the idiom; he might make a typo during one of the
  rewrites.

  With the transformations approach, we accumulate all those risks for
  every line of core tla code, several separate times for each line.

  The complete rewrite approach, on the other hand, tries to work with
  each resulting line of code at most once or twice (just to write and
  peer-review it in the first place).   There is less mechanical work
  and less "brain shifting" between contexts.

  Two programmers can spend an equal number of hours, one on the
  transformation approach and one on the rewrite approach.  At least
  we can say confidently that the one using the transformation
  approach has more opportunities to commit errors.
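
  The envelope arithmetic can be spelled out in a few lines of Python.
  The core size of 10,000 lines is a made-up figure, and treating the
  rewritten program as the same size as the current one is an
  assumption; only the 5 passes, the 30% per pass, and the "once or
  twice" come from the estimates above.

```python
def line_reviews(lines, passes):
    """Total line-reviews when every line is read on every pass."""
    return lines * passes


def lines_rewritten(lines, passes, percent_per_pass):
    """Total lines rewritten across all passes (integer percent)."""
    return lines * passes * percent_per_pass // 100


if __name__ == "__main__":
    core = 10_000  # hypothetical size of the core, for illustration only
    # Transformations: 5 passes, about 30% rewritten on each pass.
    print(line_reviews(core, 5), lines_rewritten(core, 5, 30))
    # Rewrite: each resulting line handled at most twice.
    print(line_reviews(core, 2))
```

  Under those assumptions the transformation approach handles each
  line five times against at most two for the rewrite, which is where
  the extra opportunities for error come from.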


* {*Rewriting Arch}

  I've done this before -- I have some experience in these
  matters. :-)

  I'll write a little plan, next.


* Copyright

 /Copyright (C) 2004 Tom Lord/

 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 2, or (at your option)
 any later version.

 This program is distributed in the hope that it will be useful,
 but /WITHOUT ANY WARRANTY/; without even the implied warranty of
 /MERCHANTABILITY/ or /FITNESS FOR A PARTICULAR PURPOSE/.  See the
 GNU General Public License for more details.

 You should have received a copy of the GNU General Public License
 along with this program; if not, write to the Free Software Foundation,
 Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

 See the file `COPYING' for further information about
 the copyright and warranty status of this work.



[[null
   ; arch-tag: Tom Lord Wed Nov 24 08:42:35 2004 (writings/tla-2.0.txt)
]]
