2011-10-09

Genericity, Recursive Data Types, and Inheritance.

A couple of times recently I've wanted to create a generic recursive data type and then inherit from it. For example suppose we wanted to define the concept of a tree as a data type, and then extend it to create trees that bear specific types of data in their nodes. A second example is a graph consisting of nodes and edges. This is a recursive data type in that, from a node, you can follow an edge and find another node. In both cases you'd expect some homogeneity; in a tree, the children of a node should be nodes of the same type. In a graph, if you start at a node and follow an edge, you should arrive at another node of the same type. This homogeneity requirement is where genericity provides a solution.

To illustrate, I'll use the tree example. We'd like to build, say trees of integers, and tree of floating point numbers. These are two types, but as they share something in common: their "treeness". We'd like them to inherit the characteristics common to all trees from a common base type.

The key to making this work is to use bounded genericity.

This abstract class captures the "treeness" of its subclasses. Any object of type Tree<T> has children of type T where T is a type that extends Tree<T>.

Now we can both specialize and concretize to create a type that represents trees of integers


Each object of type IntTree is a tree that has IntTree objects as children and a field of type int.

Equally we could have a tree of floating point variables.


That we can do this is only really interesting if we can write generic code to deal with data (trees in this case) at the generic level of abstraction. And we can. For example here is a class of depth-first tree traversers.

Then we can specialize and concretize this class to create a visitor that finds the sum of the integers in an IntTree.

Finally I'll just briefly describe the systems where these issues came up.
  •  For the hierarchical graph types of the JHigraph system we defined: first, a mutually recursive set of generic interfaces; next, a mutually recursive set of abstract classes; and, finally, a mutually recursive set of abstract classes that use a tag system similar to XML and ensure validity. These three sets of types define the model at various levels of abstraction. Based on the interfaces, we then defined a set of generic view classes and  layout managers for visualizing the nodes and edges of the graph. An application can be built by specializing and concretizing either set of generic, abstract node and edge classes to hold the appropriate kind of data and by specializing and concretizing the view classes.
  • The second application is a language processor in which the parser produces an abstract syntax tree and then the type checker produces a typed abstract syntax tree. The plain and the typed abstract syntax tree types share a common generic superclass.

2011-04-13

How to do undo and redo

Introduction

Almost all desktop software has some form of undo functionality. In earlier days there might have only been the ability to undo one change --- unix's vi/ex editor had this, at least when I used it in the 1980s and 1990s. You still see this in some photo editing software. Now it is common to provide either unlimited undo or at least many levels of undo. Of course where there is undo there is usually redo.
Usually undo and redo operate in a last-in-first-out manner: Undo means undo the most recent action not yet undone, while redo means redo the most recently undone action, not yet redone. Whenever a new action is done for the first time, the ability to redo any previously undone actions disappears. This is called "linear undo/redo". Nonlinear undo allows the user to pick certain past actions to undo. Nonlinear undo is useful for documents that are being edited by more than one person; a user might want to undo their last action without (if possible) affecting changes made by other users. Nonlinear undo might also allow a user to go back to an earlier point and insert a new action and then redo all the actions just undone. Nonlinear undo/redo can have problems with coherence; what does it mean for Alice to undo an insertion that Bob has just altered? For this article, I'll assume we are dealing with linear undo/redo.
To the software engineer, an obvious question is "How do they do that?" And when we have to write something, the question is "How can I do that?"
This essay discusses a few ways to do undo and, in some cases, redo.
We can assess these various approaches according to efficiency. Another important consideration is layering. If we have an existing data structure implementation, then we'd probably want to layer undo-redo functionality on top of it. On the other hand, in creating a new implementation, it can be convenient to layer it on top of a layer that supports undo and redo; this way the clients of the data type are hardly influenced by the need to support undo and redo. 

Running Example

As a running example, I'll consider that the structure that we want to do undo and redo on is some sort of tree, such as you might use to represent an HTML document, only much simpler. There is one kind of node. Each node is associated with a string and zero or more children.
The interface to the data structure is as follows
interface TreeI {
    NodeI makeNode( String text ) ;


    NodeI getRoot() ;

    void insert( NodeI parent, int posn, NodeI child ) ;

    void remove( NodeI parent, int posn ) ;
}

Objects adhering to this interface are the model for our system. I'll assume there is also some sort of GUI built on top of the model, perhaps following some sort of model-view-controller pattern. If we are using the observer pattern between the model and the GUI, then undoing changes to the model should cause the GUI to automatically revert the view back to where it was, or at least to a state that shows the correct model state. There may be some changes to the view that are not reverted. For example, if the view has scrolling, the scroll position might not change back to exactly where it was in response to an undo action.
(Note I'm using Java notation, but the ideas here should adapt to any imperative object oriented language. I haven't run the Java code presented here through a compiler, so there will no doubt be some minor errors. I also haven't worried too much about accessibility levels: public, private, protected, default.)

Checkpointing

Checkpointing is perhaps the most obvious way to deal with undo and redo. It's likely what nonprogrammers think is going on, if they think about it at all. The idea is just to copy the whole state of the model before responding to user events that might change the model.
We'll layer checkpointing on top of an underlying model like this.
public class TreeWithCheckpointing implements TreeI {
    Tree tree = new Tree()  ;
   
    Stack<Tree> olderTrees() = new Stack<Tree>() ;

    Stack<Tree> newerTrees() = new Stack<Tree>() ;

    public void checkpoint() {
        olderTrees.push( tree.clone() ) ;
        newerTrees.clear() ; }
   
    public void undo() {
        if( olderTrees.size() > 0 ) {
            newerTrees.push( tree ) ;
            tree = olderTrees.pop() ;
            notifyAllObservers() ; } }
   
    public void redo() {
        if( newerTrees.size() > 0 ) {
            olderTrees.push( tree ) ;
            tree = newerTrees.pop() ;
            notifyAllObservers() ; } }
   
    public void insert( NodeI parent, int posn, NodeI child ) {
        tree.insert( parent, posn, child ) ;
        notifyAllObservers() ; }
    .
    .
    .
}

The GUI uses this class as follows: At the start of doing anything that might affect the model, the GUI calls "checkpoint". Then the command is implemented just as you might expect. To implement "undo" and "redo", the corresponding methods are called. It is important that the clone method (or whatever mechanism is doing the duplicating —in C++ it might be a copy constructor) does a "deep copy", that is all the objects that make up the model must be copied. (In a garbage collected language, it is ok to stop at objects that are immutable.)
In summary, the GUI behaves as follows
  • In response to each user input event (e.g., key press or mouse action) that might change the model:
    Send a checkpoint message to the model
  • Modify the model
  • In response to a user input event that requests "undo":
    Send an "undo" message to the model
  • In response to a user input event that requests "redo":
Send a "redo" message to the model
I've assumed here that the observer pattern is being used (see the calls to "notifyAllObservers"). This is not necessary; what is necessary is just that at the end of each user action, the view resyncs with the model.

Merits

  • This approach is very simple. It demands of the model only that it be copyable. It demands of the view and controller only that they access to model through a particular interface.

Demerits

  • The main disadvantage is the space required to keep all the old versions. It also requires time to make the copies. For a large model, these disadvantages may prove to serious.

Command Pattern

Probably the best known approach is the use of the Command Pattern described in Gamma et al.s book Design Patterns: Elements of Reusable Object-Oriented Software. In fact I've seen discussions of the Command Pattern where the writer just assumed that undo/redo was the only application of the pattern, which it certainly isn't.
The idea is simple and appealing. In response to input user input, the system creates a "change" object that can do the required changes and that can then undo them. We might have a change generator that translates user actions (e.g. keypresses and mouse gestures) into changes.
class ChangeGenerator {

        ChangeI makeChangeForKeyPress( Key key ) { ... }
        ...
}

However changes are generated, they all obey a simple interface consisting of methods execute, undo, and redo. (I would have called "execute" "do", but "do" is a keyword in Java.)
interface ChangeI {
    enum State{ READY, DONE, UNDONE, STUCK } ;

    State getState() ;

    void execute() ;

    void undo() ;

    void redo() ;
}

Each change object starts in the READY state. A call to "execute" moves the change from READY to DONE. While changing the state it records whatever information is needed to move the state of the model back to its original state. A call to "undo" moves the state back from DONE to UNDONE, while setting the state of the model back to where it was. Of courses there is an assumption that when "undo" is invoked, the model is in exactly the same state that it was in at the end of "execute". Finally the "redo" method moves from the UNDONE state to the DONE state, changing the model as it goes.
Often the "redo" and "execute" methods are exactly the same, but there are reasons to sometimes make them different. For example, the "execute" method may prompt for some input from the user. We probably wouldn't want to re-prompt the user again.
One thing that should not be overlooked is the possibility of errors. For this design, I've created a special state STUCK, which the change gets into if an error occurs as part of "execute", "undo", or "redo". It's important that when a change gets stuck, the model is left in a consistent state. Once a change reaches the STUCK state, none of execute, undo, or redo is allowed.
We can capture the model independent aspects of a change with an "abstract change" as follows
abstract class AbstractChange implements ChangeI {
    State state = State.READY ;

    public State getState() { return state ; }

    public void execute() {
        assert state == State.READY ;
        try { doHook() ; state = State.DONE ; }
        catch( Failure e ) { state = State.STUCK ; }
        catch( Throwable e ) { assert false ; }
    }

    public void undo() {
        assert state == State.DONE ; }
        try { undoHook() ; state = State.UNDONE ; }
        catch( Failure e ) { state = State.STUCK ; }
        catch( Throwable e ) { assert false ; }
    }

    public void redo() {
        assert state == State.UNDONE ;
        try { redoHook() ; state = State.DONE ; }
        catch( Failure e ) { state = State.STUCK ; }
        catch( Throwable e ) { assert false ; }
    }

    protected abstract void doHook() throws Failure ;

    protected abstract void undoHook() throws Failure ;

    protected void redoHook() throws Failure { doHook() ;} ;
}

We can fill in the three hook methods for specific change objects. For example, a change to remove a node might look as follows.
class RemoveChange extends AbstractChange {
   
    TreeI tree ;
    NodeI parent ;
    int posn ;
    NodeI child ;

    RemoveChange( TreeI tree, NodeI parent, int posn ) {
        this.tree = tree ;
        this.parent = parent ;
        this.posn = posn ; }

    protected void doHook() {
        child = tree.getChild( parent, posn ) ;
        redoHook() ; }
   
    protected void undoHook() {
        tree.insert( parent, posn, child ) ; }

    protected void redoHook() {
        tree.remove( parent, posn ) ; }
}

Once we have built a change, it can be passed off to an undo manager. The manager, "executes" the change and then stores it on an undo stack. When a change is undone, it goes onto a redo stack. When it is redone, it goes onto a redo stack. The manager looks like this.
class ChangeManager {
   
    Stack<ChangeI> undoStack = new Stack<ChangeI>() ;
    Stack<ChangeI> redoStack = new Stack<ChangeI>() ;
   
    public void execute( ChangeI change ) {
        change.execute() ;
        if( change.getState == ChangeI.States.DONE ) {
            undoStack.push( change ) ;
            redoStack.clear() ; }
        else { // Presumably STUCK
            undoStack.clear() ;
            redoStack.clear() ; }
    }

    public void undo() {
        if( undoStack.size() > 0 ) {
            ChangeI change = undoStack.pop() ;
            change.undo() ;
            if( change.getState() == ChangeI.States.UNDONE ) {
                redoStack.push( change ) ; }
            else { // Presumably stuck
                undoStack.clear() ;
                redoStack.clear() ; } }
    }

    public void redo() {
        if( redoStack.size() > 0 ) {
            Change change = redoStack.pop() ;
            change.undo() ;
            if( change.getState() == DONE ) {
                undoStack.push( change ) ; }
            else { // Presumably STUCK
                undoStack.clear() ;
                redoStack.clear() ; }
        }
    }
}

In summary, the GUI behaves as follows
  • In response to each user input event (i.e. key press or mouse action) that might change the model:
    Build a change.
  • Send the change to the ChangeManager for execution.
  • In response to a user input event that requests "undo":
    Send an "undo" message to the ChangeManager
  • In response to a user input event that requests "redo":
Send a"redo" message to the ChangeManager

Notes

Each change needs to be deterministic, i.e., produce the same changes to the model, each time it is redone, as it did when first executed. An example of a nondeterministic change is "paste". If the contents of the system clip-board have changed, then "paste" has a different result. Another example is a change that initiates some sort of dialog with the user, for example asking the user to confirm that they really want to do something. First you wouldn't want to pester the user with the same question again during the replay; secondly, even if you did, if the user gives a different answer the second time, then the next change to be redone with the model in a different state than expected. The same goes for any information that comes from an unpredictable source: asking the operating system for the current time, reading a disk file, making an http request. One way to make a nondeterministic change deterministic is to record all input that does not come from the model in the change object somehow. E.g. within "doHook" for the paste change we might have:
    ifthis.toInsert == null ) {
        this.toInsert = clipboard.contents() ; }
    document.insert( this.toInsert ) ;

Here is a case of nondeterminism I learned about the hard way. If the "execute" method creates objects, the "redo" method should not create objects; at least not objects that future changes might refer to. Rather, "redo" should reuse objects that were created in the earlier call to "execute". (And it follows that the "undo" method should not delete objects that the "execute" method created.) To explain why, here is an example. Suppose we execute a change u that makes a new node and inserts it into the tree; call this node x. Next we change the text in node x; the change v that does this presumably keeps a pointer to x, so that it can be undone and redone. Now suppose we undo both changes and then redo both. If redoing the u creates a new node x', then the redo of the second change will change x, which is no longer in the tree, rather than x'. The correct way to do it is that the undo of u should remove x from the tree, but keep a pointer it; then the redo of u will simply put x back in the tree. As mentioned, this example is really just another case of nondeterminism. The execute and redo methods must add exactly the same object to the tree. It is not good enough for "redo" to add an object x' that is equal to, but not the same as, the object x that "execute" had put in the tree.
So far I've assumed that each user interaction results in one change. A better system is to allow several changes bind together to make one compound change.

Merits

  • One of the big merits of using changes is that it separates the concerns of undo/redo from the model. There is nothing very special about the model we use, only the way we use it. This is handy if you are using someone else's model, but building your own GUI above it. For example, in one editor I built, I was using Mozilla's implementation of the W3C's DOM interface. I went with the command pattern.
  • The approach also makes it easy to incorporate changes to multiple models. For example we might have a selection model built above our data model. Changes to the selection are easily dealt with in their own set of changes.

Demerits

  • The big disadvantage is that the entire change needs to be worked out in advance before any of it can be executed. For many sorts of editors, this is fine, we can work out a simple and reversible sequence of modifications to an underlying data structure. For more complex systems, it might be hard to preplan all the changes, there might be choices to be made part way through. This can always be handled, but every choice made has to be recorded so it can be undone. Essentially one writes the code for the change twice, once going forward and once going backward. The consistency of these two bits of code requires careful coding and careful testing.
    • I ran into exactly this problem in the editor I mentioned above. What I did was to implement big changes via a series of fairly simple change objects. As it executes them, the undo manager joins these change objects together to make a compound change object. The only change to the interface needed is a call to the undo manager to say when a new interaction with the user has started.
  • This approach does affect the way you write the controller part of the GUI. Every thing that might affect the model has to be accomplished by building a change and sending that change to the change manager.

Log and Replay

The first piece of software I remember using that supported undo was vi. As mentioned above, the early versions of vi supported only one level of undo: You'd press "u" to undo the last change and then pressing the "u" again would undo the undo, i.e. redo. My first encounter with unlimited undo happened in 1985. I was hired to do some programming on a computer running the VMS operating system. The editor was called EDT; it was a text based "visual" editor, similar to Emacs or vi. Output was on a 25 by 80 character display --- a VT-100, if I remember rightly. Input was a sequence of keystrokes. Crucially there was no mouse input. (Indeed, there was no mouse -- this was 1985.) Anyway this editor had a peculiar, but very useful feature: Every keystroke entered was immediately saved to a log file. The main use of this feature was that, if the editor crashed before you had saved the file, you could start the editor in a special mode where it read its input from the log file rather than the keyboard ---at least until the log file ran out, after which it would take keyboard input again. This let you recover from the occasional crash. It also had the side effect of providing unlimited undo. If you wanted to undo the last 100 keystrokes, you'd simply quit without saving, edit the log file, delete the last 100 characters, save the log file, and then replay the log file on the original file. This technique also held some entertainment value: as the changes replayed, the changes to the document were displayed on the screen at high speed.
The application of this idea to undo/redo is straightforward. As with the command pattern approach above, we use change objects, but this time the only methods needed are "execute" and "redo", there is no "undo" method for changes.
As with the command pattern approach, we require that each change be deterministic. Normally the "redo" method will just call "execute", but there are special cases where the two need to be different. These include
  • User interaction. The "execute" method may pop-up dialogs to inform the user of things or to obtain more input from them. The "redo" method should not.
  • Obtaining information from outside the model. The execute method records any information it obtains from the user, the operating system, or other agents. The redo method should use this recorded information.
  • Creating objects that future changes might reference. The execute method sets a pointer to the object. The redo method uses the recorded object.
We can accomplish these differences in behaviour between execute and redo, either by writing them separately, or by writing "execute" so that, the first time it runs, it records any information it obtains and then, in subsequent executions, it uses the recorded information and skips all user interactions. (See the "paste" example, above in the command pattern approach.)
I'll ignore any possibility of changes failing, partly to keep things simpler, but also because a failure to execute is not really a problem, as long as it is repeatable. With the command pattern approach, a change that doesn't finish is problematic because it can't be undone. With the Log and Replay approach, we can undo even a change that stops part way through, so long as the model is left in a consistent state.
public abstract class LogAndReplayManager {
   
    Stack<ChangeI> undoStack = new Stack<ChangeI>() ;
    Stack<ChangeI> redoStack = new Stack<ChangeI>() ;
   
    public void execute( ChangeI change ) {
        change.execute() ;
        undoStack.push( change ) ;
        redoStack.clear( ) ;
    }

    public void undo() {
        if( undoStack.size() > 0 ) {
            ChangeI change = undoStack.pop() ;
            redoStack.push( change ) ;
            reset() ;
            for( ChangeI c : undoStack ) c.redo() ; }
    }

    public void redo() {
        if( redoStack.size() > 0 ) {
            ChangeI change = redoStack.pop() ;
            change.redo() ;
            undoStack.push( change ) ; }
    }

    // Reset the model
    protected abstract void reset() ;
}

We need some way to reset the model back to its original state. I'll leave "reset" as abstract so that the LogAndReplayManager class can be reused for various models.

Merits

  • This approach is very space efficient. All that needs to be saved is the changes, which mostly consist of user input and pointers to newly allocated objects.
  • It is very simple to implement.
  • As with the command pattern approach it demands nothing of the model. Unlike the change approach it does not require the execution of the change to be simple. Thus the change objects can correspond very closely to user interface actions. There is no need to carefully plan out the construction of changes that are easy to undo.

Demerits

  • Clearly this approach could have time problems: undos can be expected to take quite a bit longer than they would with say the command pattern approach. If the model can be copied, we can combine Log and Replay with Checkpointing. For example if we take a checkpoint every 50 changes, then the combined approach is almost 50 times more space efficient than checkpointing alone, but an undo takes no more time than executing one copy and 49 changes.
  • The other down side is determinism. Changes that obtain information from the user, or some other unpredictable source, need to somehow record the user's responses to the dialog somewhere accessible to the change.
  • It does not fit well with the Observer pattern.

Note

This idea came to me from Guido Rößling who used something like it in his Animal algorithm animation system. He calls it "reverse by fast forward".

Timestamping

In building the Teaching Machine, one thing I wanted was undo. The command pattern approach wasn't very appealing as each change could involve a great deal branching and looping. The command pattern approach requires that you write everything twice, once going forward and once going backwards. Undoing an if-then-else requires recording which direction was taken. Loops are worse. Recursion is nasty. The Teaching Machine is a program animation system, which means that it has within it an interpreter for a programming language (either Java or C++ in the TM). A single user action might execute several lines of C++ or Java code. Checkpointing was not appealing, because of the large and complex model state. Had I thought of it, the time required by the Log and Replay method would have worried me. The Transactional Memory approach below would have been suitable, but I didn't think of it at the time.
What I came up with is based on maintaining a global clock which advances by one tick with each user interaction. The following description addresses only undo, not redo, which was not a requirement for the Teaching Machine.
The model is made up of objects that have fields that never change; for mutable fields we use special Var objects. Each Var object has "get" and "set" methods that let us get or set a value.
class Var<T> {
   
    T get() {...}

    void set( T val ) { ... }

    ...
}

Each Var maintains a stack of all the values ever assigned to it and, along with each value, the times at which that value was valid. For example we might have a Var that is null until time 4, when it is set to 99, then it is set to, 100 at time 10. If the last time the variable was accessed was time 12, it might be represented by a stack of records as follows
Value First valid time Last valid time
100 10 12
99 4 9
We only update this stack and the records in it when the variable is accessed (read or written).
Undo is implemented simply by subtracting 1 from the global time variable. In our example, suppose time roles back to say time 7; then the get method of the Var is called. It should notice that the top record is no longer of use. That record is discarded and the next record is adjusted to reflect that there is no time 8 or 9. The new stack is
Value First valid time Last valid time
99 4 7
The alert reader will have noticed a flaw in this scheme. Suppose that we didn't access the Var at time 7, the stack remains as
Value First valid time Last valid time
100 10 12
99 4 9
Now if 5 more user actions happen with no accesses to this Var, it will be time 12 again, only in this time line the assignment of 100 to our variable never happened. If an access happens now we get a result of 100 instead of the correct 99.
The solution is simple. We don't use integers for times, but rather objects that point back to the previous time. (The first time points to null). Thus we can have a branching tree of times. It turns out to be useful to keep the integer as well. Thus our times are represented as objects of this class.
class Time {
   
    private final Time previous ;
   
    private final int count ;

    public Time() { previous = null ; count = 0 ; }

    private Time( Time previous ) {
        this.previous = previous ; count = previous.count + 1 ; }

    public Time makeNextTime() { return new Time( this ) ; }
   
    ...
}

Keeping track of the current time is a TimeManager.
public class TimeManager {
   
    private Time currentTime = new Time() ;

    public Time getCurrentTime() { return currentTime ; }

    public void checkpoint() {
        currentTime = currentTime.makeNextTime() ; }

    public void undo() {
        currentTIme = currentTime.getPrevious() ; }
}

The current time together with its previous time, and the time before that, and so on, constitute a "time line" -- the current time line. When a Var is accessed (whether via "get" or "set"), the variable is first put onto the current time line. I.e. we discard records from its stack, and/or adjust final times, until all times mentioned by its stack are on the current time line. Then we proceed with the get or set.
What I like about this scheme is that you only pay for what you use; this in the sense that undo costs essentially nothing at first. It is only as variables are accessed that there is a cost to be paid.
So far, this scheme does not allow for redo. Redo can be accommodated at some extra complication. Each Var needs to be equipped with two stacks. One for "past" values" and one for "future" values. I put "past" and "future" in quotes, as these are relative to the last time the variable was accessed, not to the current time. Upon access, these stacks are first adjusted so they make sense relative to the current time.
Building the whole model just out of Var objects can be a bit too low level. For the Teaching Machine, I created a suite of undoable data structures that the client could use to build their model: arrays, vectors, maps. This also allowed different implementation approaches to be used. For example needed a big array of bytes to represent memory. I could have used a big array of Var<Byte> objects, but though this would use up too much memory -- several words to represent each byte. So, while Vars were implemented much as shown above, arrays used a method more like the Transactional Memory approach discussed below.

Merits

  • No change objects are needed. From the GUI's point of view, it needs to call "checkpoint" and "undo" on the TimeManager at appropriate times, but is otherwise not complicated by the needs of undo and redo.
  • The undo is incremental. Only the parts of the state that are actually accessed need to be undone.
  • In contrast to checkpointing, the space overheads are modest. Only data that has changed takes extra space.

Demerits

  • In contrast to all the schemes above, there is a major effect on how you code the model. You must be sure to use Var objects throughout. All these and the Var class form a layer below the model, on which the model is built. It would be impossible to use some third party model.
  • Bugs occur when someone forgets to use a Var instead of a regular field.
  • The system is admittedly a bit complicated. Once working, it simply works and you can forget about how it works, but until then it is tricky coding.
  • Only unlimited undo is supported. There isn't a good way to support limited undo (e.g. only the last 100 changes can be undone). For long running, memory intensive applications, this could be an issue.
  • Since each Var requires a link to the time manager, we end up passing the time manager to every routine that might need to build a Var. One could use the singleton pattern for the time manager, but the singleton pattern can be troublesome. For example, consider an editor with a "multiple document interface"; we want two models with independent current time lines, thus two time managers.

Transactional Memory

As we've seen, the drawback of the command pattern approach is having to write code for each change in two or three places. We need to write "execute", "undo", and possibly "redo". This usually constrains what we do in "execute" to be fairly simple. The more decisions that are made during "execute", the more complex will be the recorded state and the undo operation. Timestamping is one way around this. Another is a scheme inspired by "software transactional memory", which is a method of dealing with concurrent updates to data structures.
This Transactional Memory scheme, like the Timestamping scheme, assumes that the GUI periodically calls a checkpoint method, typically at the end of each user interaction that might change the data structures. A period of time between checkpoints (or before the first) is called an epoch. (Actually this is a bit of a lie, but it's a good first approximation. I'll define "epoch" precisely later on.)
The idea of transactional memory is to make, for each epoch, a record of all variables changed during the epoch, together with their values as of the start of the epoch. Such a record is called a transaction. For this to work, the data structure must be built using special "transactional variables", for which I will use objects of class TVar. Each TVar holds a current value. Similar to the Timestamping approach, each field that makes up the model will either be a constant or will be a TVar.
class TVar<T> {
    private TransactionManager manager ;

    private T currentValue ;

    public T get() {return currentValue;}

    public void set( T val ) { manager.notifyOfSet(this) ; currentValue = val ; }

    ...
}

Each TVar knows a transaction manager, which holds two stacks of transactions. As you've probably guessed, these are a stack of past transactions (the undo stack) and a stack of future transactions (a redo stack). Each transaction is, abstractly, a partial function that maps TVars to values. Transactions on the undo stack map TVars changed during an epoch to the values of those TVars at the start of that epoch. Conversely, transactions on the redo stack map TVars to their values at the end of an epoch.
Applying a transaction t means swapping the values held by the transaction with the current values in the TVars
for v in domain(t) do (t(v), v.currentValue) := (v.currentValue, t(v))
Applying the top transaction on the undo stack will change the values of the TVars back to the way they were before the transaction started; at the same time the transaction becomes suitable for placement on the redo stack. Conversely, applying the top transaction of the redo stack will change the values of the TVars to the way they were at the end of an epoch; at the same time the transaction becomes suitable to be placed back on the undo stack.
The transaction manager has two states, DOING and NOTDOING. An epoch is defined to be the time from a transition from NOTDOING to DOING until a transition from DOING to NOTDOING. When the transaction manager is in the DOING state, it additionally has a current transaction, which is a transaction that is being built to represent the current epoch. The current transaction records the original values of all TVars that have changed during since the start of the current epoch, i.e. since the last time the transaction manager was in the NOTDOING state. When the transaction manager is in the DOING state, its redo stack is empty.
The interface of the transaction manager is similar to that of the TimeManager in the Timestamping approach. The transaction manager works as follows.
  • Initially: both stacks are empty; the transaction manager is in the NOTDOING state, and thus there is no current transaction.
  • On receiving a notifyOfSet message from a TVar v:
    • First, if in the NOTDOING state, the transaction manager creates a new transaction t, with an empty domain, and makes t the current transaction, then it clears the redo stack and switches to the DOING state. This marks the start of an epoch.
    • Now the manager is in the DOING state.
    • Second, if v is not in the domain of the current transaction, v is added to the current transaction t and is mapped, in that transaction, to its current value
      t(v) := v.currentValue
      This records the value of v as of the start of the current epoch.
  • When a checkPoint message is received
    • If the manager is in the DOING state, the current transaction is pushed onto the undo stack. The manager switches to the NOTDOING state. This ends the epoch and the building of that transaction.
    • If the manager is in the NOTDOING state, there is nothing to be done.
  • When an undo message is received
    • In case we are in an epoch, i.e., the manager is in the DOING state, the current epoch is ended exactly as if a checkPoint message was received. I.e., the current transaction is pushed onto the undo stack. The manager switches to the NOTDOING state.
    • Now the manager is in the NOTDOING state.
    • If the undo stack is now empty, there is nothing to undo.
    • Otherwise, the top transaction t of the undo stack is applied and then moved to the redo stack.
  • When a redo message is received
    • If the redo stack is empty, there is nothing to do.
    • If the redo stack is not empty, the transaction manager must be in the NOTDOING state. The transaction manager applies the top transaction of the redo stack and then moves it to the undo stack. It stays in the NOTDOING state.

Discussion

Transactional memory can be thought of as a optimized form of checkpointing. Rather than saving the whole model at each checkpoint, we save only the parts changed between checkpoints.
Like the Timestamping approach, it requires that we use special variables to build the model. For the GUI coder, it is as easy to use as the Timestamping approach. Indeed to its clients an implementation of transactional memory and an implementation of Timestamping have essentially the same interface.
The space efficiency is probably about the same as for Timestamping. In both cases we pay only for changes.
For time efficiency, the crucial operations are getting values from the model and setting values. Getting values is very quick. Each TVar holds its current value. This is in contrast to the Timestamping approach where each Var must ensure it is on the current timeline at the start of each get. The main cost of a set operation is checking whether the TVar is already a part of the the current transaction. This can be optimized by caching, in each TVar, a pointer to the last transaction it was recorded in.
public class TVar<T> {
    ...
    private Transaction lastTrans ;
    ...
    public void set( T val ) {
        if( lastTrans != manager.currentTrans || !manager.isDOING ) {
            manager.notifyOfSet(this) ;
            lastTrans = manager.currentTrans ; }
        currentValue = val ; }

    ...
}

Merits

  • Space efficiency is probably acceptable. There is an overhead of about 4 times.
  • Time efficiency is probably better than Timestamping.
  • It is easy to write the GUI code. The only requirement is to call checkpoint at the end of each user interaction cycle.
  • As with Timestamping, we can create optimized data structures, such as arrays, that use more space efficient techniques.
  • Unlike Timestamping, it is clear how to support redo.
  • Unlike Timestamping, it is easy to implement limited undo.

Demerits

  • You have to layer the model above the transactional memory system. You can't just use an existing model.
  • Bugs occur when someone uses a regular field instead of a TVar.
  • Unlike TimeStamping, undo is not incremental.
  • Since each TVar requires a link to the transaction manager, we end up passing the transaction manager to every routine that might need to build a TVar. Alternatively, one could use the singleton pattern for the transaction manager, but the singleton pattern can be troublesome. For example, consider an editor with a "multiple document interface"; we want two models with independent undo and redo stacks and thus two transaction managers.

Note

Transactional memory is well known as an approach to interference control in concurrent programming. I haven't seen the idea applied to the undo/redo problem before.

Summary

Dependence

As always in software design the separation of concerns is crucial. In designing a complete interactive system that supports undo/redo we need to consider (at least) four major components and how they are coupled.
  • The model. This is the data structure.
  • The view. Displays the data structure.
  • The controller. Interprets user input and causes the data structure to change. For the command pattern approaches, this includes the concrete change classes and the code that constructs the change objects.
  • The undo/redo infrastructure.
In all the approaches we've looked at the undo/redo infrastructure does not depend on the view or controller; we can reuse the undo/redo infrastructure with other views and controllers. What about the model?
  • We can layer the undo/redo on top of the model, as I did in my example of the checkpointing approach, where the TreeWithCheckpointing implements the same interface as the underlying model. If we have an existing model/view/controller system, the checkpointed model is a drop in replacement for the model. The only place where the view and controller depend on on the model being checkpointable is in implementing undo and redo buttons and calling checkpoint at the start of each user interaction that leads to a change. The downside is that the undo/redo infrastructure is tied to one model interface and is not reusable with another. Since the undo/redo infrastructure is fairly trivial, this is not a big price.
  • In the command pattern approach (and log and replay) the change manager, together with the change interface and the abstract change class, is reusable, but the individual changes are not. Furthermore the controller is strongly coupled to the undo system, as it must construct the change objects. On the other hand, the view and the model are independent of the undo/redo system.
  • The last two approaches --I'll call them "magic memory approaches" make the model strongly coupled to the undo/redo infrastructure, as the model must be built out of special variables. This rules out reusing an existing model short of completely rewriting it. However the view and controller are then essentially written as if there was no undo/redo; we just have to add enough calls to checkpoint and the undo and redo buttons.
In summary: If you want to reuse an existing model and view, the command based approach is probably best. On the other hand, if you want to write the controller without packaging each change into a change object, then the magic memory based approaches are simplest, as long as you are writing new model code or don't mind rewriting existing model code. The checkpointing method seems to combine the best of both worlds, but is obviously costly.
I've implemented two complex systems supporting undo
  • In the SIMPPLE project, I'm building a WYSIWYG XML editor. The most complex part is the view, as it needs to support CSS, which is a fairly complex standard. The best software is the software you don't have to write, so I decided to use the Gecko layout engine as a view component. This meant using Mozilla's DOM implementation for a model. Since I couldn't have the model depend on the undo/redo system, a command pattern based solution seemed the best.
  • In the Teaching Machine, it did not seem that commands could be packaged to be easily undone. Checkpointing would be too space consuming. I could have tried a log and replay approach but didn't think of it. I opted for Timestamping. While I think that Timestamping can be extended to support redo as well as undo, I'll probably switch to Transactional Memory when I want to add redo; this is for the sake of simplicity and efficiency.

Efficiency

The checkpointing system is conceptually simple, but costs a lot in space and time.
The magic memory approaches store fine grained changes and so you only pay for the modifications made. This saves space and time.
The command based approach is potentially very memory efficient.