2018-04-29

Direct manipulation of abstract syntax trees part 0. Infix and prefix tree entry

Direct manipulation of abstract syntax trees part 0. Infix and prefix tree entry


This is the first of a two part post. Part 1 is here.


Mental friction

Programmers are constantly going back and forth between multiple levels of abstraction. Of course this is true in terms of data structures and libraries, but here I'm just thinking about the text of the program itself. There is a basic, read, think, change cycle.  We start with reading. At the lowest level we have an array of coloured dots on a screen. The programmers brain can quickly turn this into a two dimensional array of characters. Next we group the characters into tokens. Then we mentally parse the arrangement of tokens to form a tree or --screen size being limited-- a portion of a tree. At this point we have read the code. Then we have to think about the meaning of the tree, decide what change needs to be made to the tree in order to perform the next task, be it fixing a bug or adding functionality or whatever. Once we know how the tree should be, we need to think about what the arrangement of tokens should be, then the arrangement of characters, finally we need to find a sequence of mouse motions, mouse clicks, and key presses that will change the current arrangement of characters to the one we want. Sometimes, I then go back and read the text again to make sure that the change I made matches the change I intended. Essentially we are constantly doing much of the work of a compiler -- lexing and parsing -- and then unparsing and unlexing.  Not every experienced programmer is going to think of it this way, but this is my way of thinking about it. And of course people don't lex and parse the same way as compilers, for example we use heuristics based on indentation, which is something compilers generally ignore.
All this lexing, parsing, unparsing, and unlexing creates friction in the programming process. One of the goals of the PLAAY visual programming language is to eliminate much of this mental busy work by presenting the abstract syntax tree in a more direct form and to get away from the distractions of lexing and parsing.
However eliminating friction on the reading and comprehension end is not helpful if we create the same or more friction on the production and manipulation end. Furthermore, for experienced programmers, there is the additional hurdle that they are used to producing and editing code using text and some text editor or IDE be it Vim, Sublime, Eclipse, ... .
So we need a low friction way to enter and manipulate abstract syntax trees.
The rest of this post shows that, using the PLAAY editor, entering trees is easy and flexible.
It's easy in that it takes about as many keystrokes as entering the equivalent expression in a text-based language.
It's flexible in that it allows trees to be created either top-down (prefix entry) or left-to-right (infix entry).
A later post will look at how the required transformations on trees are implemented.

Entering a simple expression in PLAAY

I had two pleasant surprises when I developed the PLAAY editor.
  • First, I was surprised that we could accommodate entering expressions in either prefix or infix order without having to make any concessions.  That is we could both methods easy without having to make either less easy than we would have if making both easy wasn't a goal.
  • Second, the total amount of effort on the part of the user is about the same as for a textual language. Of course "effort" is subjective, but, if we measure effort by counting keystrokes, this claim can be proved objectively.
The video below illustrates entering expressions using either prefix or infix order.




Prefix entry

I'll start with entering expressions in using prefix order, i.e. entering operations before their operands.
Let's start with a simple expression

We can enter it with the following keystrokes
  • '+' creates an addition operation with two placeholders as operands, giving
  • '*' replaces the selected place holder with an multiplication operation, giving
  • 'a' creates an identifier and displays a text field, giving
  • tab closes the text field and selects the next place holder, giving
  • 'x' creates an identifier and displays a text field, giving
  • A tab closes the text field and selects the next place an expression can be added, giving
  • (Note that the multiplication operator can have more than 2 operands. That's why an area left of the second operand is selected. This gives us a chance to enter a third operand.) Another tab selects the next place holder, giving
  • '*', 'b', tab, 'y', tab give
That's 12 keystrokes versus 7 to type a*x+b*y.

Infix entry

We can also enter the same expression in using more traditional infix notation. We have:
  • 'a' creates an identifier and displays a text field.
  • An enter closes the text field and selects the identifier.
  • '*' engulfs the identifier with a multiplication operation; it then selects a placeholder.
  • 'x' creates an identifier and displays a text field.
  • An enter replaces the place holder with an identifier and displays a text field.
  • A space selects the parent of the current selection.
  • '+' engulfs the identifier with an addition operation.
  • 'b', enter, '*', 'y', tab give
This took 12 keystrokes, the same as prefix entry.
One reason that it takes more keystrokes in PLAAY than in Java, C++ etc. is that in PLAAY identifiers can include characters like *,+ and spaces. If we imagine modifying the language so that this is not the case, we could arrange that typing an operator into a text field is equivalent to typing enter followed by the operator or space.  That accounts for 4 characters. A second reason is that PLAAY has no operator precedence. This means that a space is needed to expand the selection. It should be noted that the expression corresponding to (a+x)*(b+y) also takes 12 keystrokes to enter in PLAAY: no increase; whereas in C there are 4 additional keystrokes.

No keyboard

If a keyboard is not available (e.g., on an iPad or similar tablet) or the user prefers not to use it, there are alternative user actions that can construct the trees.

A larger example

Prefix entry a larger example

Although we've seen prefix and infix entry of trees before, it might be good to look at another example and think about the edit applied in response to each keystroke.
Suppose we want the tree

We can enter this in prefix form with following key strokes
  • ? < a tab b tab tab
  • : x tab * a tab ` s i n tab t h e t a tab down-arrow down-arrow
  • : x tab * b tab ` c o s tab t h e t a tab
That's 43 key strokes. (The back tick character ` is used to create a function call. The down-arrow moves the selection right until a drop zone is found under an expression sequence node some other node that is formatted vertically.)

Comparing to C or Java we have
    if(a < b) 
        x = b*sin(theta);
    else
        x = b*cos(theta);
which takes 53 key strokes with most editors

Infix entry of a larger example

The same tree can be entered in infix.
  • a enter < b enter space ?
  • x enter : a enter * t h e t a enter ` s i n enter down-arrow down-arrow
  • x enter : b enter * t h e t a enter ` c o s enter
That's 43 keystrokes, same as prefix entry.

Postfix

In the last example, you'll notice that the function calls to sin and cos are done in postfix order. I.e., the argument, theta, first and then the function name: t h e t a enter ` s i n enter. Postfix works fine for operators with one operand. We can also use postfix for operators with multiple operands, but the UI is not optimized for this case.

Other manipulations

Entering code in the first place is probably where a visual language such as PLAAY has the biggest disadvantages. Once a tree is entered, it is much easier to manipulate in PLAAY's editor than with a text editor that is mostly oblivious to the structure of the program. The PLAAY editor makes it easy to select, move, delete, or duplicate entire subtrees representing units of meaning such as variables, expressions, commands, declarations, and types. The price is that some manipulations are harder. In particular any edit that would produce a syntactically incorrect program is forbidden.

PART 1 explores the coding behind the editor.

No comments:

Post a Comment