Thoughts on Structured Editing: Breaking Away from Syntax

I was aimlessly browsing GitHub and came across sapling, a very early-stage structured editor. It was inspired by this blog post, which talks about structured editing.

Structured editing has had my interest for a while. I’ve put a lot of time and effort into getting really good at text editing, but I’m holding out hope that in 10 years plaintext editing will be outdated. The idea behind structured editing is to edit code instead of text, so instead of “move cursor to the next parenthesis, delete inside the parentheses”, you’d have operations like “change function argument”.
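To make the difference concrete, here’s a minimal sketch of what a “change function argument” operation could look like when the editor holds a tree instead of a string. The Call node, change_argument, and render are names I made up for illustration; they don’t come from sapling or any real editor.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """A function call node: the callee name plus its argument expressions."""
    func: str
    args: list[str] = field(default_factory=list)


def change_argument(call: Call, index: int, new_arg: str) -> None:
    """Structured edit: replace one argument without touching any text."""
    call.args[index] = new_arg


def render(call: Call) -> str:
    """Print the node back out as text for display."""
    return f"{call.func}({', '.join(call.args)})"


call = Call("resize", ["image", "800", "600"])
change_argument(call, 1, "1024")
print(render(call))  # resize(image, 1024, 600)
```

The edit never has to find parentheses or commas; those only exist at print time.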

Formatting

With auto-formatters like rustfmt, gofmt, and prettier, some of the promise of structured editors has been achieved in regular editors, because formatters now eliminate the ambiguity in how the same code can be written out. These formatters take the text, transform it into an AST (abstract syntax tree), and then print the AST using their formatting rules.
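The parse-then-print round trip is easy to try with Python’s standard ast module (ast.unparse needs Python 3.9 or later). This is only a sketch of the idea, not how rustfmt or gofmt are implemented, and ast.unparse is not a real formatter: it drops comments, for one. The reformat helper and the sample sources are my own.

```python
import ast

# Two spellings of the same program, differing only in whitespace and layout.
messy = "def add( a,b ):\n  return a+b"
tidy = "def add(a, b):\n    return a + b\n"


def reformat(source: str) -> str:
    """Parse text into an AST, then print the AST back out as text."""
    tree = ast.parse(source)
    return ast.unparse(tree)


# Both inputs share one AST, so they collapse to the same printout.
print(reformat(messy))
print(reformat(messy) == reformat(tidy))
```

Once the text has gone through the tree, the original layout is gone, which is exactly why the layout stops being something to argue about.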

Conveniently, they provide an easy way to get started with creating a structured editor. For example, since rustfmt can already print an AST, why not use the output of rustfmt to display the AST in a structured editor? I added some syntax highlighting to rustfmt and got something like this:

Customization

I’m not a semicolon fan; sometimes I wish Rust were written Python-style, with a newline meaning a new statement instead of a semicolon marking the end of one. It’s always a pain when I forget a semicolon, and I wish my editor would just automatically insert them for me. Since they’re always followed by a newline, they’re just a wasted character, in my mind.

Well now I’ve got this alpha structured editor, how about I just remove the semicolons when printing the AST? Obviously I couldn’t do this in a plaintext editor, since the code wouldn’t be valid. But the underlying AST is the same regardless of how I print it, so let’s print without semicolons:
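As a rough stand-in for that idea (this is a toy, not my rustfmt hack), here’s a tiny made-up statement tree with a printer that takes a semicolons switch. Let, ExprStmt, and render are hypothetical names, and expressions are kept as plain strings to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Let:
    """A `let` binding; the value is kept as a plain string for brevity."""
    name: str
    value: str


@dataclass
class ExprStmt:
    """A bare expression used as a statement."""
    expr: str


def render(stmts, semicolons=True):
    """Print the same statements with or without statement terminators."""
    end = ";" if semicolons else ""
    lines = []
    for stmt in stmts:
        if isinstance(stmt, Let):
            lines.append(f"let {stmt.name} = {stmt.value}{end}")
        else:
            lines.append(f"{stmt.expr}{end}")
    return "\n".join(lines)


program = [Let("x", "5"), Let("y", "x * 2"), ExprStmt('println!("{}", y)')]

print(render(program))                    # what gets saved to disk
print(render(program, semicolons=False))  # what I would rather look at
```

The tree is identical in both cases; only the view changes, so the file on disk can stay valid Rust.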

Taking it further

There are so many details of a language that matter more than syntax: performance, type system, available libraries, community, standard library, dev tools, etc. Unfortunately, syntax is the first thing people see, and therefore gets a disproportionate amount of attention. There’s also the inertia of C-style syntax; languages that want to have any chance of being popular have to use curly brackets for blocks, parens for invocations, dots for method access, etc.

I think structured editors offer an escape from this. For example, here’s the same example from above, with the same AST, but with python-esque syntax instead:

Now when learning Rust coming from Python, I wouldn’t need to learn about implicit returns, or re-learn that {} means a block, instead of :. The underlying AST has the block and return nodes, and I can view them however I’m most comfortable. This would free up time to learn more important things, like lifetimes and borrowing.
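Here’s a small sketch of that, assuming a made-up Function node with an explicit Return child. One printer writes Rust-style braces with an implicit return, the other writes a Python-esque colon block with the return spelled out. None of these names come from a real tool.

```python
from dataclasses import dataclass


@dataclass
class Return:
    """The value a function evaluates to."""
    expr: str


@dataclass
class Function:
    name: str
    params: list[tuple[str, str]]  # (name, type) pairs
    ret: str                       # return type
    body: list[str]                # statements before the final value
    result: Return


def render_rust(f: Function) -> str:
    params = ", ".join(f"{n}: {t}" for n, t in f.params)
    lines = [f"fn {f.name}({params}) -> {f.ret} {{"]
    lines += [f"    {stmt};" for stmt in f.body]
    lines.append(f"    {f.result.expr}")  # implicit return: no keyword, no semicolon
    lines.append("}")
    return "\n".join(lines)


def render_pythonesque(f: Function) -> str:
    params = ", ".join(f"{n}: {t}" for n, t in f.params)
    lines = [f"fn {f.name}({params}) -> {f.ret}:"]
    lines += [f"    {stmt}" for stmt in f.body]
    lines.append(f"    return {f.result.expr}")  # same node, shown explicitly
    return "\n".join(lines)


square_sum = Function(
    name="square_sum",
    params=[("a", "i32"), ("b", "i32")],
    ret="i32",
    body=["let total = a + b"],
    result=Return("total * total"),
)

print(render_rust(square_sum))
print(render_pythonesque(square_sum))
```

The Return node exists in the tree either way; whether I see the keyword is just a display preference.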

Other syntax thoughts

Unicode symbols

Some languages, like Haskell, have embraced Unicode symbols. Using ∀ instead of forall, or → instead of -> is becoming more popular. The problem is that these are a lot harder to type. With a structured editor, you wouldn’t be typing these anyway, so it’s no longer such a tradeoff between Unicode and ASCII characters.
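A sketch of how that could work, assuming the editor stores symbols in one canonical form and picks glyphs only at display time. The token list is a stand-in for real tree nodes, and the GLYPHS table and show function are hypothetical.

```python
# Canonical (typeable) spelling mapped to the glyph shown on screen.
GLYPHS = {"forall": "∀", "->": "→"}


def show(tokens, use_unicode=True):
    """Render stored symbols, swapping in Unicode glyphs if the user wants them."""
    if use_unicode:
        tokens = [GLYPHS.get(tok, tok) for tok in tokens]
    return " ".join(tokens)


# Stored form of a type signature, as a flat list for simplicity.
signature = ["forall", "a", ".", "a", "->", "a"]

print(show(signature))                     # ∀ a . a → a
print(show(signature, use_unicode=False))  # forall a . a -> a
```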

Non-text representations

When you have an editor representing an AST, you’re not limited to text to convey information. For example, instead of writing mut foo for mutable params to a function, you may want mutable params to have a squiggly line below them. Or maybe you want a graph-like representation of your code instead of text, to get an idea of which functions call which other functions.
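The call graph idea is easy to prototype once you have a tree. Here’s a minimal sketch using Python’s standard ast module on a made-up three-function program. It only follows calls to plain names, so method calls like text.split() are ignored; a real tool would need proper name resolution.

```python
import ast

SOURCE = '''
def load():
    return parse(read())

def parse(text):
    return text.split()

def read():
    return "a b c"
'''


def call_graph(source: str) -> dict:
    """Map each top-level function to the plain-name functions it calls."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for sub in ast.walk(node):
                # Only direct calls like foo(...); attribute calls are skipped.
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    callees.add(sub.func.id)
            graph[node.name] = sorted(callees)
    return graph


for caller, callees in call_graph(SOURCE).items():
    print(caller, "->", callees)
```

An editor that already holds the tree could draw this as boxes and arrows instead of printing it.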

Parsing considerations

Part of the discussion around Go generics involved the use of <> brackets for generics, like in other languages. If you use angle brackets, seeing a < means the parser then has to determine whether what follows is a type or an expression, which results in unbounded lookahead, something the compiler devs wanted to avoid for performance reasons. So now Go developers have to get used to seeing and using [] brackets for generics. If structured editors and ASTs were used instead, there would be no parsing step at all, which eliminates the unbounded lookahead problem, and Go developers could display generics with whatever brackets they like.
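To illustrate, take text like w < x, y > (z), which can be read either as two comparisons or as one generic call. In a tree the two readings are simply different nodes, so nothing needs to be guessed, and the bracket style becomes a print option. Compare, GenericCall, and render are names I invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Compare:
    left: str
    op: str
    right: str


@dataclass
class GenericCall:
    func: str
    type_args: list[str]
    args: list[str]


def render(node, brackets="[]"):
    """Print a node; `brackets` is a two-character open/close pair for generics."""
    if isinstance(node, Compare):
        return f"{node.left} {node.op} {node.right}"
    open_b, close_b = brackets
    type_args = ", ".join(node.type_args)
    args = ", ".join(node.args)
    return f"{node.func}{open_b}{type_args}{close_b}({args})"


# Reading 1: a pair of comparisons.
pair = [Compare("w", "<", "x"), Compare("y", ">", "(z)")]
# Reading 2: one call to w, instantiated with types x and y.
call = GenericCall("w", ["x", "y"], ["z"])

print(", ".join(render(c) for c in pair))  # w < x, y > (z)
print(render(call, brackets="<>"))         # w<x, y>(z)
print(render(call, brackets="[]"))         # w[x, y](z)
```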

Conclusion

I don’t know if I have much of a conclusion, other than that I hope syntax will continue to lose importance as a defining characteristic of a language. It’s already been losing importance with auto-formatters. In the languages I use where auto-formatters are the norm, I’ve seen way less bikeshedding about spaces vs tabs, bracket styles, trailing commas, etc.