Thoughts on Structured Editing: Breaking Away from Syntax
I was aimlessly browsing GitHub and came across sapling, a very early-stage structured editor. It was inspired by this blog post, which talks about structured editing.
Structured editing has had my interest for a while. I’ve put a lot of time and effort into getting really good at text editing, but I’m holding out hope that in 10 years plaintext editing will be outdated. The idea behind structured editing is to edit code instead of text, so instead of “move cursor to next parenthesis, delete inside parentheses”, you’d have operations like “change function argument”.
Formatting
With auto-formatters like rustfmt, gofmt, and prettier, some of the promise of structured editors has been achieved in regular editors, because the formatter removes the ambiguity in how a given piece of code is laid out. These formatters take the text, transform it into an AST (abstract syntax tree), and then print the AST using their formatting rules.
Conveniently, they provide an easy way to get started with creating a structured editor. For example, since rustfmt can already print an AST, why not use the output of rustfmt to display the AST in a structured editor? I added some syntax highlighting to rustfmt and got something like this:
Customization
I’m not a semicolon fan; sometimes I wish Rust were written Python-style, with a newline meaning a new statement instead of a semicolon marking the end of one. It’s always a pain when I forget a semicolon, and I wish my editor would just insert them for me automatically. Since they’re almost always followed by a newline, they’re just a wasted character, in my mind.
Well, now that I’ve got this alpha structured editor, how about I just remove the semicolons when printing the AST? Obviously I couldn’t do this in a plaintext editor, since the code wouldn’t be valid. But the underlying AST is the same regardless of how I print it, so let’s print without semicolons:
Taking it further
There are so many details of a language that matter more than syntax: performance, type system, available libraries, community, standard library, dev tools, etc. Unfortunately, syntax is the first thing people see, and it therefore gets a disproportionate amount of attention. There’s also the inertia of C-style syntax; languages that want to have any chance of being popular have to use curly brackets for blocks, parens for invocations, dots for method access, etc.
I think structured editors offer an escape from this. For example, here’s the same example from above, with the same AST, but with python-esque syntax instead:
Now when learning Rust coming from Python, I wouldn’t need to learn about implicit returns, or re-learn that `{}` means a block, instead of `:`. The underlying AST has the `block` and `return` nodes, and I can view them however I’m most comfortable. This would free up time to learn more important things, like lifetimes and borrowing.
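The "one AST, several views" idea can be sketched in a few lines of Python. This is a toy under heavy assumptions, not how rustfmt or any real structured editor works: the `Let`, `Return`, and `Function` node types and the two `render_*` functions are hypothetical names made up for illustration, and expressions are kept as plain strings to keep it short.

```python
from dataclasses import dataclass


# Hypothetical toy AST; a real Rust AST is far richer than this.
@dataclass
class Let:
    name: str
    value: str  # expression kept as a plain string for brevity


@dataclass
class Return:
    value: str
    implicit: bool = True  # Rust's trailing-expression return


@dataclass
class Function:
    name: str
    params: list
    ret: str
    body: list


def render_rust(fn, semicolons=True):
    """Print the tree with Rust-like braces; semicolons are a display option."""
    end = ";" if semicolons else ""
    lines = [f"fn {fn.name}({', '.join(fn.params)}) -> {fn.ret} {{"]
    for stmt in fn.body:
        if isinstance(stmt, Let):
            lines.append(f"    let {stmt.name} = {stmt.value}{end}")
        elif isinstance(stmt, Return):
            if stmt.implicit:
                # An implicit return is just the trailing expression.
                lines.append(f"    {stmt.value}")
            else:
                lines.append(f"    return {stmt.value}{end}")
    lines.append("}")
    return "\n".join(lines)


def render_pythonesque(fn):
    """Print the same tree with a colon-and-indent block and explicit return."""
    lines = [f"def {fn.name}({', '.join(fn.params)}) -> {fn.ret}:"]
    for stmt in fn.body:
        if isinstance(stmt, Let):
            lines.append(f"    {stmt.name} = {stmt.value}")
        elif isinstance(stmt, Return):
            lines.append(f"    return {stmt.value}")
    return "\n".join(lines)


tree = Function("add_one", ["x: i32"], "i32", [Let("y", "x + 1"), Return("y")])
print(render_rust(tree))
print(render_rust(tree, semicolons=False))
print(render_pythonesque(tree))
```

The tree is built once and never changes; only the printer does. Whether the reader sees semicolons, braces, or an explicit `return` is decided entirely at display time.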
Other syntax thoughts
Unicode symbols
Some languages, like Haskell, have embraced Unicode symbols. Using `∀` instead of `forall`, or `→` instead of `->` is becoming more popular. The problem is that these are a lot harder to type. With a structured editor, you wouldn’t be typing these anyway, so it’s no longer such a tradeoff between Unicode and ASCII characters.
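As a rough sketch of what that could look like, the stored program can keep plain ASCII tokens while the view swaps in Unicode glyphs. The `DISPLAY` table and `render_tokens` function below are hypothetical names for illustration, and a flat token list stands in for a real tree.

```python
# Hypothetical display table: the stored tokens stay ASCII,
# and only the rendering swaps in Unicode glyphs.
DISPLAY = {"forall": "∀", "->": "→"}


def render_tokens(tokens, use_unicode=True):
    """Join tokens for display, optionally swapping in Unicode glyphs."""
    return " ".join(DISPLAY.get(t, t) if use_unicode else t for t in tokens)


tokens = ["forall", "a", ".", "a", "->", "a"]
print(render_tokens(tokens))
print(render_tokens(tokens, use_unicode=False))
```

Nobody has to type `∀` here: the choice between the two spellings becomes a per-reader display setting.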
Non-text representations
When you have an editor representing an AST, you’re not limited to text to convey information. For example, instead of writing `mut foo` for mutable params to a function, you may want mutable params to have a squiggly line below them. Or maybe you want a graph-like representation of your code instead of text, to get an idea of which functions call which other functions.
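The call-graph view in particular falls out of the tree almost for free. Here is a minimal sketch, assuming a made-up AST where a `Function` holds a flat list of statements and calls show up as `Call` nodes (both hypothetical types; a real editor would have to walk nested expressions too).

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    callee: str


@dataclass
class Function:
    name: str
    body: list = field(default_factory=list)  # only Call nodes matter here


def call_graph(functions):
    """Map each function name to the sorted list of functions it calls."""
    return {
        f.name: sorted({s.callee for s in f.body if isinstance(s, Call)})
        for f in functions
    }


program = [
    Function("main", [Call("parse"), Call("render")]),
    Function("parse", [Call("tokenize")]),
    Function("render"),
    Function("tokenize"),
]
for caller, callees in call_graph(program).items():
    for callee in callees:
        print(f"{caller} -> {callee}")
```

The resulting dictionary is the edge list of the graph, which an editor could hand to whatever drawing layer it likes.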
Parsing considerations
Part of the discussion around Go generics involved the use of `<>` brackets for generics, like in other languages. If you use angle brackets, seeing a `<` means you then have to determine whether what follows is a type or an expression, which results in unbounded lookahead, something the compiler devs wanted to avoid for performance reasons. So now Go developers have to get used to seeing and using `[]` brackets for generics. If structured editors and ASTs were used instead, not only would there not be a parsing step at all, eliminating the unbounded lookahead problem, but Go developers could use whatever generic annotation they like.
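To make that concrete, here is a small sketch with two hypothetical node types, `Less` and `Instantiate`. In a tree, a comparison and a generic instantiation are simply different kinds of node, so nothing ever has to guess what a `<` means, and the bracket style becomes an argument to the printer. None of this reflects how the Go toolchain is actually built.

```python
from dataclasses import dataclass


@dataclass
class Less:  # the comparison `left < right`
    left: str
    right: str


@dataclass
class Instantiate:  # a generic applied to type arguments
    name: str
    type_args: list


def render(node, brackets="[]"):
    """Render a node; the bracket pair for generics is a viewer preference."""
    if isinstance(node, Less):
        return f"{node.left} < {node.right}"
    open_b, close_b = brackets  # a two-character string such as "[]" or "<>"
    return f"{node.name}{open_b}{', '.join(node.type_args)}{close_b}"


generic = Instantiate("Map", ["K", "V"])
print(render(generic))
print(render(generic, brackets="<>"))
print(render(Less("a", "b")))
```

The ambiguity only exists in the text. Once the node type is stored explicitly, `Map<K, V>` and `a < b` can share a character on screen without ever being confused.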
Conclusion
Don’t know if I have much of a conclusion, besides that I hope that syntax will continue to lose importance as a defining characteristic of a language. It’s already been losing importance with auto-formatters. In the languages I use where auto-formatters are the norm, I’ve seen way less bikeshedding about spaces vs tabs, bracket styles, trailing commas, etc.