Juan F. Meleiro

Smartphone Editors

I want better text editors in smartphones.

Editors don't need to be point-and-click, type-to-insert, as seems to be the default anytime a developer adds a little text editor into their software. Indeed, historically that hasn't been the case. To say nothing of ed (the standard text editor), vi and its derivatives show this as well as any other: in their modal architecture, the so-called /normal/ mode is not the one in which the user inputs text.

My point is that a text editor is not a text /inputer/. Editing, in fact, has less to do with inputting text than with deleting it, moving it around, and modifying it.

So, when conceiving of a text editor, it makes sense to think about what operations are to be supported, and how they best fit the input hardware it targets.

The modal approach of vi and its successors is, I believe, the undisputed champion in its… ehr… modality. That is, on desktop computers with a multi-line screen and a keyboard as input.

But touchscreens have long been common enough that we ought to think about how a text editor that targets /them/ ought to behave. So that's what I want to think about.

---

Software interface design is, in a sense, a matter of managing complexity. A matter of presenting complexity in ways that a human mind can parse. A blender, for example, might be a complex machine on the inside, but it has only a few operating modes – basically, multiple speeds. So a simple dial suffices.

Text editing, on the other hand, has /hundreds/ of possible functions (moving, deleting, inserting, converting, re-flowing, indenting, correcting, etc.), and that's not counting context. If we do count the current text's contents, opened files, languages being used, etc., it goes beyond counting. It's combinatorial.

And the way one manages combinatorial complexity is by using combinatorial gadgets. So vi brings its modes into the fight: each mode has a manageable number of basic operations (which also compose combinatorially), bunched together in a way that collapses complexity (e.g., by keeping movement-related commands together) but still fits well enough in a human mind. More than that, it allows one to build habits and to chunk operations into larger, more abstract ones.
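To make "compose combinatorially" concrete, here is a toy sketch in Python. It is not how vi is implemented: the function names and the `run` helper are made up for illustration, and the key letters are only loosely borrowed from vi (`U` in particular is a stand-in for an uppercase operator).

```python
from typing import Callable, Dict, Tuple

# A motion maps (text, cursor position) to a new cursor position.
Motion = Callable[[str, int], int]
# An operator rewrites text[start:end] and returns (new text, new cursor).
Operator = Callable[[str, int, int], Tuple[str, int]]


def next_word(text: str, pos: int) -> int:
    """Move to the start of the next whitespace-separated word."""
    n = len(text)
    while pos < n and not text[pos].isspace():
        pos += 1
    while pos < n and text[pos].isspace():
        pos += 1
    return pos


def line_end(text: str, pos: int) -> int:
    """Move to the end of the current line."""
    newline = text.find("\n", pos)
    return len(text) if newline == -1 else newline


def delete(text: str, start: int, end: int) -> Tuple[str, int]:
    return text[:start] + text[end:], start


def upper(text: str, start: int, end: int) -> Tuple[str, int]:
    return text[:start] + text[start:end].upper() + text[end:], start


# Hypothetical key bindings, for illustration only.
MOTIONS: Dict[str, Motion] = {"w": next_word, "$": line_end}
OPERATORS: Dict[str, Operator] = {"d": delete, "U": upper}


def run(command: str, text: str, pos: int) -> Tuple[str, int]:
    """Run a bare motion ("w") or an operator followed by a motion ("dw")."""
    if command in MOTIONS:
        return text, MOTIONS[command](text, pos)
    op, motion = OPERATORS[command[0]], MOTIONS[command[1:]]
    target = motion(text, pos)
    start, end = sorted((pos, target))
    return op(text, start, end)
```

With two operators and two motions there are already four operator commands plus the two bare motions. Adding one more motion means learning one more key, but it gains one new command per operator.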

So, modes are great. But vi's /input/ model is the keyboard. Now, keyboards are great for one thing: hitting many different keys and key combos in quick succession. One thing they are terrible at is being modified on the fly. And those are the characteristics that informed vi's interface: the same keys perform different functions in different modes, and key positions hint at what the keys do.

Touchscreens are basically the opposite of a keyboard, particularly a smartphone's. Thumbs are dumb pointers that like to hit big buttons on a touchscreen (which also has no tactile feedback), and screens, with their quicker-than-the-eye refresh rates, are as mutable as it gets (that's what they're designed for).

So, can we do better than a tiny virtual keyboard with edgeless keys?

Notably, for some tasks, no, we can't. For example, for actually typing text, keyboards are basically the only good solution (alternative layouts notwithstanding). And lo, that's usually good enough for people who type all day, every day. But what about all the other tasks that go into editing text?

For example: page movement. Scrolling is amazingly simple and intuitive on touchscreens, so that should be a feature. Thankfully, it usually is.

But then, what about cursor movement? Remember that thumbs have a hard time hitting tiny targets, so the usual “click the spot where you want it” is /bad/. But… scrolling is good. We could use it. Alas, it's already taken by page movement! Unless…

Yes! Modes!

So here we have two modes (both without keyboard, maximizing screen use, mind you).

Scroll: where scrolling moves the page around.
Cursor: where scrolling moves the /cursor/ around.

I'd argue they should be independent (i.e., the cursor can go out of view, but stays put in scroll mode). However, there should be functions within each mode to do the other's job [1]. So, e.g., in cursor mode a two-finger scroll moves the page around, while in scroll mode we'd have a function for bringing the cursor to the start of the visible text.
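A rough sketch of that independence, in Python. Everything here is an assumption made for illustration: the `View` class and its method names are hypothetical, positions are counted in whole lines, and a two-finger drag in scroll mode is simply treated like a one-finger one.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    SCROLL = "scroll"
    CURSOR = "cursor"


@dataclass
class View:
    total_lines: int          # lines in the buffer
    visible_lines: int        # lines that fit on screen
    top: int = 0              # first visible line
    cursor: int = 0           # line the cursor is on
    mode: Mode = Mode.SCROLL

    @staticmethod
    def _clamp(value: int, upper: int) -> int:
        return max(0, min(value, upper))

    def move_page(self, delta: int) -> None:
        """Move the viewport; the cursor stays where it is."""
        last_top = max(0, self.total_lines - self.visible_lines)
        self.top = self._clamp(self.top + delta, last_top)

    def move_cursor(self, delta: int) -> None:
        """Move the cursor; the viewport stays where it is."""
        self.cursor = self._clamp(self.cursor + delta, self.total_lines - 1)

    def drag(self, delta: int, fingers: int = 1) -> None:
        """Dispatch a vertical drag according to the current mode."""
        if self.mode is Mode.CURSOR and fingers == 1:
            self.move_cursor(delta)
        else:
            # Scroll mode, or a two-finger drag in cursor mode.
            self.move_page(delta)

    def cursor_to_view(self) -> None:
        """Scroll-mode helper: bring the cursor to the start of the visible text."""
        self.cursor = self.top
```

Because `top` and `cursor` are separate fields that only `cursor_to_view` ties together, the cursor can sit off-screen for as long as the user likes.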

To switch from scroll to cursor mode, a simple tap suffices. To go the other way around (a kind of “escape key”), we need a new method. We could have some thumb-friendly large buttons at the bottom for that. Also, holding them for a bit would bring up a sub-menu with variations (just as accents and symbols work in keyboards).

So cursor mode could have these:

Esc (back to scroll mode)
Sel (select text, as in vi's visual mode; alternative for block-visual mode)
Move (move to next word (default), character, paragraph, etc.)

Tapping the text would take the user to input mode, now with a keyboard.
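The mode switches described so far fit in a small transition table. The Python below is only a sketch of that idea: the event names, the `BUTTONS` table and both helper functions are hypothetical, and since I haven't said how to leave input mode, no transition out of it is defined.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Mode(Enum):
    SCROLL = "scroll"
    CURSOR = "cursor"
    INPUT = "input"


# (current mode, event) -> next mode. Only the transitions described above.
TRANSITIONS: Dict[Tuple[Mode, str], Mode] = {
    (Mode.SCROLL, "tap"): Mode.CURSOR,
    (Mode.CURSOR, "esc"): Mode.SCROLL,
    (Mode.CURSOR, "tap"): Mode.INPUT,
}

# Cursor-mode buttons. The first entry is what a plain tap does;
# the full list is what a long press would offer in its sub-menu.
BUTTONS: Dict[str, List[str]] = {
    "esc": ["scroll mode"],
    "sel": ["select", "block select"],
    "move": ["next word", "next character", "next paragraph"],
}


def next_mode(mode: Mode, event: str) -> Mode:
    """Return the mode after `event`; unknown events leave the mode unchanged."""
    return TRANSITIONS.get((mode, event), mode)


def button_action(name: str, held: bool = False, choice: int = 0) -> str:
    """A tap gives the default action; holding picks a variation by index."""
    options = BUTTONS[name]
    return options[choice] if held else options[0]


def keyboard_visible(mode: Mode) -> bool:
    """Only input mode gives up screen space to the keyboard."""
    return mode is Mode.INPUT
```

Keeping the transitions as data rather than as nested conditionals makes it cheap to add a mode later: one new enum member and a few new table rows.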

---

This is what I've thought of so far. What I'd like to do next is formulate a fuller plan for an interface, perhaps taking inspiration from vis [2] instead of vi.

Footnotes:

1. Just as ex-mode has ways of inputting text in vi.
2. vis: A text editor combining modal editing with structural regular expressions.