What is a Virtual Terminal?

This is a slightly different sort of article than I usually write here. Someone happened to ask me what is and isn't possible in the UI of a CLI application like Terraform, and I thought the answer might be useful to refer to later, so here's a slightly long-winded answer to that question.

What is a Virtual Terminal?

Modern command-line-based applications like Terraform CLI typically, at least when run by a user interactively, run in a so-called terminal emulator. The "emulator" in that name is a clue that we're dealing with an implementation in software of something that was once hardware. In this case, it's equipment like in the following image, which is of a "VT100" video terminal created by a company called Digital Equipment Corporation, or DEC. (photo by Jason Scott.)

To modern eyes this may appear to be a "computer", but what we have here is a late-70s-era device which serves as an input and output facility to a big computer nearby. Rather than running software itself, it instead just sends keypresses to the real computer and responds to commands from the computer to display text on-screen.

The "video" in "video terminal" is a reference to the fact that this is the successor of an earlier kind of terminal which perhaps more resembles a typewriter, with output presented onto continuous-fed paper rather than onto a video screen.

Because a video terminal is continuously refreshing an on-screen image rather than committing ink to paper, it has the advantage of being able to update parts of the screen to show new content, which allowed it to feel more interactive than a printer-based terminal. However, both printer-based terminals and these early video terminals have in common that what they are being sent from the computer is a mixture of text and simple control codes for basic formatting, rather than arbitrary graphics.

A virtual terminal or terminal emulator is a piece of software running on a modern computer that pretends to be a video terminal. My earliest encounters with virtual terminals were when I went to my doctor some time in the late 1990s and found that where each doctor had previously had a video terminal similar to the one described above, they now had computers running some version of Microsoft Windows, but all of the patient data still lived in some big, old-fashioned computer in a room somewhere nearby that was still running the same software it had been before.

As far as that software was concerned it was still connected to video terminals, but really it was just a normal Windows application doing its best to mimic how a video terminal would behave.

And that's a pretty typical (though unfortunate) consequence of how technology improvements always seem to lurch forward inconsistently: existing systems typically get upgraded gradually rather than as a whole, and so we end up inadvertently dragging old conventions and limitations into newer technology realms.

Because of this, modern desktop operating systems like Windows, macOS, and various Linux distributions all happen to include some sort of virtual terminal as a built-in feature, originally motivated by allowing continued use of older software. But it also serves as a convenient common convention across all of these operating systems, and so command-line-based software like HashiCorp Terraform is designed to run inside that same virtual terminal software. It means that Terraform can rely on a certain set of capabilities being available across all of its supported platforms, but it also brings along with it various limitations that originate in devices like the 1978 VT100 terminal.

What Video Terminals Do

A modern technical audience might appreciate an analogy to the technology we see in web browsers, which also originated as a way to send input to software running elsewhere (in this case, possibly in another part of the world) and see its output.

The original design of the web used HTML as a compromise to allow the remote system to describe to the web browser approximately what ought to be displayed on-screen, but to leave various details up to the browser itself. Sending an exact picture of what the result should look like would give the remote software designer exacting control over the result, but it would take longer to transmit that detail over the network and it would not allow the browser to adapt the content to the size of the user's display, to provide high-quality zoom for those who need it, and to present the text in very different ways for those who can't consume visual media.

In some ways, the virtual terminal model is a similar tradeoff: the target software itself (such as Terraform) doesn't draw pixels directly onto the screen, but rather sends to the virtual terminal a stream of data which consists of normal text mixed with simplistic control codes such as "turn on bold", or "move the cursor to the top-left of the screen", or "clear the screen".
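To make that concrete, here is a minimal Python sketch of what such a stream might look like. The escape sequences are the common VT100/ANSI-style ones that most virtual terminals accept. The function name render_greeting and the constant names are hypothetical choices of mine for illustration, and nothing here is taken from Terraform itself.

```python
import io
import sys

ESC = "\x1b"                 # the "escape" character that introduces a control sequence
CLEAR_SCREEN = ESC + "[2J"   # erase the whole display
CURSOR_HOME = ESC + "[H"     # move the cursor to the top-left cell
BOLD_ON = ESC + "[1m"        # text that follows is rendered bold
RESET = ESC + "[0m"          # return to the default attributes


def render_greeting(out):
    """Write a small stream of control codes mixed with normal text to `out`."""
    out.write(CLEAR_SCREEN + CURSOR_HOME)
    out.write(BOLD_ON + "Hello" + RESET + ", terminal!\n")


# Capture the stream in memory so the raw characters can be inspected
# instead of being interpreted by a terminal.
buffer = io.StringIO()
render_greeting(buffer)
print(repr(buffer.getvalue()))
```

Calling render_greeting(sys.stdout) instead would send the same characters to a real terminal, which would interpret the control codes rather than display them.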

Software designed to run inside a virtual terminal can therefore only control the final rendered result on-screen to the extent allowed by the control codes the virtual terminal supports. Virtual terminal control codes are typically far more constrained than web developers are accustomed to with HTML/CSS, and in particular typically include no facility for displaying bitmap or vector images.

Video Terminal Display Model

The best way to think of what a video terminal can do is to consider a two-dimensional grid where each cell of the grid can contain one text character, and each of those text characters can have a limited set of formatting options applied to it.
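As a rough mental model only, the grid could be sketched in Python like this. The Cell and Grid classes are hypothetical names I've made up for illustration, and the sketch assumes every character occupies exactly one cell, which real terminals don't always honor (some characters are drawn two cells wide).

```python
from dataclasses import dataclass


@dataclass
class Cell:
    char: str = " "          # exactly one character per cell
    bold: bool = False       # one of a small set of formatting attributes
    color: str = "default"   # an abstract color name, not an RGB value


class Grid:
    def __init__(self, columns, rows):
        self.columns = columns
        self.rows = rows
        self.cells = [[Cell() for _ in range(columns)] for _ in range(rows)]

    def put(self, column, row, text, bold=False, color="default"):
        """Place text one character per cell, clipping at the right edge."""
        for offset, char in enumerate(text):
            if column + offset >= self.columns:
                break
            self.cells[row][column + offset] = Cell(char, bold, color)

    def row_text(self, row):
        """Return the characters of one row, ignoring formatting."""
        return "".join(cell.char for cell in self.cells[row])
```

The important property is that text and formatting live together in each cell, and that there is nothing finer-grained than a cell for software to address.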

In the original hardware video terminals "character" was typically defined very strictly to include only around 224 distinct characters, including a mixture of characters we might typically consider to be "text" -- letters, digits, punctuation -- along with characters intended to approximate graphics in a limited way, called "box-drawing characters" and "block characters", like the ones I've shown below:

─: A fine horizontal line
━: A thicker/bolder horizontal line
┗: The bottom-left corner of a boldly-ruled rectangle
║: A pair of vertical lines intended to make a double-ruled border
▌: The left half of the cell is "filled" and the right half is unfilled

Applications running in video terminals (and virtual terminals) can use these graphical characters to produce low-fidelity, highly-constrained graphics, such as rectangular boxes and blocky pictures. I've included some examples below of some typical uses in some admittedly-contrived situations:

╒════════ A "window" ════════╕
│ Software might use the     │
│ lines and corners to       │
│ suggest what in a desktop  │
│ environment we might call  │
│ a "window" or "dialog box" │
└────────────────────────────┘

┓
┣━ Some software uses the box-drawing characters
┃  to create the impression of an expanding
┃  "tree" structure.
┗━ This might be familiar from older software
   design for navigating nested "folders" in
   the desktop paradigm.

I'm using web technologies to "fake" virtual terminal rendering here, so how well this will work will unfortunately depend on some details of your operating system and chosen web browser. Your system might not have a font containing all of the characters I used above, and it might not position all of the character cells directly adjacent to one another and thus there might be gaps in the borders at the seams between the cells, and unfortunately your browser might've even chosen different fonts to render different characters and so the boxes might not even line up at all. Yuck!

Typical virtual terminals do guarantee to draw these pseudo-graphics characters with the cells directly connected and aligned so that the lines and boxes will seem solid.

One thing that modern virtual terminals have that their 70s predecessors did not is support for Unicode, which defines a way to send to a virtual terminal tens of thousands of characters, rather than just the two-hundred-or-so the old ones supported, and so we can now potentially include letters and digits from languages around the world and a slightly-wider array of pseudo-graphical characters.

They do still have a similar problem to web browsers when it comes to font repertoire, though: the font(s) used for rendering are chosen globally in the terminal program's settings and are not something the author of the underlying software gets to choose. So, for example, when we're designing UI for Terraform CLI we try to limit ourselves to characters that are typically available in the default configurations of virtual terminals on the three platforms we officially support: Windows, macOS, and Linux.

In current versions of Terraform at the time of writing we typically stick only to the finest single-line borders, because we've seen those to have universal support across all of the supported platforms. We don't typically use the bold or double-rule lines, or any more sophisticated characters like diagonal lines and curved corners.
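Here's a minimal Python sketch of drawing a border using only those light single-line characters. The draw_box function is a hypothetical helper of my own and not how Terraform does it, and it assumes each character in the text takes up exactly one cell, because len counts characters rather than cells.

```python
def draw_box(lines):
    """Return the given lines wrapped in a light single-line border."""
    width = max((len(line) for line in lines), default=0)
    top = "┌" + "─" * (width + 2) + "┐"
    bottom = "└" + "─" * (width + 2) + "┘"
    # Pad every line to the same width so the right-hand border lines up.
    body = ["│ " + line.ljust(width) + " │" for line in lines]
    return "\n".join([top] + body + [bottom])


print(draw_box(["Plan: 2 to add", "0 to destroy"]))
```

Restricting the helper to one family of box-drawing characters keeps the corners and edges visually consistent, and keeps the output within the repertoire that default terminal fonts are most likely to include.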

The virtual terminals we aim to support offer no means for us to control exactly which font or font size will be used, and thus we also can't dictate how many columns across or rows down the character grid will be. Terminal-based programs will always look a little different depending on which terminal they are running in and how it is configured.
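What software can do is ask how large the grid currently is and adapt to it. This Python sketch uses the standard library's shutil.get_terminal_size for that. The wrap_for_terminal helper and the 80x24 fallback are my own assumptions for illustration.

```python
import shutil
import textwrap


def wrap_for_terminal(text, columns=None):
    """Wrap text to the terminal's current width, or to `columns` if given."""
    if columns is None:
        # The fallback is used when the size can't be determined, such as
        # when output is redirected to a file.
        columns = shutil.get_terminal_size(fallback=(80, 24)).columns
    return textwrap.wrap(text, width=columns)


message = "Terraform will perform the following actions once you approve the plan."
for line in wrap_for_terminal(message):
    print(line)
```

The size is something the program discovers rather than decides, so it's best queried close to when the output is produced, since the user may resize the window at any time.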

Colors and Styles

So far I've focused mainly on what characters we're able to put inside the cells, but video terminals also allow us to choose from a small set of different display attributes on a cell-by-cell basis.

For example, it's possible to include a control sequence which says that a subsequent series of characters ought to be rendered in bold, and another which says they ought to be rendered in red, and thus recreate Terraform's typical introduction to an error message:

Error: Spline reticulation failed

Terraform failed to reticulate your splines.

The only non-color-related styling attribute with broad support is bold, but the default virtual terminal applications on the three platforms Terraform supports all also support underline, and so modern Terraform uses a combination of bold and underline to try to create subtle visual hierarchy beyond what can be achieved through individual characters alone.
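As a sketch of how an error heading like the one above might be assembled, here is a small Python example using the widely-supported sequences for bold, red, and reset. The format_error function is hypothetical and is not Terraform's actual implementation. Its color parameter is my own addition so the same message can be produced without any control sequences.

```python
BOLD = "\x1b[1m"    # turn on bold
RED = "\x1b[31m"    # the terminal's own notion of "red"
RESET = "\x1b[0m"   # clear all attributes


def format_error(summary, detail, color=True):
    """Build an error message with a bold red "Error: " prefix."""
    if color:
        heading = BOLD + RED + "Error: " + RESET + BOLD + summary + RESET
    else:
        heading = "Error: " + summary
    return heading + "\n\n" + detail + "\n"


print(format_error(
    "Spline reticulation failed",
    "Terraform failed to reticulate your splines.",
))
```

Note that the word "Error" is part of the text itself, so the meaning survives even where the formatting is unavailable or ignored.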

Terminals vary in how many colors they support, and we typically specify these colors not as specific RGB color values like in CSS but as abstract names like "red" where the terminal configuration itself decides exactly what color that will take when rendered on-screen. Again, since I'm using web technologies to simulate virtual terminal display here, I was able to use my own site's CSS to decide exactly how to interpret "red" for my example above, but a real terminal might be a much darker red that's harder to read against a dark background.

To make matters worse, users can typically reconfigure whether their terminal uses a light-on-dark or dark-on-light scheme by default, but software running in the terminal has no reliable way to determine which has been selected. We must rely on the user to have chosen a theme where all of the available colors are mutually-legible.

In order to minimize problems for folks who may not have a color theme which produces legible text in all combinations, for Terraform we typically avoid showing large stretches of text with coloring enabled, and instead limit coloring only to small pseudo-graphical elements and to small captions that aim to help with visual hierarchy and navigation, such as the "Error: " prefix on error messages.

Interactivity

Speaking broadly and non-exhaustively, I think it's fair to say that there are two different modes of interactivity for applications designed to run in terminals.

The first is more typical for older software that originates in the era when video terminals were the primary interface to a computer, and involves the program taking full control over the entire screen and using the pseudo-graphical characters in conjunction with colors to create something with similar conventions to a modern graphical user interface, which in modern times we typically refer to as a text-based user interface, or "TUI".

The patient-data-management software I mentioned earlier from my 90s doctor visit was an example of this. Although the software takes over the entire display, video terminals typically don't have a mouse, and so navigation is via the keyboard, similar to what we see today with "tabbing" between UI elements and activating them using the enter key or space bar, often combined with modal keyboard shortcuts which directly trigger particular actions without the need for spatial navigation first.

The following picture shows some other full-screen terminal software, called Midnight Commander:

However, when thinking of modern software designed to work in terminals, such as Terraform, I think it's fair to say that most fit into a second category of interactivity, where the program gradually prints out more information, scrolling automatically to reveal more in a similar manner to the message history in a chat program, occasionally stopping to prompt the user with specific questions when needed.

In this second category of application, the interactivity is considerably more limited and typically behaves more like a "Wizard-style" workflow in a GUI application, asking one question at a time which the user must answer before continuing to the next step.
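A one-question-at-a-time prompt of this kind can be sketched in Python as follows. The ask_yes_no function and its accepted answers are hypothetical. It reads from and writes to whatever streams it is given, which here are in-memory buffers so the example doesn't block waiting for a keyboard.

```python
import io


def ask_yes_no(question, inp, out):
    """Ask one question and keep asking until the answer is yes or no."""
    while True:
        out.write(question + " [yes/no]: ")
        out.flush()
        answer = inp.readline()
        if answer == "":
            # An empty string from readline means the input was closed.
            raise EOFError("input closed before an answer was given")
        answer = answer.strip().lower()
        if answer in ("yes", "y"):
            return True
        if answer in ("no", "n"):
            return False
        out.write("Please answer yes or no.\n")


# Simulate a user who first types something unexpected, then "yes".
fake_input = io.StringIO("maybe\nyes\n")
transcript = io.StringIO()
approved = ask_yes_no("Do you want to continue?", fake_input, transcript)
print(transcript.getvalue())
print("approved:", approved)
```

In a real program the streams would be sys.stdin and sys.stdout. The output only ever grows at the bottom, which is what makes this style feel like a conversation rather than a screen.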

Users of such applications also typically expect that the information will broadly be append-only, adding new information to the end of the output rather than changing earlier text already displayed. However, there is some room for compromise here, as long as the output includes affordances to help the user understand which parts of the output are still "active" and subject to updating, and as long as that area is small enough to appear simultaneously in the viewport of a "typically-sized" terminal (which unfortunately is not an exact amount we can nail down, but more subjective).
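One common way to keep a small "active" area is to rewrite only the current line, using a carriage return followed by the "erase line" control sequence. The following Python sketch shows the idea. The update_status and finish_status names are hypothetical, and it assumes the message fits on one line of the terminal, since a wrapped message would not be fully erased.

```python
import io

CLEAR_LINE = "\x1b[2K"  # erase the entire line the cursor is on


def update_status(out, message):
    """Replace the contents of the current line with `message`."""
    out.write("\r" + CLEAR_LINE + message)
    out.flush()


def finish_status(out, message):
    """Write the final message and move on, making the line permanent."""
    update_status(out, message)
    out.write("\n")


# Captured in memory here; writing to sys.stdout would animate the line.
stream = io.StringIO()
update_status(stream, "Still creating... [10s elapsed]")
update_status(stream, "Still creating... [20s elapsed]")
finish_status(stream, "Creation complete after 25s")
print(repr(stream.getvalue()))
```

Once the newline is written, that line joins the append-only history and is never touched again, so only the last line is ever subject to change.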

Accessibility in Terminals

The modern web platform includes various extensions to HTML to allow UI designers to specify the rules for visual rendering with spatial navigation and rules for aural rendering and linear navigation together in the same output, so that a well-designed application can be accessible both to sighted and non-sighted users.

Unfortunately the display model and control sequences for virtual terminals are much less sophisticated, and so terminal-based applications typically prioritize visual rendering and spatial navigation and are less accessible to non-sighted users.

However, the character-grid-based model lends itself to a different kind of non-visual rendering: braille terminals.

A braille terminal can present one or more lines of character cells in a one-to-one mapping from visual character cell to braille character cell, but with some caveats.

The most significant caveat is that only letters, digits and punctuation characters can translate to braille in a meaningful way. Any use of pseudo-graphics characters like the boxes I discussed earlier will not necessarily translate in an understandable way. Use of punctuation characters for purposes other than punctuation may not produce a sensible result when translated to braille either; for example, Terraform uses the character sequence <- to represent "read from a data source", but that interpretation relies on the visual similarity to a left-pointing arrow, and the braille equivalents of those two characters would not have the same effect.

Braille displays also cannot translate formatting attributes such as bold and colors.

The upshot of all of this is that we must make some tough compromises when deciding how to present output in a terminal-based application. Overuse of pseudo-graphical flourishes and reliance on visual attributes alone to share information will detract from the usability of software in a braille terminal. Braille terminals also lend themselves better to the "wizard-prompt-style" interactivity model than to the full-screen TUI-based model, because they typically have only a small number of active lines and don't have a means to signal when characters have updated elsewhere in the character grid.
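One rough way to check whether output relies too heavily on visual flourishes is to strip them out and see whether what remains still makes sense. The Python sketch below does that for color and style sequences and for the Unicode box-drawing and block-element characters. The without_visual_flourishes function is a hypothetical helper of mine, and it is only an approximation of what a braille or aural rendering would actually present.

```python
import re

# Matches color and style sequences such as ESC[1m or ESC[1;31m.
STYLE_SEQUENCE = re.compile(r"\x1b\[[0-9;]*m")
# The Unicode "Box Drawing" and "Block Elements" ranges.
PSEUDO_GRAPHICS = re.compile(r"[\u2500-\u259f]")


def without_visual_flourishes(text):
    """Remove formatting and pseudo-graphics, leaving only the plain text."""
    text = STYLE_SEQUENCE.sub("", text)
    text = PSEUDO_GRAPHICS.sub("", text)
    return "\n".join(line.strip() for line in text.splitlines())


sample = "\x1b[1m\x1b[31mError: \x1b[0mSpline reticulation failed\n│ see details below │"
print(without_visual_flourishes(sample))
```

If the stripped version loses information, such as a status that was conveyed only by color, that's a sign the original output needs words to carry the same meaning.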

(I do want to be clear here that not everyone who needs a non-visual presentation has access to a braille terminal. Aural presentation is also very common and we need to be cognizant of both.)

Fake "terminals" on the web

Finally I want to touch on a rather inconvenient truth for modern terminal-based software: lots of users access it only indirectly via web applications.

Terraform Cloud was historically an example of this, giving users a remote environment in which to run Terraform CLI and with the CLI output included in a box inside one of the pages of the web application. (Modern Terraform Cloud has a first-class web UI for most cases, so this is not so visible as it once was.)

These web-based views of terminal-based software bring with them some new challenges. Firstly, they are often totally non-interactive, and so although the underlying software can render into a terminal-style character grid there is no means to prompt for additional input within the "faked" terminal inside the web application.

They are also typically considerably more limited in which control sequences they will accept, possibly only supporting bold, underline, and a small set of colors.
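To illustrate how limited such a web-based renderer can be, here is a Python sketch that translates only bold, underline, and three colors into HTML and silently ignores every other style code. The sgr_to_html function and the CSS class names are hypothetical, and this is not how Terraform Cloud or any particular product implements it.

```python
import html
import re

STYLE_SEQUENCE = re.compile(r"\x1b\[([0-9;]*)m")
COLORS = {"31": "red", "32": "green", "33": "yellow"}


def _wrap(chunk, styles, color):
    """Wrap escaped text in a span if any style is active."""
    classes = sorted(styles) + ([color] if color else [])
    escaped = html.escape(chunk)
    if not classes:
        return escaped
    return '<span class="{}">{}</span>'.format(" ".join(classes), escaped)


def sgr_to_html(text):
    """Convert a small subset of terminal styling into HTML spans."""
    parts, styles, color, pos = [], set(), None, 0
    for match in STYLE_SEQUENCE.finditer(text):
        if match.start() > pos:
            parts.append(_wrap(text[pos:match.start()], styles, color))
        pos = match.end()
        for code in match.group(1).split(";"):
            if code in ("", "0"):      # reset everything
                styles, color = set(), None
            elif code == "1":
                styles.add("bold")
            elif code == "4":
                styles.add("underline")
            elif code in COLORS:
                color = COLORS[code]
            # Any other code is outside the supported subset and is ignored.
    if pos < len(text):
        parts.append(_wrap(text[pos:], styles, color))
    return "".join(parts)


print(sgr_to_html("\x1b[1;31mError: \x1b[0ma < b"))
```

Because the output uses class names rather than fixed colors, the surrounding web application's stylesheet gets to decide what "red" actually looks like, much as a terminal's theme does.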

These limitations aren't necessarily true in all cases -- there are full-blown virtual terminals implemented with JavaScript, HTML, and CSS -- but I'm considering here the common case of software like release automation and Terraform Cloud.

However, a potential advantage is that the author of the web application can use web technologies such as CSS and web fonts to tailor the conventions used in the visual representation of the terminal to better suit the branding of the application it's embedded in. The terminal-like examples I have on this page, for example, have color choices intended to match nicely with the color scheme of this site, although of course it's a matter of opinion whether I (as a software engineer, not a visual designer) actually met that goal!

Conclusion

I hope this article has given a sense of the capabilities and limitations of presenting output from software designed to run in typical virtual terminals shipped with modern operating systems.

I've skipped a lot of details because this article is pretty long as it is, and I'm sure some readers are grimacing at some generalizations I made and details/complexities I ignored, but my main goal here was to give an overview that might be useful to someone who is accustomed to designing for the web but who is now intending to design for a CLI-based application.

If you want to see all of the gory details of how virtual terminal control codes work, etc, this is certainly not the article for that but you should be able to find some references to the details by searching online for queries like "vt100 control sequences", or "terminal escape codes". You may also be interested in the ANSI standard for control codes, which is approximately a subset of what different terminals have supported since the era of hardware video terminals.