Anatomy of a Terminal Emulator (poor.dev)
304 points by imsnif on Nov 2, 2021 | 70 comments


Casey Muratori's refterm video series is a great example of how to do application design. It focuses on writing a terminal emulator for Windows, but almost all the concepts apply to nixes as well -- even down to the level of OS primitives like memory mappings to create a ring buffer, etc. He builds a nearly complete terminal that appears to better support unicode and escape sequences than windows-terminal, in drastically less code and with literally orders of magnitude better performance.

In particular, I like his focus on experimentation to figure out a reasonable upper bound on performance (so you can measure success/failure), "non-pessimisation" (don't do things extrinsic to the actual task), understanding the actual machine and its capabilities (he doesn't use the term, but sometimes called "mechanical sympathy"), and tactics for isolating "bad code" that you probably have to use (at least to get started).

https://www.youtube.com/watch?v=pgoetgxecw8


Nice recommendation, that was an enjoyable watch.

I like the conceptual separation of the two (real) optimization techniques into: "hardcore optimization" (my own term) which is intensive, careful measurement and fine-tuning; and this idea of "non-pessimization" which is the principle of doing the least possible amount of work while delivering the required features, analyzed from a "basic algorithms" and "fuzzy mechanical sympathy" perspective. "Hardcore optimization" is inordinately expensive so can only be used sparingly, but "non-pessimization" is much easier and should be used throughout your codebase. His argument is that just applying non-pessimization alone represents a 100x-1000x speedup compared to typical code. And in particular if a codebase is pessimized, it's largely pointless to do hardcore optimization because every time a new section is optimized it just reveals countless other sections that are hopelessly unoptimized and become the new bottleneck. I.e. non-pessimization is a prerequisite to effective hardcore optimization.

I also like the idea of isolating "bad code" as he puts it, but I feel that term is a bit uncharitable and implies an unnecessarily narrow scope. I don't think "bad" in the sense of a general value judgement is the operative term here, and it would be better stated more flatly as "slow code" or maybe "expensive code" because it would apply equally well and to more situations. For example we'd still like to cache the output of a library that is written as well as could be expected for what it does but is still too expensive compared to our required throughput.

Caching with a key derived from hash(concat([input bytes],[output byte #])) is a pretty clever idea to multiplex the hash table, but I don't really love that it has to hash the input bytes multiple times, and it doesn't solve the problem of figuring out how many output glyphs it should look up. I suspect that adding an output length parameter to the cache entry, and using a glyph cache that supports outputting multiple glyph sizes (even as separate arrays of power-of-2 sizes, 1,2,4,8,16) would have been less pessimized. :)


Agree on all counts except I believe the hash operates on the input bytes only once, along the lines of:

    HashValue = HashRound(HashRound(NothingUpSleeveSeed, BytesCount), Bytes)

Then computing the other hashes just uses one more round, HashRound(HashValue, Index), for each Index. The rounds are just aesdec instructions.
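
To make the shape of that concrete, here is a rough Python sketch of the idea: hash the input once, then derive one key per output index with a single extra round. It is not refterm's code. BLAKE2b stands in for the aesdec-based round, and `SEED`, `hash_round`, `base_hash` and `glyph_key` are names made up for illustration.

```python
import hashlib

# Hypothetical seed value; refterm's actual seed is not reproduced here.
SEED = b"nothing-up-my-sleeve"


def hash_round(state: bytes, data: bytes) -> bytes:
    """One mixing round. Stand-in for an aesdec round, using keyed BLAKE2b."""
    return hashlib.blake2b(data, key=state, digest_size=16).digest()


def base_hash(input_bytes: bytes) -> bytes:
    """Mix in the byte count first, then the bytes. The input is read once."""
    count = len(input_bytes).to_bytes(8, "little")
    return hash_round(hash_round(SEED, count), input_bytes)


def glyph_key(base: bytes, index: int) -> bytes:
    """One extra round per output glyph index; the input is not rehashed."""
    return hash_round(base, index.to_bytes(4, "little"))


base = base_hash("e\u0301".encode("utf-8"))  # "e" plus a combining accent
keys = [glyph_key(base, i) for i in range(3)]
```

Each entry in `keys` could then index the same hash table, which is how a single table gets multiplexed across output positions.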

He calls this “pelican hashing” which is presumably some reference to the hashing literature, I’m not familiar.


Ah, that's not so bad I guess. Though I'm still curious how it would hold up to a multiply-sized texture cache.


While I was an economics student, I somehow landed a job to build out an elevator monitoring system at my college. Figured out that terminal emulators could connect to the elevator controller and that the output spec was called Wyse-60. My first "professional" program was written in Python using ncurses to parse the serial output [0].

The program is very cringe but it was my best attempt at the time as I couldn't find any literature on terminal emulators (what actually happened was that I didn't know what to search for)

Anyways, I've successfully pivoted my career away from economics and into software development which was the goal when I took the job in college.

[0]: https://github.com/lee-pham/Pyse-60


I used to use a white phosphor Wyse 60 as my daily driver. It was connected to a DG AViiON running DG/UX - most of the tools were GNU.

Those were the days!


I recommend going through st source: https://git.suckless.org/st/file/st.c.html

I may have fallen out of love with suckless.org, but the code is usually simple so one can at least learn from it.


Why have you fallen out of love with suckless? I recently switched from dwm back to bspwm and I was also looking for a st replacement, but couldn't find one. Which is ok, but were you able to find something "better" in the terminal emulator or window manager space?


I wanted/tried to be a part of the community. I could say I was not a cultural fit.

As much as I sympathize with cutting things down and making frugal software, I am too messy to make it all work to my advantage. I now prefer to mix and match and not attach myself to any group. But I am fond of the actual code. It is refreshingly simple and easy to track. Of all the projects, I used dwm the most and I still think of going back to it. Though currently I'm getting tired of software altogether.


The fact that they're frickin' Nazis might have something to do with it: https://mobile.twitter.com/kuschku/status/115648842041336217...


I'm just a dumb American, but are torch hikes something only Nazis do? I've never heard of them.


Why not ask the local Ku Klux Klan chapter in your neighbourhood? They have a "fine" tradition of conducting such hikes.

Most recent example - Charlottesville?


It's not just the torch hikes, though those are strongly associated with Nazis, neo-Nazis, and American white supremacists.

It's also the naming of their mail server after Hitler's hidden base in Poland, and the railing against "cultural marxism" (itself a Nazi trope). Taken together, these things should register as "yep, probably a Nazi" even if one of them alone wouldn't.


> are torch hikes something only Nazis do?

No. Here in France they are sometimes organized as local events, sometimes they're used to --- surprise --- protest. We also have "descente aux flambeaux" (downhill skiing with torches.)


I recommend it as well. It's the simplest terminal emulator I've found, other projects I've explored were a lot more complex.


Why do I have to click a link to read the article? I don't understand why it's a thing to add a "fold" to a web article.

More on topic, this does seem like a decent overview.


Sorry about that! The SVGs are a bit on the heavy side and the various social preview crawlers didn't appreciate it.


I put my heavy blog content on BunnyCDN. It’s impossibly cheap. I’m happy to pay a few quarters if a blog post happens to blow up.


Hey, I really like the animated diagrams in your posts. Could you share a few tips about how you create them?


Thanks! I use Inkscape to create the SVGs and then animate them with javascript and greensock. I still haven't quite found the right layer of abstraction for this, so it is a tedious process that takes a lot of time. Once I do, I might release a tool that will help with it.


Ahh so it was an informed choice. I understand. But Hacker News doesn't do previews anyway :)


Would a `<details>` block also work?


I'm not sure. Wouldn't the webcrawler still have to load it?


Not a design/advertising guy, but often times I think it's to be able to display an ad at the bottom without the user having to read the full article, and without putting it ahead of the article (which is ugly IMO).

Here though I don't think it's the case.


There are no ads on my blog and will never be. That's a promise :)


That is why you are a poor dev :-)


Ads on dev blogs pay peanuts, if that. It's a way to stay poor.


Yeah a direct link would have been better: https://www.poor.dev/blog/terminal-anatomy/


For those interested in learning more about terminal emulators and character graphics, check out Nick Black's book Hacking the Planet (with Notcurses): A Guide to TUIs and Character Graphics[1].

While the overall focus of the book is on programming with Notcurses[2], the author shares a wealth of related info and history throughout its pages.

[1] https://nick-black.com/htp-notcurses.pdf

[2] https://github.com/dankamongmen/notcurses#readme


One thing I've been wondering for a long time (and my Google-fu is apparently too weak to find on my own): how do certain console applications change the output color without inserting ANSI escape sequences into the output stream? The specific case I have in mind is when writing console programs in C#/.NET, and using the System.Console.ForegroundColor property to vary output colors on the fly. The resulting output text does not have ANSI escape characters in it, yet the colors are displayed properly in the terminal.


On Linux, .NET does write ANSI escape sequences to stdout for you. ConsolePal.Unix.cs (Pal in this context referring to 'platform abstraction layer') does the work for you behind the scenes: https://github.com/dotnet/runtime/blob/10eb1a9ff1c09de4a2a1f...


Is this on Windows? If so Windows provides a typical Win32 type of HANDLE API for the console which can be used to change the properties of the console among other things.


To clarify: .NET being cross platform, I write console apps that I run on both Windows and MacOS, and in both cases if I redirect the color output to a .txt file and open it in an editor, there are no ANSI escape sequences in the file. So I was wondering if there is some other standard for sending control codes to a terminal that is independent of the output stream.

Or is there some voodoo by which the ANSI sequences get stripped from the output when redirected to a file?


> Or is there some voodoo by which the ANSI sequences get stripped from the output when redirected to a file?

Not quite, but programs in general use some voodoo to detect when they are being redirected and won't output ANSI codes when they detect that.

Often there will be a flag to enable/disable color, or let it detect when color is desired. On Linux ls accepts the --color=WHEN parameter, where WHEN can be always (`ls --color=always > ls-with-ansi-codes.txt` will output color codes), it can be never (just don't output ANSI codes, no matter whether you detect output redirection) or it can be auto (will show colors when you execute `ls --color=auto` but not when you do `ls --color=auto > ls-without-ansi-codes.txt`).


On windows, there are APIs for setting color that could be ignored if STDOUT isn't a terminal.

On posix platforms, there's an API to check if a given file handle is a tty or not. https://github.com/corasaurus-hex/isatty/blob/main/isatty.c is an example of using the API, in a Janet context.

I assume that .NET's support for Linux/macOS is using that API to decide if it should strip color codes or not.
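
For anyone who wants to poke at this, a minimal Python sketch of the same tty check follows. It is an illustration of the general technique, not what .NET does internally. The `always`/`never`/`auto` modes mirror the `ls --color=WHEN` behaviour described upthread, and `should_color` and `colorize` are hypothetical helper names.

```python
import io
import sys

RED = "\x1b[31m"   # SGR escape sequence: red foreground
RESET = "\x1b[0m"  # SGR escape sequence: reset all attributes


def should_color(stream, when: str = "auto") -> bool:
    """Decide whether to emit escape sequences for this output stream."""
    if when == "always":
        return True
    if when == "never":
        return False
    # "auto": only colour when the stream is attached to a terminal.
    return stream.isatty()


def colorize(text: str, stream, when: str = "auto") -> str:
    """Wrap text in colour codes only if the stream should receive them."""
    if should_color(stream, when):
        return f"{RED}{text}{RESET}"
    return text


# An in-memory buffer stands in for output redirected to a file.
redirected = io.StringIO()
redirected.write(colorize("error", redirected))

# On a real terminal this prints in red; piped to a file it prints plain text.
print(colorize("error", sys.stdout))
```

Run it once directly and once with `> out.txt`, then look at the file in a hex editor: in `auto` mode the escape bytes should be absent from the redirected output.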


Well yes, like I said, on Windows you can access the terminal and set its properties without encoding terminal escape sequences in the output stream. On Linux afaik escape sequences are the only way. On MacOS I have no idea. Generally speaking the escape sequences aren't "standard" but different terminals (and these days terminal emulators) all have their own codes. Hence ncurses which tries to encapsulate this and provide a uniform API so that the application doesn't have to deal with these murky details.

Have you double checked with a hex editor what is in the txt file? I suppose it could also be your text editor that doesn't want to render those codes.


Are you manually inserting the ANSI sequences into the stream? Or are you using a library that does this for you?

Often times libraries that let you specify output color will helpfully query the capabilities of the output device, and avoid writing the control codes if it’s (for example) a plain text file.


They are sometimes stripped when redirecting. Maybe try executing it headless and reading the output?


Personally I'm not a C#/.NET developer so wouldn't know where to look for the source code, but to check you can run the program in the examples of the post with the compiled binary on the other side (in place of the SHELL) and see what output you get.

I'm 99% sure it's inserting ANSI escape codes (I'm maintaining a terminal emulator myself, and really that's how everything works), but I could of course be wrong.


The Windows console is a bit more sophisticated than the traditional *nix "shuffle text back and forth" approach; the Windows Command Line blog goes into some detail here (and in the rest of this series): https://devblogs.microsoft.com/commandline/windows-command-l...

Short version: there is support for ANSI and VT sequences in the Windows console (which relatively recently got substantially expanded), but that's not what it speaks "natively"-- for Win32 console applications, there's a native console API that works by passing IOCTLs back and forth between the app and console driver.

(If you're writing a terminal emulator for Windows, you don't necessarily see this-- the new ConPTY mechanism you use to build these as of Windows 10 abstracts away the Windows specifics so you see text and VT sequences just as you would on *nix.)


WAY back in the DOS days Qbasic and command.com could both change foreground and background colours via system hooks (INT bios calls) without using ANSI. I know this because Qbasic console apps could be colourful without having to load ansi.sys in config.sys

I am thinking that .foregroundColor and .backgroundColor do the same thing via legacy emulation in conhost.


Most likely it was writing text and attribute data directly to video memory. You didn't even need to call into the BIOS to do that shit (though the BIOS may have helped from a convenience standpoint to set the video mode).


Turbo Pascal had a Crt library that did this.


True, but you could also get the address of the screen buffer and define a structured type to overlay it that allowed you to put values directly into memory.

Each screen location was two bytes, one byte for the character value, and one for the character attributes which were a set of bits that controlled red, green, blue and intensity for both the foreground color and background color.

Then you could write routines that would fill in a rectangular region with a color, scroll the text of a region up or down, and do all sorts of other windowy things. There were also interrupt routines you could call that did some of these, and certainly routines in Crt that did some of this. Then you were on your way to developing your own TUI library!
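
For the curious, here is a small Python sketch of that layout, with a `bytearray` standing in for video memory (there is no real hardware access here). It assumes the usual 80x25 text mode with the character byte first and the attribute byte second, foreground in the low nibble and background in the high nibble. The function names are made up for illustration.

```python
COLS, ROWS = 80, 25  # assumed standard text mode dimensions


def make_attr(fg: int, bg: int) -> int:
    """Pack an attribute byte: low nibble foreground, high nibble background.

    Within each nibble, bits 0-2 are blue, green, red and bit 3 is intensity.
    """
    return ((bg & 0x0F) << 4) | (fg & 0x0F)


def put_cell(buf: bytearray, col: int, row: int, ch: str, attr: int) -> None:
    """Write one two-byte cell: character value, then attribute."""
    offset = (row * COLS + col) * 2
    buf[offset] = ord(ch) & 0xFF
    buf[offset + 1] = attr


def fill_rect(buf: bytearray, left: int, top: int,
              width: int, height: int, ch: str, attr: int) -> None:
    """Fill a rectangular region with one character and colour."""
    for row in range(top, top + height):
        for col in range(left, left + width):
            put_cell(buf, col, row, ch, attr)


screen = bytearray(COLS * ROWS * 2)  # simulated screen buffer
white_on_blue = make_attr(fg=15, bg=1)
fill_rect(screen, left=10, top=5, width=20, height=3, ch=" ", attr=white_on_blue)
put_cell(screen, 10, 5, "A", white_on_blue)
```

Scrolling a region is then just a matter of copying slices of the buffer up or down a row at a time.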

Alternatively, you would issue an interrupt call to put the screen into 320x200 256 color mode, get the address of that buffer ($B800 if memory serves), similarly overlay it with a typed grid, then start poking byte values in and getting all sorts of nice colors out of it. Super fun!!


Maybe I'm biased, but I think learning to code in that environment was way better than today's Javascript-based in-browser graphics environment. I didn't know a damn thing about coding, but I was still able to make a rudimentary GUI paint program in one semester of high school.

I looked recently to see if an equivalent of Crt is still out there, but it doesn't look like any of the modern Pascal versions support it.


A000:0000 was the VGA graphics buffer. B800:0000 is the text mode buffer though. (Including the segment and offset since that works out to A0000 I believe.)
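
The arithmetic, as a quick Python sketch (real-mode addressing shifts the segment left by four bits and adds the offset; `linear_address` is just an illustrative name):

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode 8086 addressing: physical address = segment * 16 + offset."""
    return (segment << 4) + offset


print(hex(linear_address(0xA000, 0x0000)))  # 0xa0000, the graphics buffer
print(hex(linear_address(0xB800, 0x0000)))  # 0xb8000, the colour text buffer
```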


Wow, I got the right offset for the other mode! Not bad for a 30-year impressionistic memory!


This comment wins.


Great article. Terminal emulators / shells are an area where we haven't seen much improvement for years. I can't help but think there could be a much nicer command-line-esque interface other than a classic pseudoterminal and shell. At the same time, I doubt any improvements could really be large enough to gain adoption.


There have been many attempts. Sixel and ReGIS are among the oldest attempts at augmenting terminals with graphics. But most modern apps that run in terminals don't even push the limits of what terminals can do, and that makes it kind-of hard to justify putting effort into adding more capabilities to terminals.

Part of the challenge is to find capabilities to add that are sufficiently simple and compelling to use in command line apps vs. as a web app or native GUI app that'd be more widely accessible than a new terminal capability.


IMO the issue is not the capabilities of the terminal, but rather the culture around it. It is very often seen as a "leet" platform for hackers and advanced users. Its users looked at with respect and envy. Which is unhealthy for the kind of adoption one would wish for in their apps.

Really, I think this mostly requires imagination from those of us developing for this platform. We have all the capabilities we need. We just need to create apps that don't require you to look up cheat sheets and man pages to use, and provide powerful alternatives to existing GUIs. There's a bit of a renaissance in this area as of late, but I feel this is just the beginning.


I think we're looking for very different things.

I use my own text editor. I expect to remain its only user, and that's fine. It's not intentional. It just isn't a priority for me to make it usable for anyone else (though I do separate out parts of it into libraries / gems as functionality matures).

As such, I'm looking for improved terminals not to make things that don't require cheat sheets (because to use my editor you'd have to be comfortable with reading the source). It's fine that others want other things, but I think this is part of the challenge with improving terminals: A lot of the user base are advanced users who want to improve their own workflows, not write something user friendly, and who want tools that are easy to combine and chain, where the priority is not something user friendly.

The balance is fine in that if you're looking for something particularly user friendly in the sense of friendly to less advanced users, targeting a GUI framework or the web is very quickly going to provide a better experience.

EDIT: I saw zellij mentioned in a sibling, and looked at it, and incidentally it's quite similar to my setup, except I use bspwm and scripts to manipulate the tiling so that e.g. doing a vertical or horizontal split in my editor results in a new editor instance in a new tiled window rather than in a new pane in a single terminal. That way I can treat the few GUI apps I use exactly the same way (mostly Chrome). I'd love it if there was a standard API to split panes and start apps in a given pane, so both terminal multiplexers and tiling wms could be targeted with a single standard mechanism.


I agree that's how things are today. I get where you're coming from and I'm certainly not trying to take this away from you.

My experience has taught me that tools become better the more people use them and are able to participate in their creation and maintenance. The more user friendly a tool is, the more users you'll have. Personally, I think a tool can be very user friendly and still powerful enough for advanced users.

I used a very similar setup to yours before I started Zellij! The reason I started the project was in order to be able to formalize such a setup (for me it was implemented as a soup of bash scripts which I dreaded moving to another machine, not to mention handing to another user).

One of the things we want to do with Zellij is port it to the web and maybe even in the future to use it as a sort of "backend" to power tiling window managers. I'd be totally open to do this in a standardized way if maintainers of other tiling managers are game.


> My experience has taught me that tools become better the more people use them and are able to participate in their creation and maintenance. The more user friendly a tool is, the more users you'll have. Personally, I think a tool can be very user friendly and still powerful enough for advanced users.

I've taken the approach that rather than try to shoehorn my use into a full app, I aim for my text editor to be as small as possible, by reusing external tools whenever possible, as I do agree with you that it's worthwhile to have as much as possible of the code used by other people. But at the same time I realised there are editors smaller than my old Emacs config. So e.g. my editor relies on bspwm to split panes, and on rofi (or anything that can take a list of things to choose from and return the chosen thing) to select files or themes or buffers, Rouge for syntax highlighting, and anything that I can make generic enough I'm splitting into gems (the editor itself is written in Ruby). The way I see it, I want the editor to be a tiny little core that's mostly configuring other components. Currently it's about ~2.6kloc, but much of that is code that can be split out or will disappear as I clean some things up. I don't want it to get much bigger than that - preferably it'll get smaller.

> I used a very similar setup to yours before I started Zellij! The reason I started the project was in order to be able to formalize such a setup (for me it was implemented as a soup of bash scripts which I dreaded moving to another machine, not to mention handing to another user).

The big limiting factor for me with something like Zellij over my current setup would be having it work alongside e.g. Chrome and the occasional other gui app.

> One of the things we want to do with Zellij is port it to the web and maybe even in the future to use it as a sort of "backend" to power tiling window managers. I'd be totally open to do this in a standardized way if maintainers of other tiling managers are game.

If you provide a mechanism that allows a client to split panes already, then maybe the easiest starting point is for someone to just pick a config location/format tools can look for a command line to do splits in. E.g. on bspwm, the command given to split horizontally might simply be sh -c 'bspc node -p east ; exec #{cmd}' &. On i3wm, the same would be i3-msg 'split horizontal; exec #{cmd}'. Currently my editor just blindly executes "split-horizontal re --buffer numeric-id-of-the-buffer", and I have a "split-horizontal" script in my ~/bin. It'd be trivial enough to have it read a config file to find out what command to execute instead.
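
A rough Python sketch of what that lookup could look like (my editor is Ruby, so this is only an illustration). The two templates are the commands above with `{cmd}` as the placeholder; the table itself, its keys, and `split_command` are hypothetical, and a real version would read the table from a config file and handle shell quoting.

```python
# Hypothetical per-window-manager templates for "split horizontally, then run cmd".
SPLIT_HORIZONTAL = {
    "bspwm": "sh -c 'bspc node -p east ; exec {cmd}' &",
    "i3": "i3-msg 'split horizontal; exec {cmd}'",
}


def split_command(wm: str, cmd: str) -> str:
    """Build the shell command line for a horizontal split under the given WM.

    No quoting or escaping is done here; cmd is substituted verbatim.
    """
    return SPLIT_HORIZONTAL[wm].format(cmd=cmd)


print(split_command("i3", "re --buffer 42"))
print(split_command("bspwm", "re --buffer 42"))
```

A terminal multiplexer could add its own entry to the same table, which is all the "standard" would really need to be.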


Could not agree more with this. Thanks for the interesting article and your work on zellij.


Can terminals even be improved without breaking everything? Even maintenance work and fixing bugs has historically caused problems:

https://lwn.net/Articles/343828/


I think that for sure something like the Plan 9 shell would be something cooler to have


As in graphics in the terminal? The way that works is through the draw(3) kernel device which is a 2d engine with an rpc interface. You load text and bitmaps into the draw device and then issue rendering commands. The kernel terminal device, cons(3), is where the terminal text is written to, and cons sends that to draw. When you start a graphical program, it overwrites the window graphics from cons(3) until the graphical program exits or the window is deleted. There is no in-band cursor control in Plan 9 as it is a graphics oriented OS.

So it's not just porting a terminal. It's the entire OS. Of course there is p9p, plan 9 port, which is a port of the core plan 9 user space tools to Unix systems. It does offer a draw server that can be mounted.


A lot of terminals support at least somewhat similar functionality via Sixel (bitmaps) and ReGIS (vector graphics). It's limited and certainly could be improved a lot, though.
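
To give a feel for how in-band Sixel is, here is a minimal Python sketch that builds the escape sequence for a solid band six pixels tall. It only constructs the string. Whether anything is drawn depends on the terminal having Sixel support enabled, and `sixel_char` and `solid_band` are illustrative names.

```python
ESC = "\x1b"


def sixel_char(column_bits: int) -> str:
    """Encode one 6-pixel-tall column (bit 0 = top pixel) as a sixel character."""
    return chr(0x3F + (column_bits & 0x3F))


def solid_band(width: int, r: int, g: int, b: int) -> str:
    """Build a Sixel sequence for a width x 6 pixel band in a single colour.

    Colour components are percentages (0-100), as the format expects.
    """
    start = f"{ESC}Pq"                         # DCS introducer, 'q' selects sixel mode
    colour = f"#0;2;{r};{g};{b}#0"             # define register 0 as RGB, then select it
    data = f"!{width}{sixel_char(0b111111)}"   # '!' repeats the next character width times
    end = f"{ESC}\\"                           # ST, the string terminator
    return start + colour + data + end


band = solid_band(40, 100, 0, 0)
```

Writing `band` to stdout on a Sixel-capable terminal should paint a short red strip at the cursor; on anything else it will be ignored or show up as garbage.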


I totally agree. I'm actively working in that direction.


TempleOS comes to mind.


Very well written article, thank you.

Especially when my daughter is taking a course on Unix and the instructor is superficially touching on these fundamental topics (due to their own limited understanding), I am going to use this to teach the underlying concepts.


Well done guide! Was visually enjoyable to read/watch as well as being well-written.


I realize naming things is hard, but they confuse me. Are these due to historical quirks, or am I missing something?

E.g., why is it called a terminal emulator and not a terminal user interface? Why is there a "y" in pty if it stands for pseudo terminal? Why is it called a "pseudo terminal" if it's acting like a communication pipe?


Expanding on what others have written:

- The original terminal was a teletypewriter, printing onto paper, connected to a computer, usually through a serial interface. A teletype is abbreviated "TTY", and the original Unix terminal interfaces were given the device file names /dev/tty, /dev/tty1, /dev/tty2, ... Those were the hard-defined terminals. Yes, these were communication pipes, handling stdin (keyboard), and stdout and stderr (the printer).

- As pseudo terminals came to be used, for users connected via "glass TTYs" (CRT terminals such as the venerable VT-100 and VT-200), or by entirely virtualised terminals through remote connections (telnet, rsh) or windowing systems (W and X11), pty came to be used for pseudoterminal. Again, these were communication pipes. I used a variant similar to this VT-320, notable for its amber phosphor display: https://yewtu.be/watch?v=RuZUPpmXfT0

Yes, historical quirks.

You can still hook Unix up to a teletype, by the way:

https://yewtu.be/watch?v=2XLZ4Z8LpEE


Because a terminal (aka tty... or teletype) was a physical piece of hardware that we now emulate in software.

The y in pty is because of tty.


Pty is a historical quirk: it stands for pseudo-tty, and tty is short for teletype. Actually there are a lot of things in old Unix that were influenced by the primary user interaction device being a slow (110 baud) printing terminal with no lowercase.
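
On the "communication pipe" part of the question, a short Python sketch (Unix only) shows the difference: a pty is a pair of connected file descriptors, but one end answers "yes" when a program asks whether it is a terminal. It uses the standard `pty` and `os` modules; the variable names are just for illustration.

```python
import os
import pty

# Ask the kernel for a pseudo-terminal pair: two connected file descriptors.
master_fd, slave_fd = pty.openpty()

# The slave end looks like a terminal device to whatever program holds it.
slave_is_tty = os.isatty(slave_fd)
slave_name = os.ttyname(slave_fd)  # a device path such as /dev/pts/N on Linux

# The master end is where a terminal emulator would read and write.
os.write(slave_fd, b"hello\n")
received = os.read(master_fd, 1024)  # the line discipline may rewrite the newline

os.close(slave_fd)
os.close(master_fd)

print(slave_is_tty, slave_name, received)
```

A plain pipe would carry the same bytes, but programs on the other end would see a non-terminal and behave accordingly (no colours, no line editing, different buffering).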


I absolutely love terminals. But, just about the time I could have learned curses, I switched to GUIs. Now I've switched back and use curses to make UIs (so I can ssh into remote computers with low bandwidth and CPU). It's so funny how terminals are controlled and all the edge cases of each implementation.


Interesting article. It's too late now for me to focus on it but I'll check it out in the morning! :D Thanks for posting.


The domain name is hilarious



