An interesting thesis to be sure, but his Pictionary example highlights where it breaks down. Using a graphical UI is not like drawing a picture of what you want; it's manipulating a pseudo-physical interface using a mouse (for desktops) or your actual fingers (mobile devices and tablets).
In the real world, if you were sitting next to a stereo, would you prefer to instruct the person sitting next to you to load a song and hit play, or would you prefer to just do it yourself? That's the difference between a command line and a GUI: the difference between describing an action and performing it directly.
As for discoverability: saying you need to "know a language" to know that your song is in a folder, and that folders are nested, and that they reside on drives, ignores the fact that you need to know one for a command line too: you need to know the command "play" exists; you need to know that song names should be in double-quotes. You need to know the keyword "random" (rather than "shuffle"), and whether or not it understands commas.
In a GUI you can spot an icon with a musical note on it, and that's your first clue. Then you can poke around and see a round button with a triangle on it, a 40-year-old visual convention. The inherent discoverability of a visual interface is its primary advantage.
I clearly remember 15 years ago, sitting at a freshly-installed Linux command prompt, and being utterly frustrated. I wanted to explore the system, but didn't know the words. How do you discover "ls"? You can't even use "man" (if you knew about that, which I didn't) because you need to know the command in advance.
Discoverability of GUI components is definitely a big advantage of graphical programs. A bit of NLP magic could help. Most CLIs have this at an extremely basic level -- invoke the command with invalid arguments and it will print out a short "help" blurb.
Some tools (e.g. git, sbt) take it a step further and will suggest alternatives based on similar commands. 'git sattus' will ask if you mean 'git status', for example.
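A minimal sketch of that kind of "did you mean" suggestion, in Python. It ranks known commands by similarity using the standard library's difflib. The command list and function names are made up for illustration, and this is not a claim about how git or sbt actually implement it.

```python
import difflib

# Hypothetical command table; a real tool would build this from its own subcommands.
KNOWN_COMMANDS = ["status", "stash", "commit", "checkout", "branch", "merge"]


def suggest(typed, known=KNOWN_COMMANDS, limit=3):
    """Return known commands that look similar to what the user typed, best first."""
    return difflib.get_close_matches(typed, known, n=limit, cutoff=0.6)


def did_you_mean(typed):
    """Build a short hint message for an unrecognized command."""
    matches = suggest(typed)
    if not matches:
        return f"'{typed}' is not a known command."
    return f"'{typed}' is not a known command. Did you mean: {', '.join(matches)}?"


if __name__ == "__main__":
    print(did_you_mean("sattus"))
```

With this table, "sattus" ranks "status" as the closest match. The cutoff of 0.6 is just difflib's default; a real tool would probably tune it so it doesn't suggest nonsense.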
WolframAlpha has a particularly interesting interface where you basically type natural language and it tries to parse it into a formal language. It then shows you the formal input interpretation (in addition to the output). This lets you immediately see if the program interpreted your command correctly, because a person can understand the formal specification of the command very easily. If you, the person, detect a misunderstanding it is then very simple to update your input so it's parsed correctly the next time.
For example, look at these queries. Note the formal input interpretation of the first, and the "closest interpretation" of the second:
Finally, it works. But it highlights the fact that your syntax has to be very specific to be understood by Wolfram. Which makes the whole thing very error-prone and not discoverable.
Good catch! Guess we have a ways to go before this sort of system works. The key is having the formal specification unambiguous. I guess this proved that a human cannot, in fact, parse the formal interpretation easily :)
Identifying common types of errors (e.g. mismatching units) could strengthen the process by showing the closest interpretation and "did you mean"-style suggestions.
and to find out about ls -- apropos "list directory"
This doesn't change the fact that your average shell is not configured for beginners. There should be a banner offering a tutorial or something on first startup. Also, apropos should by default make an AND search instead of OR when given several arguments.
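To make the AND-versus-OR point concrete, here is a rough Python sketch of a keyword search over one-line command summaries. The summary table is a small hand-written stand-in (not read from any real man database), and `search` is a hypothetical helper, not apropos itself.

```python
# Hand-written stand-in for a database of one-line command summaries.
SUMMARIES = {
    "ls": "list directory contents",
    "lsof": "list open files",
    "pwd": "print name of current/working directory",
    "cat": "concatenate files and print on the standard output",
}


def search(terms, summaries=SUMMARIES, require_all=True):
    """Return command names whose summary contains all (AND) or any (OR) of the terms."""
    combine = all if require_all else any
    wanted = [term.lower() for term in terms]
    return sorted(
        name
        for name, text in summaries.items()
        if combine(term in text.lower() for term in wanted)
    )


if __name__ == "__main__":
    print(search(["list", "directory"]))                     # AND search
    print(search(["list", "directory"], require_all=False))  # OR search
```

On this toy table the AND search for "list directory" returns only ls, while the OR search also drags in lsof and pwd, which is exactly the noise a beginner doesn't need.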
And maybe something like the friendly interactive shell should be made the default -- http://fishshell.com/
Yes, but how do I find out about "apropos"? I've been using Linux daily for 16 years now and this is -- really! -- the first time I've heard about this command.
I came to Linux from a DOS background, where I was pretty proficient. My first attempt went down in flames and I didn't come back for years.
Now that I've learned it enough to consider myself 'good' (but not 'expert'), I prefer it. That took a LONG time. Longer than it took for DOS, actually.
Like I said.. the shell should offer a tutorial by default on first login.
And 'help' should probably be a built-in (like it is in fish!) that explains how to navigate the help files, and give a brief introduction to the shell.
The non-obviousness of these commands is not an inherent flaw of the CLI, it's an implementation error.
Even better than a tutorial, have the shell pass any unrecognized commands to apropos or a help program. This would make things much more discoverable.
@tg-login1:~> crap
If 'crap' is not a typo you can run the following command to lookup the package that contains the binary:
command-not-found crap
-bash: crap: command not found
stavros@stavrosvise ~> crap
fish: Unknown command “crap”
No command 'crap' found, did you mean:
Command 'cap' from package 'capistrano' (universe)
Command 'czap' from package 'dvb-apps' (universe)
Command 'crip' from package 'crip' (universe)
Command 'grap' from package 'grap' (universe)
crap: command not found
I think that autocomplete, as known from programming editors (IDEs), would work really well for discoverability in a text-based shell. I don't know if anyone has tried to implement that?
Well, yes, but I was thinking something a bit more verbose and up-front than just the old press-tab-to-autocomplete. Or did you have something else in mind?
The author had already described command line tools in terms of language learning when he mentioned the "language" of graphical tools. His point was "yeah, you have to learn a language to use command-line tools, but you have to learn a different one to use graphical tools."
I like your point about describing versus doing. It seems to capture not only the difference in paradigm, but why you would want to use either one for a given task.
Your radio analogy holds for people who are new to something, but for experts, it's exactly backwards. Using the CLI is often much quicker and easier than the GUI.
Unless you meant that it's easier to ask someone to play a tape for you, if you've never done it for yourself before... And then I guess it holds both ways.
It's certainly easier for experts, but that was not his point. His thesis is that the command-line is really easier for everyone, and I just don't think that holds water.
Studies were conducted (at Apple, I think): experts made more mistakes about vi's current edit state than beginners, who were constantly checking it visually. Other studies checked experts on GUI tasks, and they were making more mistakes with shortcuts, and took longer to correct them, than when using the mouse all the way.
But there was a big problem: when asked, they never recognized it and had to be shown the actual numbers to understand it.
Looks like a confirmation bias all the way down.
Not to mention that a poorly designed GUI can also be frustrating. I couldn't get this working on my iPad when I tried:
http://news.ycombinator.com/item?id=2809451
Nothing was discoverable.
I like the premise of the article, but I feel there is an unspoken assumption made by the author that most people can touch-type or are willing to learn how.
Consider this command:
play 10 random country then "Teenage Dream"
A proficient touch typist can rattle this off in < 10sec but a "hunt and peck" or other ad hoc typist will be essentially trading pecking at the screen with a finger on the mouse to pecking at the keyboard with the same finger.
The larger assumption, I think, is that people will know precisely what they want to do.
How many people sit down to fire up some music and know exactly what they want to listen to?
One could trivially add a "list playlists" concept or "list country" or "browse artists" for the uninspired listener. But how many users prefer text to GUI for that sort of browsing/discovery experience?
To say nothing of the additional inefficiency generated by typing in what is now clearly going to be a total of several commands in many, if not most, cases.
I find the command line superior to the GUI for exactly one thing: firing off specific commands. As soon as you step outside of that, its advantages wane and disadvantages take over.
And it's an unfortunately-common mistake of technical users, to only consider use cases where the user knows precisely what they want to do and executes the process correctly.
Ok, but I don't think typing is the point. What if we could just say it?
Isn't it much more intuitive to say to the computer "play 10 random country" than enunciate a succession of commands / menu items?
We now take all of those menu items for granted, but really, does it make any sense to have to do "File / Open" to play a song or continue to write something we started working on earlier?
In the real world the only things I ever open are books; I would open folders if I still had any, but I certainly never open any "file" (whatever that is). We open things that contain other things that we're interested in, at a conceptual level. A file does not contain anything (bits? characters? sentences?) -- it is the thing itself.
And "save"? Why does this have to be?? In the real world if I leave a paper on my desk I expect to find it in the exact same place and state the next day; but on the computer if I don't save it I lose it. And I need to give it a name, too! All of this is crazy (one way to see how crazy it is, is to try to teach it to old people).
I think iCloud is starting to address all of this. It's about time. But new interfaces and ways to interact with the computer shouldn't be dismissed right away. This guy could be onto something.
> Isn't it much more intuitive to say to the computer "play 10 random country" than enunciate a succession of commands / menu items?
Have you ever tried to interact with one of those "natural language" telephone systems? I loathe the whole experience. Discoverability is either painful (waiting for a read-through of your options) or futile (guessing blindly and not knowing whether it is your word choice or enunciation that is to blame).
> Discoverability is either painful (waiting for a read-through of your options) or futile (guessing blindly and not knowing whether it is your word choice or enunciation that is to blame).
That is due to the limitations of such voice menus being an audio only interface. There is no reason that a computer couldn't list helpful commands, modifiers (switches), and arguments on the screen as a person rambled about the kind of thing(s) they wanted to do. Given fast, context-aware speech recognition, obviously.
"play 10 random country" isn't really intuitive at all. "Play a bunch of country songs" is what someone might actually say, or maybe "I wanna listen to some c&w". The computer's got to be able to understand this, which isn't likely.
What usually puzzles me in all these discussions is the concept of keyboard shortcuts. They're a middle ground between GUI and command line (stretching the definition of middle ground a little). But my personal experience with people like my parents and even sometimes colleagues who use computers more than 3-4 hrs a day as part of their job is that there is a big aversion to keyboard shortcuts. I can't imagine how Ctrl-C and Ctrl-V are in any sense harder than painfully clicking 2-3 times through hierarchical menus, but somehow my anecdotal experience leads me to believe that people just don't value time efficiency in a way power users do (or maybe that can be used to define power users).
Maybe understanding why keyboard shortcuts are used so little might help us understand why the command line has lesser adoption among the non-power users.
> somehow my anecdotal experience leads me to believe that [some] people just don't value time efficiency
If you stop an average (and I mean genuinely average) person on the street, they probably won't even understand 'time efficiency' unless you lead them. If you stop an above average person, they will understand the idea, but admit that they don't care about it. What they care about is 'mental efficiency' (i.e. "I don't mind if it takes 3 times as long. I don't want to have to think").
If you encounter a significantly above average person in the work place, and try to point out how they could do something twice as quickly, they will generally respond "Just let me get it done".
(Bear in mind that in terms of computer use, the average person only just owns a computer, and anyone reading this comment is easily in the top 2%).
Keyboard shortcuts don't usually offer much feedback that you're on the right track. If you make a typo while using them they can become very confusing.
Ctrl-C and Ctrl-V are actually good examples. With no feedback on Ctrl-C at all, you can often mis-type Ctrl-V instead and have no idea until you try to paste later. Weirdly, some programs don't support those shortcuts, which leads to similar problems.
I think that sort of faceless failure makes many users feel really, really dumb and out of control. I suspect they prefer menus because they can see when they're getting things right and when they're goofing up.
It's not insoluble, but no system I'm familiar with makes it a priority, either.
The problem with keyboard shortcuts is you're still wedded to the GUI. Instead of being able to type 'Play "Teenage Dream"' you have to navigate down a GUI tree, something like File-->Play-->Album-->Teenage Dream-->Teenage Dream. For me the time is in reading the GUI context at each level, so it's just as fast to use the mouse.
My grandma struggles to use a computer. She has no real ability to scan a UI or intuit where she needs to click. She rarely remembers even basic operations, so she writes them down in a notepad, step by step:
1) Click on "File" in top left of screen
2) Click on "Print" in drop-down menu
3) On new screen, click little up arrow next to right hand number for more copies
4) Click "Print" button in bottom right of new screen
If she is unsure how to proceed or accidentally clicks the wrong thing, she has essentially no ability to recover; her usual solution is to restart the computer and begin from scratch. Even after years of using a computer, she has no sense of what GUI metaphors mean and relies upon memorising a series of incantations by rote.
Unsurprisingly, she finds her computer to be frustrating, opaque and a little bit frightening. I have absolutely no doubt that she would be happier and more productive in a suitable CLI environment. I don't think she is alone in this and in many respects I feel that WordPerfect 5.1 was a high-water mark in computer usability. A modern GUI simply presents too much information for someone with a slow OODA loop to process. In a GUI, all actions are mutable and a user must constantly be aware of what mode they are in, what window is active, what menu is selected and so on. A CLI requires no situational awareness beyond the current working directory.
I think it's easy for mid-level users to forget the amount of accumulated knowledge they have, how honed their intuition is, how well they understand the implicit grammar of interfaces. When we view the world through the eyes of someone with no cognitive model of how a computer works, we can see that uttering incantations into a CLI is far more straightforward than trying to navigate the labyrinths of modern GUIs.
Context-sensitive tab-completion addresses this to a degree. Things like Emacs' ido and dmenu can often take you the rest of the way. I play my music using a script rather similar to what he's described in the article except it narrows down the list of completions as I type using dmenu: https://github.com/technomancy/dotfiles/blob/master/bin/musi...
I really wish this style of completion were more common.
> “Yeah, but really complicated things are going to require searching through manuals.” Maybe, but you already do that in the GUI, and very often you have to do that to accomplish some of the simpler tasks above.
I think this guy is just arguing for the sake of hits to his blog...
(1.) How many power users check a manual to do advanced tasks on a GUI? I bet most of you use 'intuition'.
(2.) How often does your mum check a manual ...ever. How long does she wish to spend to understand how to work it?
(3.) When you're explaining over the phone how she can get the program to work do you wish to be explaining a list of commands that need to be typed exactly and in order: almost like teaching your mum how to code? Do you...
> (3.) When you're explaining over the phone how she can get the program to work do you wish to be explaining a list of commands that need to be typed exactly and in order: almost like teaching your mum how to code? Do you...
The only reason a GUI is better for verbal communication is because most users already have some familiarity with the idioms. Trying to explain what to click on when a user has never seen that control before can be very hard/frustrating.
For written communication, CLI is much, much easier to deal with (you can just copy paste the command and expected output). Trying to convey GUI steps usually involves lots of screenshots and some work in Paint to highlight the important controls. I recently walked (via email) a friend who had never used a CLI before through restarting a server on a Linux box using the terminal. Doing the same with a GUI would have been a lot more painful on my end (so many screenshots) and that's assuming she is familiar enough with Linux's GUI idioms to not get confused by differences in basic controls.
Human brains work well when manipulating objects as they are being interpreted. A graphical UI is an extension of reality. We manipulate the UI until it reaches a desired goal. This is somewhat instinctive, while remembering commands is not as they must be learned before going further.
UIs are not the same in this way. A UI can be manipulated before having any knowledge of what it does, eventually giving an impression of its use.
Playlists, or media controls are easy. But what if you wanted to navigate a spreadsheet? Do I instinctively know that I have to type spreadsheet move to D:3? Not likely.
Clicking, dragging or scrolling, on the other hand, is much closer to how humans interact with reality. This is true with early learning as well. Children are encouraged to learn shapes, colours and dimensions first, as that is much easier than demanding that they know exactly what shapes, colours and dimensions are useful for in everyday life.
I really think the two paradigms can be combined: you can have a command line which makes suggestions as to what you mean (like tab completion), but also just takes what you've typed and works out what to display in response, even before you've hit enter. For instance, typing "play" may as well make a few suggestions as to what to type next, but also present a list of your favourite artists, so you can just click on one, or you can carry on typing.
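Here's a small Python sketch of the "narrow the suggestions as you type" part of that idea. It only does the filtering, with no actual terminal or GUI around it. The artist list and function names are invented for the example.

```python
# Hypothetical music library; a real player would read this from its database.
ARTISTS = ["Katy Perry", "Johnny Cash", "Patsy Cline", "Dolly Parton"]


def narrow(typed, candidates):
    """Return candidates matching the text typed so far, prefix matches first."""
    needle = typed.lower()
    prefix = [c for c in candidates if c.lower().startswith(needle)]
    inside = [c for c in candidates if needle in c.lower() and c not in prefix]
    return prefix + inside


def keystrokes(text, candidates):
    """Yield (typed-so-far, suggestions) after each simulated keystroke."""
    for end in range(1, len(text) + 1):
        typed = text[:end]
        yield typed, narrow(typed, candidates)


if __name__ == "__main__":
    for typed, suggestions in keystrokes("pa", ARTISTS):
        print(repr(typed), "->", suggestions)
```

An interface built on this could redraw the suggestion list after every keystroke and let you either click an entry or keep typing, so the empty box is never actually empty.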
A lot of people's fear of the command line is the empty imposing box, with a cursor and nowhere to look for help. I think we just need to help users exploring the interface, and maybe rename some of those cryptic commands, too.
It's a shame there's so little work in this field, really.
(I posted this comment there, too, for the record.)
As frosty pointed out, typing itself doesn't "just work" for lots of people. Moreover, command interfaces aren't ignored by the "just works" crowd: voice command has been a holy grail for years. It's typing that no one wants to do.
The concepts outlined in the article remind me of Ubiquity (the Firefox addon), particularly its combination of the speed of a shell with the discoverability and ease-of-use of a GUI.
For the few months that I used Ubiquity extensively, I felt like that was exactly how computer interaction was supposed to work. Too bad they abandoned the app.
I think the reason the command line is considered bad compared to the GUI is the lack of NLP/AI and the use of widely varying syntaxes and semantics for each command line argument.
For example, how do you know that the syntax "play 10 random country songs" would work? If it did work, would it play random (country songs) or (random country) songs? -- is country a type of music or a category to sort by?
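One way to take some of the guesswork out is for the program to echo back the structure it actually parsed, so the user can see which reading it chose. Below is a rough Python sketch of a parser for a made-up "play" grammar. The grammar, the field names, and the rule that everything after the modifiers is the genre are all assumptions for illustration, not the article's design.

```python
import shlex


def interpret(words):
    """Turn the words of one step into an explicit description of what will play."""
    rest = list(words)
    count, shuffle = None, False
    if rest and rest[0].isdigit():
        count = int(rest.pop(0))
    if rest and rest[0] == "random":
        shuffle = True
        rest.pop(0)
    if not rest:
        raise ValueError("missing genre or title")
    if count is None and not shuffle:
        # No modifiers: treat the step as a song title.
        return {"kind": "title", "title": " ".join(rest)}
    # With modifiers: everything left over is taken as the genre name.
    return {"kind": "genre", "genre": " ".join(rest), "count": count, "shuffle": shuffle}


def parse_play(command):
    """Parse a hypothetical 'play' command into a list of explicit steps."""
    tokens = shlex.split(command)  # keeps "Teenage Dream" together as one token
    if not tokens or tokens[0] != "play":
        raise ValueError("expected the command to start with 'play'")
    steps, current = [], []
    for token in tokens[1:]:
        if token == "then":
            steps.append(current)
            current = []
        else:
            current.append(token)
    steps.append(current)
    return [interpret(step) for step in steps]


if __name__ == "__main__":
    for step in parse_play('play 10 random country then "Teenage Dream"'):
        print(step)
```

Note that this sketch reads "play 10 random country songs" as a genre literally called "country songs". That is one arbitrary resolution of the ambiguity, which is the point: the user only finds out which one was picked if the program shows its interpretation.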
The fact that everyone likes to classify things in different (and sometimes seemingly-illogical) ways leads to problems even if the syntax is well known - look at issues with SQL. Can you do "select id from company" on all databases?
Human communication is very difficult and error-prone, even for humans. Redundancy and back-and-forth are necessary to share understanding.
Perhaps we should simply acknowledge the fact that different people are... well, different.
For many years I worked with a guy who was probably about as good a programmer as I am, but with a very different mind.
My mind works on muscle memory: ask me what key combination I just hit to create a dummy function prototype, and I have to put my fingers on the keys and mentally read them off. He, on the other hand, would think of it as control-shift-F.
He would code to country music - I couldn't code with any music with words to it (whether or not I could understand them).
Some people are more comfortable with GUIs, some with VUIs.
In the XXIV century you speak to your computer, and if you're supposed to point'n'click some panels, you do it in predefined special areas - it's modal work - you're occupied with one panel at a time, not with dozens of overlapping windows. It's like how some old games were operated. It's fun! ...and vastly different from what we use today for working.